linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
       [not found]                 ` <CAHk-=wiCqDiU7SE3FLn2W26MS_voUAuqj5XFa1V_tiGTrrW-zQ@mail.gmail.com>
@ 2019-09-20 13:46                   ` Ahmed S. Darwish
  2019-09-20 14:33                     ` Andy Lutomirski
  2019-09-20 17:26                     ` Willy Tarreau
  2019-09-26 20:42                   ` [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags Ahmed S. Darwish
  1 sibling, 2 replies; 30+ messages in thread
From: Ahmed S. Darwish @ 2019-09-20 13:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, Willy Tarreau,
	Matthew Garrett, lkml, linux-ext4, linux-api, linux-man

Hi,

On Wed, Sep 18, 2019 at 04:57:58PM -0700, Linus Torvalds wrote:
> On Wed, Sep 18, 2019 at 2:17 PM Ahmed S. Darwish <darwish.07@gmail.com> wrote:
> >
> > Since Linux v3.17, getrandom(2) has been created as a new and more
> > secure interface for pseudorandom data requests.  It attempted to
> > solve three problems, as compared to /dev/urandom:
  > 
> I don't think your patch is really _wrong_, but I think it's silly to
> introduce a new system call, when we have 30 bits left in the flags of
> the old one, and the old system call checked them.
> 
> So it's much simpler and more straightforward to  just introduce a
> single new bit #2 that says "I actually know what I'm doing, and I'm
> explicitly asking for secure/insecure random data".
> 
> And then say that the existing bit #1 just means "I want to wait for entropy".
> 
> So then you end up with this:
> 
>     /*
>      * Flags for getrandom(2)
>      *
>      * GRND_NONBLOCK    Don't block and return EAGAIN instead
>      * GRND_WAIT_ENTROPY        Explicitly wait for entropy
>      * GRND_EXPLICIT    Make it clear you know what you are doing
>      */
>     #define GRND_NONBLOCK               0x0001
>     #define GRND_WAIT_ENTROPY   0x0002
>     #define GRND_EXPLICIT               0x0004
> 
>     #define GRND_SECURE (GRND_EXPLICIT | GRND_WAIT_ENTROPY)
>     #define GRND_INSECURE       (GRND_EXPLICIT | GRND_NONBLOCK)
> 
>     /* Nobody wants /dev/random behavior, nobody should use it */
>     #define GRND_RANDOM 0x0002
> 
> which is actually fairly easy to understand. So now we have three
> bits, and the values are:
> 
>  000  - ambiguous "secure or just lazy/ignorant"
>  001 - -EAGAIN or secure
>  010 - blocking /dev/random DO NOT USE
>  011 - nonblocking /dev/random DO NOT USE
>  100 - nonsense, returns -EINVAL
>  101 - /dev/urandom without warnings
>  110 - blocking secure
>  111 - -EAGAIN or secure
>

Hmmm, the point of the new syscall was **exactly** to avoid the 2^3
combinations above, and to provide developers only two, sane and easy,
options:

  - GRND2_INSECURE
  - GRND2_SECURE_UNBOUNDED_INITIAL_WAIT

You *must* pick one of these, and that's it. (!)

Then the proposed getrandom_wait(7) manpage, also mentioned in the V4
patch WARN message, would provide a big rationale, and encourage
everyone to use the new getrandom2(2) syscall instead.

But yeah, maybe we should add the extra flags to the old getrandom()
instead, and let glibc implement a getrandom_safe(3) wrapper only
with the sane options available.

Problem is, glibc is still *really* slow in adopting linux syscall
wrappers, so I'm not optimistic about that...

I still see the new system call as the sanest path, even provided
the cost of a new syscall number..

@Linus, @Ted:  Final thoughts?

> and people would be encouraged to use one of these three:
> 
>  - GRND_INSECURE
>  - GRND_SECURE
>  - GRND_SECURE | GRND_NONBLOCK
> 
> all of which actually make sense, and none of which have any
> ambiguity. And while "GRND_INSECURE | GRND_NONBLOCK" works, it's
> exactly the same as just plain GRND_INSECURE - the point is that it
> doesn't block for entropy anyway, so non-blocking makes no different.
>

[...]

> 
> There is *one* other small semantic change: The old code did
> urandom_read() which added warnings, but each warning also _reset_ the
> crng_init_cnt. Until it decided not to warn any more, at which point
> it also stops that resetting of crng_init_cnt.
> 
> And that reset of crng_init_cnt, btw, is some cray cray.
> 
> It's basically a "we used up entropy" thing, which is very
> questionable to begin with as the whole discussion has shown, but
> since it stops doing it after 10 cases, it's not even good security
> assuming the "use up entropy" case makes sense in the first place.
> 
> So I didn't copy that insanity either. And I'm wondering if removing
> it from /dev/urandom might also end up helping Ahmed's case of getting
> entropy earlier, when we don't reset the counter.
>

Yeah, noticed that, but I've learned not to change crypto or
speculative-execution code even if the changes "just look the same" at
first glance ;-)

(out of curiosity, I'll do a quick test with this CRNG entropy reset
part removed. Maybe it was indeed part of the problem..)

> But other than those two details, none of the existing semantics
> changed, we just added the three actually _sane_ cases without any
> ambiguity.
> 
> In particular, this still leaves the semantics of that nasty
> "getrandom(0)" as the same "blocking urandom" that it currently is.
> But now it's a separate case, and we can make that perhaps do the
> timeout, or at least the warning.
>

Yeah, I would propose to keep the V4-submitted "timeout then WARN"
logic. This alone will give user-space / distributions time to adapt.

For example, it was interesting that even the 0day bot had limited
entropy on boot (virtio-rng / TRUST_CPU not enabled):

    https://lkml.kernel.org/r/20190920005120.GP15734@shao2-debian

If user-space didn't get its act together, then the other extreme
measures can be implemented later (the getrandom() length test, using
jitter as a credited kernel entropy source, etc., etc.)

> And the new cases are defined to *not* warn. In particular,
> GRND_INSECURE very much does *not* warn about early urandom access
> when crng isn't ready. Because the whole point of that new mode is
> that the user knows it isn't secure.
>
> So that should make getrandom(GRND_INSECURE) palatable to the systemd
> kind of use that wanted to avoid the pointless kernel warning.
>

Yup, that's what was in the submitted V4 patch too. The caller
explicitly asked for "insecure", so they know what they're doing.

getrandom2(2) never prints any kernel message.

> And we could mark this for stable and try to get it backported so that
> it will have better coverage, and encourage people to use the new sane
> _explicit_ waiting (or not) for entropy.
>

ACK. I'll wait for an answer to the "Final thoughts?" question above,
send a V5 with CC:stable, then disappear from this thread ;-)

Thanks a lot everyone!

--
Ahmed Darwish

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 13:46                   ` [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2() Ahmed S. Darwish
@ 2019-09-20 14:33                     ` Andy Lutomirski
  2019-09-20 16:29                       ` Linus Torvalds
  2019-09-20 17:26                     ` Willy Tarreau
  1 sibling, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 14:33 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: Linus Torvalds, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Willy Tarreau, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 6:46 AM Ahmed S. Darwish <darwish.07@gmail.com> wrote:
>
> Hi,
>
> On Wed, Sep 18, 2019 at 04:57:58PM -0700, Linus Torvalds wrote:
> > On Wed, Sep 18, 2019 at 2:17 PM Ahmed S. Darwish <darwish.07@gmail.com> wrote:
> > >
> > > Since Linux v3.17, getrandom(2) has been created as a new and more
> > > secure interface for pseudorandom data requests.  It attempted to
> > > solve three problems, as compared to /dev/urandom:
>   >
> > I don't think your patch is really _wrong_, but I think it's silly to
> > introduce a new system call, when we have 30 bits left in the flags of
> > the old one, and the old system call checked them.
> >
> > So it's much simpler and more straightforward to  just introduce a
> > single new bit #2 that says "I actually know what I'm doing, and I'm
> > explicitly asking for secure/insecure random data".
> >
> > And then say that the existing bit #1 just means "I want to wait for entropy".
> >
> > So then you end up with this:
> >
> >     /*
> >      * Flags for getrandom(2)
> >      *
> >      * GRND_NONBLOCK    Don't block and return EAGAIN instead
> >      * GRND_WAIT_ENTROPY        Explicitly wait for entropy
> >      * GRND_EXPLICIT    Make it clear you know what you are doing
> >      */
> >     #define GRND_NONBLOCK               0x0001
> >     #define GRND_WAIT_ENTROPY   0x0002
> >     #define GRND_EXPLICIT               0x0004

What is this GRND_EXPLICIT thing?

A few weeks ago, I sent a whole series to address this, and I
obviously didn't cc enough people.  I'll resend a rebased version
today.  Meanwhile, some comments on this whole mess:

As I think everyone mostly agrees in this whole thread, getrandom()
can't just magically start returning non-random results.  That would
be a big problem.

Linus, I disagree that blocking while waiting for randomness is an
error.  Sometimes you want to generate a key, you want to finish as
quickly as possible, and you don't want to be in the business of
fiddling with the setup of the kernel RNG.  I would argue that *most*
crypto applications are in this category.  I think that the kernel
should, instead, handle this mess itself.  As a first pass, it could
be as simple as noticing that someone is blocking on randomness and
kicking off a thread that does some randomish reads to the rootfs.
This would roughly simulate the old behavior in which an ext4 rootfs
did more IO than necessary.  A fancier version would, as discussed in
this thread, do more clever things.

(As an aside, I am not a fan of xoring or adding stuff to the CRNG
state.  We should just use an actual crypto primitive for this.
Accumulate the state in a buffer and SHA-512 it.  Or use something
like the Keccak duplex sponge.  But this is a discussion for another
day.)

So I'm going to resend my series.  You can all fight over whether the
patch that actually goes in should be based on my series or based on
this patch.

--Andy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 14:33                     ` Andy Lutomirski
@ 2019-09-20 16:29                       ` Linus Torvalds
  2019-09-20 17:52                         ` Andy Lutomirski
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2019-09-20 16:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ahmed S. Darwish, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Willy Tarreau, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 7:34 AM Andy Lutomirski <luto@kernel.org> wrote:
>
> What is this GRND_EXPLICIT thing?

Your own email gives the explanation:

> Linus, I disagree that blocking while waiting for randomness is an
> error.  Sometimes you want to generate a key

That's *exactly* why GRND_EXPLICIT needs to be done regardless.

The keyword there is "Sometimes".

But people currently use "getrandom(0)" when they DO NOT want a key,
they just want some miscellaneous random numbers for some totally
non-security-related reason.

And that will continue. Exactly because the people who do not want a
key by definition aren't thinking about it very hard.

So the interface was very much mis-designed from the get-go. It was
designed purely for key people, even though generating keys is by no
means the most common reason for wanting a block of "random" numbers.

So GRND_EXPLICIT is there very much to make sure people who want true
secure keys will say so, and five years from now we will not have the
confusion between "Oh, I wasn't thinking about bootup". Because at a
minimum, in the near future getrandom(0) will warn about the
ambiguity. Or it will use some questionable jitter entropy that some
real key users will look at sideways and go "I don't want that".

This is an ABI design issue. The old ABI was fundamentally misdesigned
and actively encouraged the current situation of mixing secure and
insecure callers for that getrandom(0).

And it's entirely orthogonal to _any_ actual technical change we will
do (like removing the old GRND_RANDOM behavior entirely, which is
insane for other reasons and nobody ever wanted or likely used).

            Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 13:46                   ` [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2() Ahmed S. Darwish
  2019-09-20 14:33                     ` Andy Lutomirski
@ 2019-09-20 17:26                     ` Willy Tarreau
  2019-09-20 17:56                       ` Ahmed S. Darwish
  1 sibling, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2019-09-20 17:26 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: Linus Torvalds, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Matthew Garrett, lkml, linux-ext4, linux-api, linux-man

Hi Ahmed,

On Fri, Sep 20, 2019 at 03:46:09PM +0200, Ahmed S. Darwish wrote:
> Problem is, glibc is still *really* slow in adopting linux syscall
> wrappers, so I'm not optimistic about that...
>
> I still see the new system call as the sanest path, even provided
> the cost of a new syscall number..

New syscalls are always a pain to deal with in userland, because when
they are introduced, everyone wants them long before they're available
in glibc. So userland has to define NR_xxx for each supported arch and
to perform the call itself.

With flags adoption is instantaneous. Just #ifndef/#define, check if
the flag is supported and that's done. The only valid reason for a new
syscall is when the API changes (e.g. one extra arg, a la accept4()),
which doesn't seem to be the case here. Otherwise please by all means
avoid this in general.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 16:29                       ` Linus Torvalds
@ 2019-09-20 17:52                         ` Andy Lutomirski
  2019-09-20 18:09                           ` Linus Torvalds
                                             ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 17:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

On Fri, Sep 20, 2019 at 9:30 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Sep 20, 2019 at 7:34 AM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > What is this GRND_EXPLICIT thing?
>
> Your own email gives the explanation:
>
> > Linus, I disagree that blocking while waiting for randomness is an
> > error.  Sometimes you want to generate a key
>
> That's *exactly* why GRND_EXPLICIT needs to be done regardless.
>
> The keyword there is "Sometimes".
>
> But people currently use "getrandom(0)" when they DO NOT want a key,
> they just want some miscellaneous random numbers for some totally
> non-security-related reason.
>
> And that will continue. Exactly because the people who do not want a
> key by definition aren't thinking about it very hard.

I fully agree that this is a problem.  It's a problem we brought on
ourselves because we screwed up the ABI from the beginning.  The
question is what to do about it that doesn't cause its own set of
nasty problems.

> So GRND_EXPLICIT is there very much to make sure people who want true
> secure keys will say so, and five years from now we will not have the
> confusion between "Oh, I wasn't thinking about bootup". Because at a
> minimum, in the near future getrandom(0) will warn about the
> ambiguity. Or it will use some questionable jitter entropy that some
> real key users will look at sideways and go "I don't want that".

There are programs that call getrandom(0) *today* that expect secure
output.  openssl does a horrible dance in which it calls getentropy()
if available and falls back to syscall(__NR_getrandom, buf, buflen, 0)
otherwise.  We can't break this use case.  Changing the semantics of
getrandom(0) out from under them seems like the worst kind of ABI
break -- existing applications will *appear* to continue working but
will, in fact, become insecure.

IMO, from the beginning, we should have done this:

GRND_INSECURE: insecure.  always works.

GRND_SECURE_BLOCKING: does exactly what it says.

0: -EINVAL.

Using it correctly would be obvious.  Something like GRND_EXPLICIT
would be a head-scratcher: people would have to look at the man page
and actually think about it, and it's still easy to get wrong:

getrandom(..., GRND_EXPLICIT): just fscking give me a number.  it
seems to work and it shuts up the warning

And we're back to square one.


I think that, given existing software, we should make two or three
changes to fix the basic problems here:

1. Add GRND_INSECURE: at least let new applications do the right thing
going forward.

2. Fix what is arguably a straight up kernel bug, not even an ABI
issue: when a user program is blocking in getrandom(..., 0), the
kernel happily sits there doing absolutely nothing and deadlocks the
system as a result.  This IMO isn't an ABI issue -- it's an
implementation problem.  How about we make getrandom() (probably
actually wait_for_random_bytes()) do something useful to try to seed
the RNG if the system is otherwise not doing IO.

3. Optionally, entirely in user code: Get glibc to add new *library*
functions: getentropy_secure_blocking() and getentropy_insecure() or
whatever they want to call them.  Deprecate getentropy().

I think #2 is critical.  Right now, suppose someone has a system that
neets to do a secure network request (a la Red Hat's Clevis).  I have
no idea what Clevis actually does, but it wouldn't be particularly
crazy to do a DH exchange or sign with an EC key to ask some network
server to help unlock a dm-crypt volume.  If the system does this at
boot, it needs to use getrandom(..., 0), GRND_EXPLICIT, or whatever,
because it NEEDS a secure random number.  No about of ABI fiddling
will change this.  The kernel should *work* in this case rather than
deadlocking.

--Andy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 17:26                     ` Willy Tarreau
@ 2019-09-20 17:56                       ` Ahmed S. Darwish
  0 siblings, 0 replies; 30+ messages in thread
From: Ahmed S. Darwish @ 2019-09-20 17:56 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Linus Torvalds, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Matthew Garrett, lkml, linux-ext4, linux-api, linux-man

On Fri, Sep 20, 2019 at 07:26:09PM +0200, Willy Tarreau wrote:
> Hi Ahmed,
> 
> On Fri, Sep 20, 2019 at 03:46:09PM +0200, Ahmed S. Darwish wrote:
> > Problem is, glibc is still *really* slow in adopting linux syscall
> > wrappers, so I'm not optimistic about that...
> >
> > I still see the new system call as the sanest path, even provided
> > the cost of a new syscall number..
> 
> New syscalls are always a pain to deal with in userland, because when
> they are introduced, everyone wants them long before they're available
> in glibc. So userland has to define NR_xxx for each supported arch and
> to perform the call itself.
> 
> With flags adoption is instantaneous. Just #ifndef/#define, check if
> the flag is supported and that's done. The only valid reason for a new
> syscall is when the API changes (e.g. one extra arg, a la accept4()),
> which doesn't seem to be the case here. Otherwise please by all means
> avoid this in general.
> 

I see. Thanks a lot for the explanation above :)

--
Ahmed Darwish

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 17:52                         ` Andy Lutomirski
@ 2019-09-20 18:09                           ` Linus Torvalds
  2019-09-20 18:16                             ` Willy Tarreau
                                               ` (2 more replies)
  2019-09-20 18:12                           ` Willy Tarreau
  2019-09-20 18:15                           ` Alexander E. Patrakov
  2 siblings, 3 replies; 30+ messages in thread
From: Linus Torvalds @ 2019-09-20 18:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ahmed S. Darwish, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Willy Tarreau, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 10:52 AM Andy Lutomirski <luto@kernel.org> wrote:
>
> IMO, from the beginning, we should have done this:
>
> GRND_INSECURE: insecure.  always works.
>
> GRND_SECURE_BLOCKING: does exactly what it says.
>
> 0: -EINVAL.

Violently agreed. And that's kind of what the GRND_EXPLICIT is really
aiming for.

However, it's worth noting that nobody should ever use GRND_EXPLICIT
directly. That's just the name for the bit. The actual users would use
GRND_INSECURE or GRND_SECURE.

And yes, maybe it's worth making the name be GRND_SECURE_BLOCKING just
to make people see what the big deal is.

In the meantime, we need that new bit just to be able to create the
new semantics eventually. With a warning to nudge people in the right
direction.

We may never be able to return -EINVAL, but we can add the pr_notice()
to discourage people from using it.

And yes, we'll have to block - at least for a time - to get some
entropy. But at some point we either start making entropy up, or we
say "0 means jitter-entropy for ten seconds".

That will _work_, but it will also make the security-people nervous,
which is just one more hint that they should move to
GRND_SECURE[_BLOCKING].

> getrandom(..., GRND_EXPLICIT): just fscking give me a number.  it
> seems to work and it shuts up the warning
>
> And we're back to square one.

Actually, you didn't read the GRND_INSECURE patch, did you.

getrandom(GRND_EXPLICIT) on its own returns -EINVAL.

Because yes, I thought about it, and yes, I agree that it's the same
as the old 0.

So GRND_EXPLICIT is a bit that basically means "I am explicit about
what behavior I want". But part of that is that you need to _state_
the behavior too.

So:

 - GRND_INSECURE is (GRND_EXPLICIT | GRND_NONBLOCK)

   As in "I explicitly ask you not to just not ever block": urandom

 - GRND_SECURE_BLOCKING is (GRND_EXPLICIT | GRND_RANDOM)

   As in "I explicitly ask you for those secure random numbers"

 - GRND_SECURE_NONBLOCKING is (GRND_EXPLICIT | GRND_RANDOM | GRND_NONBLOCK)

   As in "I want explicitly secure random numbers, but return -EAGAIN
if that would block".

Which are the three sane behaviors (that last one is useful for the "I
can try to generate entropy if you don't have any" case. I'm not sure
anybody will do it, but it definitely conceptually makes sense).

And I agree that your naming is better.

I had it as just "GRND_SECURE" for the blocking version, and
"GRND_SECURE | GRND_NONBLOCK" for the "secure but return EAGAIN if you
would need to block for entropy" version.

But explicitly stating the blockingness in the name makes it clearer
to the people who just want GRND_INSECURE, and makes them realize that
they don't want the blocking version.

             Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 17:52                         ` Andy Lutomirski
  2019-09-20 18:09                           ` Linus Torvalds
@ 2019-09-20 18:12                           ` Willy Tarreau
  2019-09-20 19:22                             ` Andy Lutomirski
  2019-09-20 18:15                           ` Alexander E. Patrakov
  2 siblings, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2019-09-20 18:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

Hi Andy,

On Fri, Sep 20, 2019 at 10:52:30AM -0700, Andy Lutomirski wrote:
> 2. Fix what is arguably a straight up kernel bug, not even an ABI
> issue: when a user program is blocking in getrandom(..., 0), the
> kernel happily sits there doing absolutely nothing and deadlocks the
> system as a result.  This IMO isn't an ABI issue -- it's an
> implementation problem.  How about we make getrandom() (probably
> actually wait_for_random_bytes()) do something useful to try to seed
> the RNG if the system is otherwise not doing IO.

I thought about it as well with my old MSDOS reflexes, but here I
doubt we can do a lot. It seems fishy to me to start to fiddle with
various drivers from within a getrandom() syscall, we could sometimes
even end up waiting even longer because one device is already locked,
and when we have access there there's not much we can do without
risking to cause some harm. On desktop systems you have a bit more
choice than on headless systems (blink keyboard leds and time the
interrupts, run some disk accesses when there's still a disk, get a
copy of the last buffer of the audio input and/or output, turn on
the microphone and/or webcam, and collect some data). Many of them
cannot always be used. We could do some more portable stuff like scan
and hash the totality of the RAM. But that's all quite bad and
unreliable and at this point it's better to tell userland "here's
what I could get for you, if you want better, do it yourself" and the
userland can then ask the user "dear user, I really need valid entropy
this time to generate your GPG key, please type frantically on this
keyboard". And it will be more reliable this way in my opinion.

My analysis of the problem precisely lies in the fact that we've
always considered that the kernel had to provide randoms for any
use case and had to cover the most difficult cases and imposed
their constraints on simplest ones. Better let the application
decide.

Willy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 17:52                         ` Andy Lutomirski
  2019-09-20 18:09                           ` Linus Torvalds
  2019-09-20 18:12                           ` Willy Tarreau
@ 2019-09-20 18:15                           ` Alexander E. Patrakov
  2019-09-20 18:29                             ` Andy Lutomirski
  2 siblings, 1 reply; 30+ messages in thread
From: Alexander E. Patrakov @ 2019-09-20 18:15 UTC (permalink / raw)
  To: Andy Lutomirski, Linus Torvalds
  Cc: Ahmed S. Darwish, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Michael Kerrisk, Willy Tarreau,
	Matthew Garrett, lkml, Ext4 Developers List, Linux API,
	linux-man

[-- Attachment #1: Type: text/plain, Size: 2133 bytes --]

20.09.2019 22:52, Andy Lutomirski пишет:
> I think that, given existing software, we should make two or three
> changes to fix the basic problems here:
> 
> 1. Add GRND_INSECURE: at least let new applications do the right thing
> going forward.
> 
> 2. Fix what is arguably a straight up kernel bug, not even an ABI
> issue: when a user program is blocking in getrandom(..., 0), the
> kernel happily sits there doing absolutely nothing and deadlocks the
> system as a result.  This IMO isn't an ABI issue -- it's an
> implementation problem.  How about we make getrandom() (probably
> actually wait_for_random_bytes()) do something useful to try to seed
> the RNG if the system is otherwise not doing IO.
> 
> 3. Optionally, entirely in user code: Get glibc to add new *library*
> functions: getentropy_secure_blocking() and getentropy_insecure() or
> whatever they want to call them.  Deprecate getentropy().
> 
> I think #2 is critical.  Right now, suppose someone has a system that
> neets to do a secure network request (a la Red Hat's Clevis).  I have
> no idea what Clevis actually does, but it wouldn't be particularly
> crazy to do a DH exchange or sign with an EC key to ask some network
> server to help unlock a dm-crypt volume.  If the system does this at
> boot, it needs to use getrandom(..., 0), GRND_EXPLICIT, or whatever,
> because it NEEDS a secure random number.  No about of ABI fiddling
> will change this.  The kernel should *work* in this case rather than
> deadlocking.

Let me express a little bit of disagreement with the logic here.

I do agree that #2 is critical, and the Clevis use case is a perfect 
example why it is important. I doubt that it is solvable without 
trusting jitter entropy, or without provoking a dummy read on a random 
block device, just for timings, or maybe some other interaction with the 
external world - but Willy already said "it seems fishy". However, _if_ 
it is solved, then we don't need GRND_INSECURE, because solving #2 is 
equivalent to magically making secure random numbers always available.

-- 
Alexander E. Patrakov


[-- Attachment #2: Криптографическая подпись S/MIME --]
[-- Type: application/pkcs7-signature, Size: 4052 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 18:09                           ` Linus Torvalds
@ 2019-09-20 18:16                             ` Willy Tarreau
  2019-09-20 19:12                             ` Andy Lutomirski
  2019-09-21  6:07                             ` Florian Weimer
  2 siblings, 0 replies; 30+ messages in thread
From: Willy Tarreau @ 2019-09-20 18:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 11:09:53AM -0700, Linus Torvalds wrote:
(...)
> So:
> 
>  - GRND_INSECURE is (GRND_EXPLICIT | GRND_NONBLOCK)
> 
>    As in "I explicitly ask you not to just not ever block": urandom
> 
>  - GRND_SECURE_BLOCKING is (GRND_EXPLICIT | GRND_RANDOM)
> 
>    As in "I explicitly ask you for those secure random numbers"
> 
>  - GRND_SECURE_NONBLOCKING is (GRND_EXPLICIT | GRND_RANDOM | GRND_NONBLOCK)
> 
>    As in "I want explicitly secure random numbers, but return -EAGAIN
> if that would block".
> 
> Which are the three sane behaviors (that last one is useful for the "I
> can try to generate entropy if you don't have any" case. I'm not sure
> anybody will do it, but it definitely conceptually makes sense).
> 
> And I agree that your naming is better.
> 
> I had it as just "GRND_SECURE" for the blocking version, and
> "GRND_SECURE | GRND_NONBLOCK" for the "secure but return EAGAIN if you
> would need to block for entropy" version.
> 
> But explicitly stating the blockingness in the name makes it clearer
> to the people who just want GRND_INSECURE, and makes them realize that
> they don't want the blocking version.

I really like it this way. Explicit and full control for the application
plus reasonable backwards compatibility, it sounds pretty good.

Willy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 18:15                           ` Alexander E. Patrakov
@ 2019-09-20 18:29                             ` Andy Lutomirski
  0 siblings, 0 replies; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 18:29 UTC (permalink / raw)
  To: Alexander E. Patrakov
  Cc: Andy Lutomirski, Linus Torvalds, Ahmed S. Darwish,
	Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man



> On Sep 20, 2019, at 11:15 AM, Alexander E. Patrakov <patrakov@gmail.com> wrote:
> 
> 20.09.2019 22:52, Andy Lutomirski пишет:
>> I think that, given existing software, we should make two or three
>> changes to fix the basic problems here:
>> 1. Add GRND_INSECURE: at least let new applications do the right thing
>> going forward.
>> 2. Fix what is arguably a straight up kernel bug, not even an ABI
>> issue: when a user program is blocking in getrandom(..., 0), the
>> kernel happily sits there doing absolutely nothing and deadlocks the
>> system as a result.  This IMO isn't an ABI issue -- it's an
>> implementation problem.  How about we make getrandom() (probably
>> actually wait_for_random_bytes()) do something useful to try to seed
>> the RNG if the system is otherwise not doing IO.
>> 3. Optionally, entirely in user code: Get glibc to add new *library*
>> functions: getentropy_secure_blocking() and getentropy_insecure() or
>> whatever they want to call them.  Deprecate getentropy().
>> I think #2 is critical.  Right now, suppose someone has a system that
>> neets to do a secure network request (a la Red Hat's Clevis).  I have
>> no idea what Clevis actually does, but it wouldn't be particularly
>> crazy to do a DH exchange or sign with an EC key to ask some network
>> server to help unlock a dm-crypt volume.  If the system does this at
>> boot, it needs to use getrandom(..., 0), GRND_EXPLICIT, or whatever,
>> because it NEEDS a secure random number.  No about of ABI fiddling
>> will change this.  The kernel should *work* in this case rather than
>> deadlocking.
> 
> Let me express a little bit of disagreement with the logic here.
> 
> I do agree that #2 is critical, and the Clevis use case is a perfect example why it is important. I doubt that it is solvable without trusting jitter entropy, or without provoking a dummy read on a random block device, just for timings, or maybe some other interaction with the external world - but Willy already said "it seems fishy". However, _if_ it is solved, then we don't need GRND_INSECURE, because solving #2 is equivalent to magically making secure random numbers always available.
> 
> 

I beg to differ. There is a big difference between “do your best *right now*” and “give me a real secure result in a vaguely timely manner”.

For example, the former is useful for ASLR or hash table randomization. The latter is not.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 18:09                           ` Linus Torvalds
  2019-09-20 18:16                             ` Willy Tarreau
@ 2019-09-20 19:12                             ` Andy Lutomirski
  2019-09-20 19:51                               ` Linus Torvalds
  2019-09-21  6:07                             ` Florian Weimer
  2 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 19:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

> On Sep 20, 2019, at 11:10 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Fri, Sep 20, 2019 at 10:52 AM Andy Lutomirski <luto@kernel.org> wrote:
>>
>> IMO, from the beginning, we should have done this:
>>
>> GRND_INSECURE: insecure.  always works.
>>
>> GRND_SECURE_BLOCKING: does exactly what it says.
>>
>> 0: -EINVAL.
>
> Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> aiming for.
>
> However, it's worth noting that nobody should ever use GRND_EXPLICIT
> directly. That's just the name for the bit. The actual users would use
> GRND_INSECURE or GRND_SECURE.
>
> And yes, maybe it's worth making the name be GRND_SECURE_BLOCKING just
> to make people see what the big deal is.
>
> In the meantime, we need that new bit just to be able to create the
> new semantics eventually. With a warning to nudge people in the right
> direction.
>
> We may never be able to return -EINVAL, but we can add the pr_notice()
> to discourage people from using it.
>

The problem is that new programs will have to try the new flag value
and, if it returns -EINVAL, fall back to 0.  This isn't so great.

> And yes, we'll have to block - at least for a time - to get some
> entropy. But at some point we either start making entropy up, or we
> say "0 means jitter-entropy for ten seconds".
>
> That will _work_, but it will also make the security-people nervous,
> which is just one more hint that they should move to
> GRND_SECURE[_BLOCKING].

Wait, are you suggesting that 0 means invoke jitter-entropy or
whatever and GRND_SECURE_BLOCKING means not wait forever and deadlock?
 That's no good -- people will want to continue using 0 because the
behavior is better. My point here is that asking for secure random
numbers isn’t some legacy oddity — it’s genuinely necessary. The
kernel should do whatever it needs to in order to make it work.  We
really don’t want a situation where 0 means get me secure random
numbers reliably but spam the logs and GRND_SECURE_BLOCKING means
don’t spam the logs but risk deadlocking. This will encourage people
to pass 0 to get the improved behavior.

> So GRND_EXPLICIT is a bit that basically means "I am explicit about
> what behavior I want". But part of that is that you need to _state_
> the behavior too.
>
> So:
>
> - GRND_INSECURE is (GRND_EXPLICIT | GRND_NONBLOCK)
>
>   As in "I explicitly ask you not to just not ever block": urandom

IMO this is confusing.  The GRND_RANDOM flag was IMO a mistake and
should just be retired.  Let's enumerate useful cases and then give
them sane values.

>
> - GRND_SECURE_BLOCKING is (GRND_EXPLICIT | GRND_RANDOM)
>
>   As in "I explicitly ask you for those secure random numbers"
>
> - GRND_SECURE_NONBLOCKING is (GRND_EXPLICIT | GRND_RANDOM | GRND_NONBLOCK)
>
>   As in "I want explicitly secure random numbers, but return -EAGAIN
> if that would block".
>
> Which are the three sane behaviors (that last one is useful for the "I
> can try to generate entropy if you don't have any" case. I'm not sure
> anybody will do it, but it definitely conceptually makes sense).
>
> And I agree that your naming is better.

I think this is the complete list of "good" behaviors for new programs:

"insecure": always works, never warns.

"secure, blocking": always returns *eventually* with secure output,
i.e., does something to avoid deadlocks

"secure, nonblocking" returns secure output immediately or returns -EAGAIN.

And the only real question is how to map existing users to these
semantics.  I see two sensible choices:

1. 0 means "secure, blocking". I think this is not what we'd do if we
could go back in time and chage the ABI from day 1, but I think it's
actually good enough.  As long as this mode won't deadlock, it's not
*that* bad if programs are using it when they wanted "insecure".

2. 0 means "secure, blocking, but warn".  Some new value means
"secure, blocking, don't warn".  The problem is that new applications
will have to fall back to 0 to continue supporting old kernels.

I briefly thought that maybe GRND_RANDOM would be a reasonable choice
for "secure, blocking, don't warn", but the effect on new programs on
old kernels will be unfortunate.

I'm willing to go along with #2 if you like it better than #1, and
I'll update my patches accordingly, but I prefer #1.

I do think we should make all the ABI changes that we want to make all
in one release.  Let's not make programs think about their behavior on
more versions than necessary.  So I'd like to get rid of the current
/dev/random semantics, add "insecure" mode, and do whatever deadlock
avoidance scheme we settle on in a single release.

--Andy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 18:12                           ` Willy Tarreau
@ 2019-09-20 19:22                             ` Andy Lutomirski
  2019-09-20 19:37                               ` Willy Tarreau
  2019-09-20 20:02                               ` Linus Torvalds
  0 siblings, 2 replies; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 19:22 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Andy Lutomirski, Linus Torvalds, Ahmed S. Darwish,
	Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

On Fri, Sep 20, 2019 at 11:12 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Andy,
>
> On Fri, Sep 20, 2019 at 10:52:30AM -0700, Andy Lutomirski wrote:
> > 2. Fix what is arguably a straight up kernel bug, not even an ABI
> > issue: when a user program is blocking in getrandom(..., 0), the
> > kernel happily sits there doing absolutely nothing and deadlocks the
> > system as a result.  This IMO isn't an ABI issue -- it's an
> > implementation problem.  How about we make getrandom() (probably
> > actually wait_for_random_bytes()) do something useful to try to seed
> > the RNG if the system is otherwise not doing IO.
>
> I thought about it as well with my old MSDOS reflexes, but here I
> doubt we can do a lot. It seems fishy to me to start to fiddle with
> various drivers from within a getrandom() syscall, we could sometimes
> even end up waiting even longer because one device is already locked,
> and when we have access there there's not much we can do without
> risking to cause some harm. On desktop systems you have a bit more
> choice than on headless systems (blink keyboard leds and time the
> interrupts, run some disk accesses when there's still a disk, get a
> copy of the last buffer of the audio input and/or output, turn on
> the microphone and/or webcam, and collect some data). Many of them
> cannot always be used. We could do some more portable stuff like scan
> and hash the totality of the RAM. But that's all quite bad and
> unreliable and at this point it's better to tell userland "here's
> what I could get for you, if you want better, do it yourself" and the
> userland can then ask the user "dear user, I really need valid entropy
> this time to generate your GPG key, please type frantically on this
> keyboard". And it will be more reliable this way in my opinion.

Perhaps userland could register a helper that takes over and does
something better?  But I think the kernel really should do something
vaguely reasonable all by itself.  If nothing else, we want the ext4
patch that provoked this whole discussion to be applied, which means
that we need to unbreak userspace somehow, and returning garbage it to
is not a good choice.

Here are some possible approaches that come to mind:

int count;
while (crng isn't inited) {
  msleep(1);
}

and modify add_timer_randomness() to at least credit a tiny bit to
crng_init_cnt.

Or we do something like intentionally triggering readahead on some
offset on the root block device.  We should definitely not trigger
*blocking* IO.

Also, I wonder if the real problem preventing the RNG from staring up
is that the crng_init_cnt threshold is too high.  We have a rather
baroque accounting system, and it seems like we can accumulate and
credit entropy for a very long time indeed without actually
considering ourselves done.

--Andy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:22                             ` Andy Lutomirski
@ 2019-09-20 19:37                               ` Willy Tarreau
  2019-09-20 19:52                                 ` Andy Lutomirski
  2019-09-20 20:02                               ` Linus Torvalds
  1 sibling, 1 reply; 30+ messages in thread
From: Willy Tarreau @ 2019-09-20 19:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 12:22:17PM -0700, Andy Lutomirski wrote:
> Perhaps userland could register a helper that takes over and does
> something better?

If userland sees the failure it can do whatever the developer/distro
packager thought suitable for the system facing this condition.

> But I think the kernel really should do something
> vaguely reasonable all by itself.

Definitely, that's what Linus' proposal was doing. Sleeping for some time
is what I call "vaguely reasonable".

> If nothing else, we want the ext4
> patch that provoked this whole discussion to be applied,

Oh absolutely!

> which means
> that we need to unbreak userspace somehow, and returning garbage it to
> is not a good choice.

It depends how it's used. I'd claim that we certainly use randoms for
other things (such as ASLR/hashtables) *before* using them to generate
long lived keys thus we can have a bit more time to get some more
entropy before reaching the point of producing these keys.

> Here are some possible approaches that come to mind:
> 
> int count;
> while (crng isn't inited) {
>   msleep(1);
> }
> 
> and modify add_timer_randomness() to at least credit a tiny bit to
> crng_init_cnt.

Without a timeout it's sure we'll still face some situations where
it blocks forever, which is the current problem.

> Or we do something like intentionally triggering readahead on some
> offset on the root block device.

You don't necessarily have such a device, especially when you're
in an initramfs. It's precisely where userland can be smarter. When
the caller is sfdisk for example, it does have more chances to try
to perform I/O than when it's a tiny http server starting to present
a configuration page.

> We should definitely not trigger *blocking* IO.

I think I agree.

> Also, I wonder if the real problem preventing the RNG from staring up
> is that the crng_init_cnt threshold is too high.  We have a rather
> baroque accounting system, and it seems like we can accumulate and
> credit entropy for a very long time indeed without actually
> considering ourselves done.

I have no opinion on this, lacking the skills to evaluate the situation.
What I can say for sure is that I've faced the non-booting issue quite a
number of times on headless systems, and conversely in the 2.4 era, my
front reverse-proxy by then had the same SSH key as 89 other machines on
the net. So there's surely a sweet spot to find between those two extremes.
I tend to think that waiting *a little bit* for the *first* random is
acceptable, even 10-15s, by the time the user starts to think about
pressing the reset button the system might finish to boot. Hashing some
RAM locations and the RTC when present can also help a little bit. If
at least my machine by then had combined the RTC's date and time with
the hash, chances for a key collision would have gone down to one over
many thousands.

Willy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:12                             ` Andy Lutomirski
@ 2019-09-20 19:51                               ` Linus Torvalds
  2019-09-20 20:11                                 ` Alexander E. Patrakov
                                                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Linus Torvalds @ 2019-09-20 19:51 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ahmed S. Darwish, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Willy Tarreau, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 12:12 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> The problem is that new programs will have to try the new flag value
> and, if it returns -EINVAL, fall back to 0.  This isn't so great.

Don't be silly.

Of course they will do that, but so what? With a new kernel, they'll
get the behavior they expect. And with an old kernel, they'll get the
behavior they expect.

They'd never fall back to to "0 means something I didn't want",
exactly because we'd make this new flag be the first change.

> Wait, are you suggesting that 0 means invoke jitter-entropy or
> whatever and GRND_SECURE_BLOCKING means not wait forever and deadlock?
>  That's no good -- people will want to continue using 0 because the
> behavior is better.

I assume that "not wait forever" was meant to be "wait forever".

So the one thing we have to do is break the "0 waits forever".  I
guarantee that will happen. I will override Ted if he just NAk's it,
because we simply _cannot_ continue with it.

So we absolutely _will_ come up with some way 0 ends the wait. Whether
it's _just_ a timeout, or whether it's jitter-entropy or whatever, it
will happen.

But we'll also make getrandom(0) do the annoying warning, because it's
just ambiguous. And I suspect you'll find that a lot of security
people don't really like jitter-entropy, at least not in whatever
cut-down format we'll likely have to use in the kernel.

And we'll also have to make getrandom(0) be really _timely_. Security
people would likely rather wait for minutes before they are happy with
it. But because it's a boot constraint as things are now, it will not
just be jitter-entropy, it will be _accelerated_ jitter-entropy in 15
seconds or whatever, and since it can't use up all of CPU time, it's
realistically more like "15 second timeout, but less of actual CPU
time for jitter".

We can try to be clever with a background thread and a lot of
yielding(), so that if the CPU is actually idle we'll get most of that
15 seconds for whatever jitter, but end result is that it's still
accelerated.

Do I believe we can do a good job in that kind of timeframe?
Absolutely. The whole point should be that it's still "good enough",
and as has been pointed out, that same jitter entropy that people are
worried about is just done in user space right now instead.

But do I believe that security people would prefer a non-accelerated
GRND_SECURE_BLOCKING? Yes I do. That doesn't mean that
GRND_SECURE_BLOCKING shouldn't use jitter entropy too, but it doesn't
need the same kind of "let's hurry this up because it might be during
early boot and block things".

That said, if we can all convince everybody (hah!) that jitter entropy
in the kernel would be sufficient, then we can make the whole point
entirely moot, and just say "we'll just change crng_wait() to do
jitter entropy instead and be done with it. Then any getrandom() user
will just basically wait for a (very limited) time and the system will
be happy.

If that is the case we wouldn't need new flags at all. But I don't
think you can make everybody agree to that, which is why I suspect
we'll need the new flag, and I'll just take the heat for saying "0 is
now off limits, because it does this thing that a lot of people
dislike".

> IMO this is confusing.  The GRND_RANDOM flag was IMO a mistake and
> should just be retired.  Let's enumerate useful cases and then give
> them sane values.


That's basically what I'm doing. I enumerate the new values.

But the enumerations have hidden meaning, because the actual bits do
matter. The GRND_EXPLICIT bit isn't supposed to be used by any user,
but it has the value it has because it makes old kernels return
-EINVAL.

But if people hate the bit names, we can just do an enum and be done with it:

   enum grnd_flags {
      GRND_NONBLOCK = 1,
      GRND_RANDOM, // Don't use!
      GRND_RANDOM_NONBLOCK, // Don't use
      GRND_UNUSED,
      GRND_INSECURE,
      GRND_SECURE_BLOCKING,
      GRND_SECURE_NONBLOCKING,
  };

but the values now have a _hidden_ pattern (because we currently have
that "| GRND_NONBLOCK" pattern that I want to make sure still
continues to work, rather than give unexpected behavior in case
somebody continues to use it).

So the _only_ difference between the above and what I suggested is
that I made the bit pattern explicit rather than hidden in the value.

> And the only real question is how to map existing users to these
> semantics.  I see two sensible choices:
>
> 1. 0 means "secure, blocking". I think this is not what we'd do if we
> could go back in time and chage the ABI from day 1, but I think it's
> actually good enough.  As long as this mode won't deadlock, it's not
> *that* bad if programs are using it when they wanted "insecure".

It's exactly that "as long as it won't deadlock" that is our current problem.

It *does* deadlock.

So it can't mean "blocking" in any long-term meaning.

It can mean "blocks for up to 15 seconds" or something like that. I'd
honestly prefer a smaller number, but I think 15 seconds is an
acceptable "your user space is buggy, but we won't make you think the
machine hung".

> 2. 0 means "secure, blocking, but warn".  Some new value means
> "secure, blocking, don't warn".  The problem is that new applications
> will have to fall back to 0 to continue supporting old kernels.

The same comment about blocking.

Maybe you came in in the middle, and didn't see the whole "reduced IO
patterns means that boot blocks forever" part of the original problem.

THAT is why 0 will absolutely change behaviour.

                Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:37                               ` Willy Tarreau
@ 2019-09-20 19:52                                 ` Andy Lutomirski
  0 siblings, 0 replies; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 19:52 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Andy Lutomirski, Linus Torvalds, Ahmed S. Darwish,
	Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man



> On Sep 20, 2019, at 12:37 PM, Willy Tarreau <w@1wt.eu> wrote:
> 
> On Fri, Sep 20, 2019 at 12:22:17PM -0700, Andy Lutomirski wrote:
>> Perhaps userland could register a helper that takes over and does
>> something better?
> 
> If userland sees the failure it can do whatever the developer/distro
> packager thought suitable for the system facing this condition.
> 
>> But I think the kernel really should do something
>> vaguely reasonable all by itself.
> 
> Definitely, that's what Linus' proposal was doing. Sleeping for some time
> is what I call "vaguely reasonable".

I don’t buy it. We have existing programs that can deadlock on boot. Just throwing -EAGAIN at them in a syscall that didn’t previously block does not strike me as reasonable.

> 
>> If nothing else, we want the ext4
>> patch that provoked this whole discussion to be applied,
> 
> Oh absolutely!
> 
>> which means
>> that we need to unbreak userspace somehow, and returning garbage it to
>> is not a good choice.
> 
> It depends how it's used. I'd claim that we certainly use randoms for
> other things (such as ASLR/hashtables) *before* using them to generate
> long lived keys thus we can have a bit more time to get some more
> entropy before reaching the point of producing these keys.

The problem is that we don’t know what userspace is doing with the output from getrandom(..., 0), so I think we have to be conservative. New kernels need to work with old user code. It’s okay if they’re slower to boot than they could be.

> 
>> Here are some possible approaches that come to mind:
>> 
>> int count;
>> while (crng isn't inited) {
>>  msleep(1);
>> }
>> 
>> and modify add_timer_randomness() to at least credit a tiny bit to
>> crng_init_cnt.
> 
> Without a timeout it's sure we'll still face some situations where
> it blocks forever, which is the current problem.

The point is that we keep the timer running by looping like this, which should cause add_timer_randomness() to get called continuously, which should prevent the deadlock.  I assume the deadlock is because we go into nohz-idle and we sit there with nothing happening at all.

> 
>> Or we do something like intentionally triggering readahead on some
>> offset on the root block device.
> 
> You don't necessarily have such a device, especially when you're
> in an initramfs. It's precisely where userland can be smarter. When
> the caller is sfdisk for example, it does have more chances to try
> to perform I/O than when it's a tiny http server starting to present
> a configuration page.

What I mean is: allow user code to register a usermode helper that helps get entropy. Or just convince distros to bundle some useful daemon that starts at early boot and lives in the initramfs.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:22                             ` Andy Lutomirski
  2019-09-20 19:37                               ` Willy Tarreau
@ 2019-09-20 20:02                               ` Linus Torvalds
  1 sibling, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2019-09-20 20:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Willy Tarreau, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 12:22 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> Here are some possible approaches that come to mind:
>
> int count;
> while (crng isn't inited) {
>   msleep(1);
> }
>
> and modify add_timer_randomness() to at least credit a tiny bit to
> crng_init_cnt.

I'd love that, but we don't actually call add_timer_randomness() for timers.

Yeah, the name is misleading.

What the "timer" in add_timer_randomness() means is that we look at
the timing between calls. And we may actually have (long ago) called
it for timer interrupts. But we don't any more.

The only actual users of add_timer_randomness() is
add_input_randomness() and add_disk_randomness(). And it turns out
that even disk IO doesn't really call add_disk_randomness(), so the
only _real_ user is that keyboard input thing.

Which means that unless you sit at the machine and type things in,
add_timer_randomness() _never_ gets called.

No, the real source of entropy right now is
add_interrupt_randomness(), which is called for all device interrupts.

But note the "device interrupts" part. Not the timer interrupt. That's
special, and has its own low-level architecture rules. So only the
normal IO interrupts (like disk/network/etc).

So timers right now do not add _anything_ to the randomness pool. Not
noise, not entropy.

But yes, what you can do is a jitter entropy thing, which basically
does what you suggest, except instead of "msleep(1)" it does something
like

   while (crng isn't inited) {
       sched_yield();
       do_a_round_of_memory_accesses_etc();
       add_cycle_counter_entropy();
   }

and with a lot of handwaving you'll convince a certain amount of
people that yes, the timing of the above is unpredictable enough that
the entropy you add is real.

             Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:51                               ` Linus Torvalds
@ 2019-09-20 20:11                                 ` Alexander E. Patrakov
  2019-09-20 20:17                                 ` Matthew Garrett
  2019-09-20 20:51                                 ` Andy Lutomirski
  2 siblings, 0 replies; 30+ messages in thread
From: Alexander E. Patrakov @ 2019-09-20 20:11 UTC (permalink / raw)
  To: Linus Torvalds, Andy Lutomirski
  Cc: Ahmed S. Darwish, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Michael Kerrisk, Willy Tarreau,
	Matthew Garrett, lkml, Ext4 Developers List, Linux API,
	linux-man

[-- Attachment #1: Type: text/plain, Size: 1459 bytes --]

21.09.2019 00:51, Linus Torvalds пишет:

> And we'll also have to make getrandom(0) be really _timely_. Security
> people would likely rather wait for minutes before they are happy with
> it. But because it's a boot constraint as things are now, it will not
> just be jitter-entropy, it will be _accelerated_ jitter-entropy in 15
> seconds or whatever, and since it can't use up all of CPU time, it's
> realistically more like "15 second timeout, but less of actual CPU
> time for jitter".

I don't think that "accelerated jitter" makes sense. The jitterentropy 
hwrng that I sent earlier fills the entropy buffer in less than 2 
seconds, even with quality=4, so there is no need to accelerate it even 
more.

> That said, if we can all convince everybody (hah!) that jitter entropy
> in the kernel would be sufficient, then we can make the whole point
> entirely moot, and just say "we'll just change crng_wait() to do
> jitter entropy instead and be done with it. Then any getrandom() user
> will just basically wait for a (very limited) time and the system will
> be happy.
> 
> If that is the case we wouldn't need new flags at all. But I don't
> think you can make everybody agree to that, which is why I suspect
> we'll need the new flag, and I'll just take the heat for saying "0 is
> now off limits, because it does this thing that a lot of people
> dislike".

I 100% agree with that.

-- 
Alexander E. Patrakov


[-- Attachment #2: Криптографическая подпись S/MIME --]
[-- Type: application/pkcs7-signature, Size: 4052 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:51                               ` Linus Torvalds
  2019-09-20 20:11                                 ` Alexander E. Patrakov
@ 2019-09-20 20:17                                 ` Matthew Garrett
  2019-09-20 20:51                                 ` Andy Lutomirski
  2 siblings, 0 replies; 30+ messages in thread
From: Matthew Garrett @ 2019-09-20 20:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 12:51:12PM -0700, Linus Torvalds wrote:

> So we absolutely _will_ come up with some way 0 ends the wait. Whether
> it's _just_ a timeout, or whether it's jitter-entropy or whatever, it
> will happen.

FWIW, Zircon uses the jitter entropy generator to seed the CRNG and 
documented their findings in 
https://fuchsia.dev/fuchsia-src/zircon/jitterentropy/config-basic .
-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 19:51                               ` Linus Torvalds
  2019-09-20 20:11                                 ` Alexander E. Patrakov
  2019-09-20 20:17                                 ` Matthew Garrett
@ 2019-09-20 20:51                                 ` Andy Lutomirski
  2019-09-20 22:44                                   ` Linus Torvalds
  2 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 20:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

On Fri, Sep 20, 2019 at 12:51 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> > And the only real question is how to map existing users to these
> > semantics.  I see two sensible choices:
> >
> > 1. 0 means "secure, blocking". I think this is not what we'd do if we
> > could go back in time and chage the ABI from day 1, but I think it's
> > actually good enough.  As long as this mode won't deadlock, it's not
> > *that* bad if programs are using it when they wanted "insecure".
>
> It's exactly that "as long as it won't deadlock" that is our current problem.
>
> It *does* deadlock.
>
> So it can't mean "blocking" in any long-term meaning.
>
> It can mean "blocks for up to 15 seconds" or something like that. I'd
> honestly prefer a smaller number, but I think 15 seconds is an
> acceptable "your user space is buggy, but we won't make you think the
> machine hung".

To be clear, when I say "blocking", I mean "blocks until we're ready,
but we make sure we're ready in a moderately timely manner".

Rather than answering everything point by point, here's a updated
mini-proposal and some thoughts.  There are two families of security
people that I think we care about.  One is the FIPS or CC or PCI
crowd, and they might, quite reasonably, demand actual hardware RNGs.
We should make the hwrng API stop sucking and they should be happy.
(This means expose an hwrng device node per physical device, IMO.)
The other is the one who wants getrandom(), etc to be convincingly
secure and is willing to do some actual analysis.  And I think we can
make them quite happy like this:

In the kernel, we have two types of requests for random numbers: a
request for "secure" bytes and a request for "insecure" bytes.
Requests for "secure" bytes can block or return -EAGAIN.  Requests for
"insecure" bytes succeed without waiting.  In addition, we have a
jitter entropy mechanism (maybe the one mjg59 referenced, maybe
Alexander's -- doesn't really matter) and we *guarantee* that jitter
entropy, by itself, is enough to get the "secure" generator working
after, say, 5s of effort.  By this, I mean that, on an idle system, it
finishes in 5s and, on a fully loaded system, it's allowed to take a
little while longer but not too much longer.

In other words, I want GRND_SECURE_BLOCKING and /dev/random reads to
genuinely always work and to genuinely never take much longer than 5s.
I don't want a special case where they fail.

The exposed user APIs are, subject to bikeshedding that can happen
later over the actual values, etc:

GRND_SECURE_BLOCKING: returns "secure" output and blocks until it's
ready.  This never fails, but it also never blocks forever.

GRND_SECURE_NONBLOCKING: same but returns -EAGAIN instead of blocking.

GRND_INSECURE: returns "insecure" output immediately.  I think we do
need this -- the "secure" mode may take a little while at early boot,
and libraries that initialize themselves with some randomness really
do want a way to get some numbers without any delay whatsoever.

0: either the same as GRND_SECURE_BLOCKING plus a warning or the
"accelerated" version.  The "accelerated" version means wait up to 2s
for secure numbers and, if there still aren't any, fall back to
"insecure".

GRND_RANDOM: either the same as 0 or the same as GRND_SECURE_BLOCKING
but with a warning.  I don't particularly care either way.

I'm okay with a well-defined semantic like I proposed for an
accelerated mode.  I don't really want to try to define what a
secure-but-not-as-secure mode means as a separate complication that
the underlying RNG needs to support forever.  I don't think the
security folks would like that either.

How does this sound?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 20:51                                 ` Andy Lutomirski
@ 2019-09-20 22:44                                   ` Linus Torvalds
  2019-09-20 23:30                                     ` Andy Lutomirski
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2019-09-20 22:44 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ahmed S. Darwish, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Willy Tarreau, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 1:51 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> To be clear, when I say "blocking", I mean "blocks until we're ready,
> but we make sure we're ready in a moderately timely manner".

.. an I want a pony.

The problem is that you start from an assumption that we simply can't
seem to do.

> In other words, I want GRND_SECURE_BLOCKING and /dev/random reads to
> genuinely always work and to genuinely never take much longer than 5s.
> I don't want a special case where they fail.

Honestly, if that's the case and we _had_ such a methoc of
initializing the rng, then I suspect we could just ignore the flags
entirely, with the possible exception of GRND_NONBLOCK. And even that
is "possible exception", because once your worst-case is a one-time
delay of 5s at boot time thing, you might as well consider it
nonblocking in general.

Yes, there are some in-kernel users that really can't afford to do
even that 5s delay (not just may they be atomic, but more likely it's
just that we don't want to delay _everything_ by 5s), but they don't
use the getrandom() system call anyway.

> The exposed user APIs are, subject to bikeshedding that can happen
> later over the actual values, etc:

So the thing is, you start from the impossible assumption, and _if_
you hold that assumption then we might as well just keep the existing
"zero means blocking", because nobody mind.

I'd love to say "yes, we can guarantee good enough entropy for
everybody in 5s and we don't even need to warn about it, because
everybody will be comfortable with the state of our entropy at that
point".

It sounds like a _lovely_ model.

But honestly, it simply sounds unlikely.

Now, there are different kinds of unlikely.

In particular, if you actually have a CPU cycle counter that actually
runs at least on the same order of magnitude as the CPU frequency -
then I believe in the jitter entropy more than in many other cases.

Sadly, many platforms don't have that kind of cycle counter.

I've also not seen a hugely believable "yes, the jitter entropy is
real" paper. Alexander points to the existing jitterentropy crypto
code, and claims it can fill all our entropy needs in two seconds, but
there are big caveats:

 (a) that code uses get_random_entropy(), which on a PC is that nice
fast TSC that we want. On other platforms (or on really old PC's - we
technically support CPU's still that don't have rdtsc)? It might be
zero. Every time.

 (b) How was it tested? There are lots of randomness tests, but most
of them can be fooled with a simple counter through a cryptographic
hash - which you basically need to do anyway on whatever entropy
source you have in order to "whiten" it. It's simply _really_ hard to
decide on entropy.

So it's really easy to make the randomness of some input look really
good, without any real idea how good it truly is. And maybe it really
is very very good on one particular machine, and then on another one
(with either a simpler in-order core or a lower-frequency timestamp
counter) it might be horrendously bad, and you'll never know,

So I'd love to believe in your simple model. Really. I just don't see
how to get there reliably.

Matthew Garrettpointed to one analysis on jitterentropy, and that one
wasn't all that optimistic.

I do think jitterentropy would likely be good enough in practice - at
least on PC's with a TSC - for the fairly small window at boot and
getrandom(0). As I mentioned, I don't think it will make anybody
_happy_, but it might be one of those things where it's a compromise
that at least works for people, with the key generation people who are
really unhappy with it having a new option for their case.

And maybe Alexander can convince people that when you run the
jitterentropy code a hundred billion times, the end result (not the
random stream from it, but the jitter bits themselves - but I'm not
even sure how to boil it down) - really is random.

             Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 22:44                                   ` Linus Torvalds
@ 2019-09-20 23:30                                     ` Andy Lutomirski
  2019-09-21  3:05                                       ` Willy Tarreau
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-20 23:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

On Fri, Sep 20, 2019 at 3:44 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Sep 20, 2019 at 1:51 PM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > To be clear, when I say "blocking", I mean "blocks until we're ready,
> > but we make sure we're ready in a moderately timely manner".
>
> .. an I want a pony.
>
> The problem is that you start from an assumption that we simply can't
> seem to do.

Eh, fair enough, I wasn't thinking about platforms without fast clocks.

I'm very nervous about allowing getrandom(..., 0) to fail with
-EAGAIN, though.  On a very, very brief search, I didn't find any
programs that would incorrectly assume it worked, but I can easily
imagine programs crashing, and that might be bad, too.  At the end of
the day, most user programmers who call getrandom() really did notice
that we flubbed the ABI, and either they were too lazy to fall back to
/dev/urandom, or they didn't want to for some reason, or they
genuinely want the blocking behavior.  And people who work with little
embedded systems without good clocks that basically can't generate
random numbers already know this, and they have little scripts to help
out.

So I think that just improving the
getrandom()-is-blocking-on-x86-and-arm behavior, adding GRND_INSECURE
and GRND_SECURE_BLOCKING, and adding the warning if 0 is passed is
good enough.  I suppose we could also have separate
GRND_SECURE_BLOCKING and GRND_SECURE_BLOCK_FOREVER.  We could also say
that, if you want to block forever, you should poll() on /dev/random
(with my patches applied, where this actually does what users would
want).

--Andy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 23:30                                     ` Andy Lutomirski
@ 2019-09-21  3:05                                       ` Willy Tarreau
  0 siblings, 0 replies; 30+ messages in thread
From: Willy Tarreau @ 2019-09-21  3:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List,
	Linux API, linux-man

On Fri, Sep 20, 2019 at 04:30:20PM -0700, Andy Lutomirski wrote:
> So I think that just improving the
> getrandom()-is-blocking-on-x86-and-arm behavior, adding GRND_INSECURE
> and GRND_SECURE_BLOCKING, and adding the warning if 0 is passed is
> good enough.

I think so as well. Anyway, keep in mind that *with a sane API*,
userland can improve very quickly (faster than kernel deployments in
field). But userland developers need reliable and testable support for
features. If it's enough to do #ifndef GRND_xxx/#define GRND_xxx and
call getrandom() with these flags to detect support, it's basically 5
reliable lines of code to add to userland to make a warning disappear
and/or to allow a system that previously failed to boot to now boot. So
this gives strong incentive to userland to adopt the new API, provided
there's a way for the developer to understand what's happening (which
the warning does).

If we do it right, all we'll hear are userland developers complaining
that those stupid kernel developers have changed their API again and
really don't know what they want. That will be a good sign that the
warning flows back to them and that adoption is taking.

And if the change is small enough, maybe it could make sense to backport
it to stable versions to fix boot issues. With a testable feature it
does make sense.

Willy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-20 18:09                           ` Linus Torvalds
  2019-09-20 18:16                             ` Willy Tarreau
  2019-09-20 19:12                             ` Andy Lutomirski
@ 2019-09-21  6:07                             ` Florian Weimer
  2019-09-23 18:33                               ` Andy Lutomirski
  2 siblings, 1 reply; 30+ messages in thread
From: Florian Weimer @ 2019-09-21  6:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Lutomirski, Ahmed S. Darwish, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

* Linus Torvalds:

> Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> aiming for.
>
> However, it's worth noting that nobody should ever use GRND_EXPLICIT
> directly. That's just the name for the bit. The actual users would use
> GRND_INSECURE or GRND_SECURE.

Should we switch glibc's getentropy to GRND_EXPLICIT?  Or something
else?

I don't think we want to print a kernel warning for this function.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-21  6:07                             ` Florian Weimer
@ 2019-09-23 18:33                               ` Andy Lutomirski
  2019-09-26 21:11                                 ` Ahmed S. Darwish
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-23 18:33 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Linus Torvalds, Andy Lutomirski, Ahmed S. Darwish,
	Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, Willy Tarreau,
	Matthew Garrett, lkml, Ext4 Developers List, Linux API,
	linux-man

On Fri, Sep 20, 2019 at 11:07 PM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Linus Torvalds:
>
> > Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> > aiming for.
> >
> > However, it's worth noting that nobody should ever use GRND_EXPLICIT
> > directly. That's just the name for the bit. The actual users would use
> > GRND_INSECURE or GRND_SECURE.
>
> Should we switch glibc's getentropy to GRND_EXPLICIT?  Or something
> else?
>
> I don't think we want to print a kernel warning for this function.
>

Contemplating this question, I think the answer is that we should just
not introduce GRND_EXPLICIT or anything like it.  glibc is going to
have to do *something*, and getentropy() is unlikely to just go away.
The explicitly documented semantics are that it blocks if the RNG
isn't seeded.

Similarly, FreeBSD has getrandom():

https://www.freebsd.org/cgi/man.cgi?query=getrandom&sektion=2&manpath=freebsd-release-ports

and if we make getrandom(..., 0) warn, then we have a situation where
the *correct* (if regrettable) way to use the function on FreeBSD
causes a warning on Linux.

Let's just add GRND_INSECURE, make the blocking mode work better, and,
if we're feeling a bit more adventurous, add GRND_SECURE_BLOCKING as a
better replacement for 0, convince FreeBSD to add it too, and then
worry about deprecating 0 once we at least get some agreement from the
FreeBSD camp.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
       [not found]                 ` <CAHk-=wiCqDiU7SE3FLn2W26MS_voUAuqj5XFa1V_tiGTrrW-zQ@mail.gmail.com>
  2019-09-20 13:46                   ` [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2() Ahmed S. Darwish
@ 2019-09-26 20:42                   ` Ahmed S. Darwish
  2019-09-26 20:44                     ` [PATCH v5 1/1] " Ahmed S. Darwish
  1 sibling, 1 reply; 30+ messages in thread
From: Ahmed S. Darwish @ 2019-09-26 20:42 UTC (permalink / raw)
  To: Linus Torvalds, Theodore Y. Ts'o
  Cc: Florian Weimer, Willy Tarreau, Matthew Garrett, Andy Lutomirski,
	Lennart Poettering, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, lkml, linux-ext4, linux-api, linux-man

Summary / Changelog-v5:

  - Add the new flags GRND_INSECURE and GRND_SECURE_UNBOUNDED_INITIAL_WAIT
    to getrandom(2), instead of introducing a new getrandom2(2) system
    call, which nobody liked.

  - Fix a bug discovered through testing where "int ret =
    wait_event_interruptible_timeout(waitq, true, MAX_SCHEDULE_TIMEOUT)"
    returns failure (-1) due to implicit LONG_MAX => int truncation

  - WARN if a process is stuck on getrandom(,,flags=0) for more than 30
    seconds ... defconfig and bootparam configurable

  - Add documentation for "random.getrandom_wait_threshold" kernel param

  - Extra comments @ include/uapi/linux/random.h and random.c::getrandom.
    Explicit recommendations to *exclusively* use the new flags.

  - GRND_INSECURE never issue any warning, even if CRNG is not inited.
    Similarly for GRND_SECURE_UNBOUNDED_INITIAL_WAIT, no matter how
    big the unbounded wait is.

In a reply to the V4 patch, Linus posted a related patch [*] with the
following additions:

  - Drop the original random.c behavior of having each /dev/urandom
    "CRNG not inited" warning also _reset_ the crng_init_cnt entropy.

    This is not included in this patch, as IMHO this can be done as a
    separate patch on top.

 - Limit GRND_RANDOM max count/buflen to 32MB instead of 2GB.  This
   is very sane obviously, and can be done in a separate patch on
   top.

   This V5 patch just tries to be as conservative as possible.

 - GRND_WAIT_ENTROPY and GRND_EXCPLICIT: AFAIK these were primarily
   added so that getrandom(,,flags=0) can be changed to return
   weaker non-blocking crypto from non-inited CRG in a possible
   future.

   I hope we don't have to resort to that extreme measure.. Hopefully
   the WARN() on this patch will be enough in nudging distributions to
   enable more hwrng sources (RDRAND, etc.) .. and also for the
   user-space developres badly pointed at (hi GDM and Qt) to fix their
   code.

[*] https://lkml.kernel.org/r/CAHk-=wiCqDiU7SE3FLn2W26MS_voUAuqj5XFa1V_tiGTrrW-zQ@mail.gmail.com

Ahmed S. Darwish (1):
  random: getrandom(2): warn on large CRNG waits, introduce new flags

 .../admin-guide/kernel-parameters.txt         |   7 ++
 drivers/char/Kconfig                          |  60 ++++++++++-
 drivers/char/random.c                         | 102 +++++++++++++++---
 include/uapi/linux/random.h                   |  27 ++++-
 4 files changed, 177 insertions(+), 19 deletions(-)

--
2.23.0

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 1/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
  2019-09-26 20:42                   ` [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags Ahmed S. Darwish
@ 2019-09-26 20:44                     ` Ahmed S. Darwish
  2019-09-26 21:39                       ` Andy Lutomirski
  0 siblings, 1 reply; 30+ messages in thread
From: Ahmed S. Darwish @ 2019-09-26 20:44 UTC (permalink / raw)
  To: Linus Torvalds, Theodore Y. Ts'o
  Cc: Florian Weimer, Willy Tarreau, Matthew Garrett, Andy Lutomirski,
	Lennart Poettering, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, lkml, linux-ext4, linux-api, linux-man

Since Linux v3.17, getrandom(2) has been created as a new and more
secure interface for pseudorandom data requests.  It attempted to
solve three problems, as compared to /dev/urandom:

  1. the need to access filesystem paths, which can fail, e.g. under a
     chroot

  2. the need to open a file descriptor, which can fail under file
     descriptor exhaustion attacks

  3. the possibility of getting not-so-random data from /dev/urandom,
     due to an incompletely initialized kernel entropy pool

To solve the third point, getrandom(2) was made to block until a
proper amount of entropy has been accumulated to initialize the CRNG
ChaCha20 cipher.  This made the system call have no guaranteed
upper-bound for its initial waiting time.

Thus when it was introduced at c6e9d6f38894 ("random: introduce
getrandom(2) system call"), it came with a clear warning: "Any
userspace program which uses this new functionality must take care to
assure that if it is used during the boot process, that it will not
cause the init scripts or other portions of the system startup to hang
indefinitely."

Unfortunately, due to multiple factors, including not having this
warning written in a scary-enough language in the manpages, and due to
glibc since v2.25 implementing a BSD-like getentropy(3) in terms of
getrandom(2), modern user-space is calling getrandom(2) in the boot
path everywhere (e.g. Qt, GDM, etc.)

Embedded Linux systems were first hit by this, and reports of embedded
systems "getting stuck at boot" began to be common.  Over time, the
issue began to even creep into consumer-level x86 laptops: mainstream
distributions, like Debian Buster, began to recommend installing
haveged as a duct-tape workaround... just to let the system boot.

Moreover, filesystem optimizations in EXT4 and XFS, e.g. b03755ad6f33
("ext4: make __ext4_get_inode_loc plug"), which merged directory
lookup code inode table IO, and very fast systemd boots, further
exaggerated the problem by limiting interrupt-based entropy sources.
This led to large delays until the kernel's cryptographic random
number generator (CRNG) got initialized.

On a Thinkpad E480 x86 laptop and an ArchLinux user-space, the ext4
commit earlier mentioned reliably blocked the system on GDM boot.
Mitigate the problem, as a first step, in two ways:

  1. Issue a big WARN_ON when any process gets stuck on getrandom(2)
     for more than CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC seconds.

  2. Introduce new getrandom(2) flags, with clear semantics that can
     hopefully guide user-space in doing the right thing.

Set CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC to a heuristic 30-second
default value. System integrators and distribution builders are deeply
encouraged not to increase it much: during system boot, you either
have entropy, or you don't. And if you didn't have entropy, it will
stay like this forever, because if you had, you wouldn't have blocked
in the first place. It's an atomic "either/or" situation, with no
middle ground. Please think twice.

For the new getrandom(2) flags, be much more explicit.  As Linus
mentioned several times in the bug report thread, Linux should've
never provided /dev/random and the getrandom(GRND_RANDOM) APIs. These
interfaces are broken by design due to their almost-permanent
blockage, leading to the current misuse of /dev/urandom and
getrandom(flags=0) calls. Thus introduce the flags:

  1. GRND_INSECURE
  2. GRND_SECURE_UNBOUNDED_INITIAL_WAIT

where both extract randomness _exclusively_ from the urandom source.

Due to the explicit semantics of these new flags, GRND_INSECURE will
never issue a kernel warning message even if the CRNG is not yet
inited.  Similarly, GRND_SECURE_UNBOUNDED_INITIAL_WAIT will never
cause any any kernel WARN, no matter how large the unbounded wait is.

Rreported-by: Ahmed S. Darwish <darwish.07@gmail.com>
Link: https://lkml.kernel.org/r/20190910042107.GA1517@darwi-home-pc
Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
Link: https://lkml.kernel.org/r/20190914222432.GC19710@mit.edu
Link: https://lkml.kernel.org/r/20180514003034.GI14763@thunk.org
Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
Link: https://lkml.kernel.org/r/20190917052438.GA26923@1wt.eu
Link: https://lkml.kernel.org/r/20190917160844.GC31567@gardel-login
Link: https://lkml.kernel.org/r/CAHk-=wjABG3+daJFr4w3a+OWuraVcZpi=SMUg=pnZ+7+O0E2FA@mail.gmail.com
Link: https://lkml.kernel.org/r/CAHk-=wjQeiYu8Q_wcMgM-nAcW7KsBfG1+90DaTD5WF2cCeGCgA@mail.gmail.com
Link: https://factorable.net ("Widespread Weak Keys in Network Devices")
Link: https://man.openbsd.org/man4/random.4
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---
 .../admin-guide/kernel-parameters.txt         |   7 ++
 drivers/char/Kconfig                          |  60 ++++++++++-
 drivers/char/random.c                         | 102 +++++++++++++++---
 include/uapi/linux/random.h                   |  27 ++++-
 4 files changed, 177 insertions(+), 19 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6ef205fd7c97..d82eafc6a62a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3728,6 +3728,13 @@
 			fully seed the kernel's CRNG. Default is controlled
 			by CONFIG_RANDOM_TRUST_CPU.

+	random.getrandom_wait_threshold=
+			Maximum amount, in seconds, for a process to block
+			in a getrandom(,,flags=0) systemcall without a loud
+			warning in the kernel logs. Default is controlled by
+			CONFIG_RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC. Check
+			the config option help text for more information.
+
 	ras=option[,option,...]	[KNL] RAS-specific options

 		cec_disable	[X86]
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index df0fc997dc3e..adc9bc63d27c 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -535,8 +535,6 @@ config ADI
 	  and SSM (Silicon Secured Memory).  Intended consumers of this
 	  driver include crash and makedumpfile.

-endmenu
-
 config RANDOM_TRUST_CPU
 	bool "Trust the CPU manufacturer to initialize Linux's CRNG"
 	depends on X86 || S390 || PPC
@@ -559,4 +557,60 @@ config RANDOM_TRUST_BOOTLOADER
 	device randomness. Say Y here to assume the entropy provided by the
 	booloader is trustworthy so it will be added to the kernel's entropy
 	pool. Otherwise, say N here so it will be regarded as device input that
-	only mixes the entropy pool.
\ No newline at end of file
+	only mixes the entropy pool.
+
+config RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC
+	int
+	default 30
+	help
+	  The getrandom(2) system call, when asking for entropy from the
+	  urandom source, blocks until the kernel's Cryptographic Random
+	  Number Generator (CRNG) gets initialized. This configuration
+	  option sets the maximum wait time, in seconds, for a process
+	  to get blocked on such a system call before the kernel issues
+	  a loud warning. Rationale follows:
+
+	  When the getrandom(2) system call was created, it came with
+	  the clear warning: "Any userspace program which uses this new
+	  functionality must take care to assure that if it is used
+	  during the boot process, that it will not cause the init
+	  scripts or other portions of the system startup to hang
+	  indefinitely.
+
+	  Unfortunately, due to multiple factors, including not having
+	  this warning written in a scary-enough language in the
+	  manpages, and due to glibc since v2.25 implementing a BSD-like
+	  getentropy(3) in terms of getrandom(2), modern user-space is
+	  calling getrandom(2) in the boot path everywhere.
+
+	  Embedded Linux systems were first hit by this, and reports of
+	  embedded system "getting stuck at boot" began to be
+	  common. Over time, the issue began to even creep into consumer
+	  level x86 laptops: mainstream distributions, like Debian
+	  Buster, began to recommend installing haveged as a workaround,
+	  just to let the system boot.
+
+	  Filesystem optimizations in EXT4 and XFS exaggerated the
+	  problem, due to aggressive batching of IO requests, and thus
+	  minimizing sources of entropy at boot. This led to large
+	  delays until the kernel's CRNG got initialized.
+
+	  System integrators and distribution builders are not
+	  encouraged to considerably increase this value: during system
+	  boot, you either have entropy, or you don't. And if you didn't
+	  have entropy, it will stay like this forever, because if you
+	  had, you wouldn't have blocked in the first place. It's an
+	  atomic "either/or" situation, with no middle ground. Please
+	  think twice.
+
+	  Ideally, systems would be configured with hardware random
+	  number generators, and/or configured to trust the CPU-provided
+	  RNG's (CONFIG_RANDOM_TRUST_CPU) or boot-loader provided ones
+	  (CONFIG_RANDOM_TRUST_BOOTLOADER).  In addition, userspace
+	  should generate cryptographic keys only as late as possible,
+	  when they are needed, instead of during early boot.  For
+	  non-cryptographic use cases, such as dictionary seeds or MIT
+	  Magic Cookies, the getrandom2(GRND2_INSECURE) system call,
+	  or even random(3), may be more appropriate.
+
+endmenu
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 566922df4b7b..37c00cff1c08 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -322,6 +322,7 @@
 #include <linux/interrupt.h>
 #include <linux/mm.h>
 #include <linux/nodemask.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
 #include <linux/percpu.h>
@@ -854,12 +855,21 @@ static void invalidate_batched_entropy(void);
 static void numa_crng_init(void);

 static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static int getrandom_wait_threshold __ro_after_init =
+				CONFIG_RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC;
+
 static int __init parse_trust_cpu(char *arg)
 {
 	return kstrtobool(arg, &trust_cpu);
 }
 early_param("random.trust_cpu", parse_trust_cpu);

+static int __init parse_getrandom_wait_threshold(char *arg)
+{
+	return kstrtoint(arg, 0, &getrandom_wait_threshold);
+}
+early_param("random.getrandom_wait_threshold", parse_getrandom_wait_threshold);
+
 static void crng_initialize(struct crng_state *crng)
 {
 	int		i;
@@ -1960,7 +1970,7 @@ random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 }

 static ssize_t
-urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+_urandom_read(char __user *buf, size_t nbytes, bool warn_on_noninited_crng)
 {
 	unsigned long flags;
 	static int maxwarn = 10;
@@ -1968,7 +1978,7 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)

 	if (!crng_ready() && maxwarn > 0) {
 		maxwarn--;
-		if (__ratelimit(&urandom_warning))
+		if (warn_on_noninited_crng && __ratelimit(&urandom_warning))
 			printk(KERN_NOTICE "random: %s: uninitialized "
 			       "urandom read (%zd bytes read)\n",
 			       current->comm, nbytes);
@@ -1982,6 +1992,13 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 	return ret;
 }

+static ssize_t
+urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+{
+	/* warn on non-inited CRNG */
+	return _urandom_read(buf, nbytes, true);
+}
+
 static __poll_t
 random_poll(struct file *file, poll_table * wait)
 {
@@ -2118,13 +2135,55 @@ const struct file_operations urandom_fops = {
 	.llseek = noop_llseek,
 };

+static int geturandom_wait(char __user *buf, size_t count,
+			   bool warn_on_large_wait)
+{
+	long ret, timeout = MAX_SCHEDULE_TIMEOUT;
+
+	if (warn_on_large_wait && (getrandom_wait_threshold > 0))
+		timeout = HZ * getrandom_wait_threshold;
+
+	do {
+		ret = wait_event_interruptible_timeout(crng_init_wait,
+						       crng_ready(),
+						       timeout);
+		if (ret < 0)
+			return ret;
+
+		if (ret == 0) {
+			WARN(1, "random: %s[%d]: getrandom(%zu bytes) "
+			     "is blocked for more than %d seconds. Check "
+			     "getrandom_wait(7)\n", current->comm,
+			     task_pid_nr(current), count,
+			     getrandom_wait_threshold);
+
+			/* warn once per caller */
+			timeout = MAX_SCHEDULE_TIMEOUT;
+		}
+
+	} while (ret == 0);
+
+	return _urandom_read(buf, count, true);
+}
+
 SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 		unsigned int, flags)
 {
-	int ret;
+	unsigned int i, invalid_combs[] = {
+		GRND_INSECURE|GRND_SECURE_UNBOUNDED_INITIAL_WAIT,
+		GRND_INSECURE|GRND_RANDOM,
+	};

-	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
+	if (flags & ~(GRND_NONBLOCK | \
+		      GRND_RANDOM   | \
+		      GRND_INSECURE | \
+		      GRND_SECURE_UNBOUNDED_INITIAL_WAIT)) {
 		return -EINVAL;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(invalid_combs); i++)
+		if ((flags & invalid_combs[i]) == invalid_combs[i])
+			return -EINVAL;

 	if (count > INT_MAX)
 		count = INT_MAX;
@@ -2132,14 +2191,33 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (flags & GRND_RANDOM)
 		return _random_read(flags & GRND_NONBLOCK, buf, count);

-	if (!crng_ready()) {
-		if (flags & GRND_NONBLOCK)
+	/*
+	 * urandom: explicit request *not* to wait for CRNG init, and
+	 * thus no "uninitialized urandom read" warnings.
+	 */
+	if (flags & GRND_INSECURE)
+		return _urandom_read(buf, count, false);
+
+	/* urandom: nonblocking access */
+	if ((flags & GRND_NONBLOCK) && !crng_ready())
 			return -EAGAIN;
-		ret = wait_for_random_bytes();
-		if (unlikely(ret))
-			return ret;
-	}
-	return urandom_read(NULL, buf, count, NULL);
+
+	/*
+	 * urandom: explicit request *to* wait for CRNG init, and thus
+	 * no "getrandom is blocked for more than X seconds" warnings
+	 * on large waits.
+	 */
+	if (flags & GRND_SECURE_UNBOUNDED_INITIAL_WAIT)
+		return geturandom_wait(buf, count, false);
+
+	/*
+	 * urandom: *implicit* request to wait for CRNG init (flags=0)
+	 *
+	 * User-space has been badly abusing this by calling getrandom
+	 * with flags=0 in the boot path, and thus blocking system
+	 * boots forever in absence of entropy. Warn on large waits.
+	 */
+	return geturandom_wait(buf, count, true);
 }

 /********************************************************************
@@ -2458,4 +2536,4 @@ void add_bootloader_randomness(const void *buf, unsigned int size)
 	else
 		add_device_randomness(buf, size);
 }
-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
\ No newline at end of file
+EXPORT_SYMBOL_GPL(add_bootloader_randomness);
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index 26ee91300e3e..5a3df92270a7 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -8,6 +8,7 @@
 #ifndef _UAPI_LINUX_RANDOM_H
 #define _UAPI_LINUX_RANDOM_H

+#include <linux/bits.h>
 #include <linux/types.h>
 #include <linux/ioctl.h>
 #include <linux/irqnr.h>
@@ -23,7 +24,7 @@
 /* Get the contents of the entropy pool.  (Superuser only.) */
 #define RNDGETPOOL	_IOR( 'R', 0x02, int [2] )

-/*
+/*
  * Write bytes into the entropy pool and add to the entropy count.
  * (Superuser only.)
  */
@@ -47,10 +48,28 @@ struct rand_pool_info {
 /*
  * Flags for getrandom(2)
  *
+ * 0			discouraged - don't use (see below)
  * GRND_NONBLOCK	Don't block and return EAGAIN instead
- * GRND_RANDOM		Use the /dev/random pool instead of /dev/urandom
+ * GRND_RANDOM		discouraged - don't use (uses /dev/random pool)
+ * GRND_INSECURE	Use urandom pool, never block even if CRNG isn't inited
+ * GRND_SECURE_UNBOUNDED_INITIAL_WAIT
+ *			Use urandom pool, block until CRNG is inited
+ *
+ * User-space has been badly abusing getrandom(flags=0) by calling
+ * it in the boot path, and thus blocking system boots forever in
+ * the absence of entropy (a blocked system cannot generate more
+ * entropy, by definition).
+ *
+ * Thus if a process blocks on a getrandom(flags=0), waithing for
+ * more than CONFIG_RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC seconds,
+ * the kernel will issue a loud warning.
+ *
+ * In general, don't use flags=0. Always use either GRND_INSECURE
+ * or GRND_SECURE_UNBOUNDED_INITIAL_WAIT instead.
  */
-#define GRND_NONBLOCK	0x0001
-#define GRND_RANDOM	0x0002
+#define GRND_NONBLOCK				BIT(0)
+#define GRND_RANDOM				BIT(1)
+#define GRND_INSECURE				BIT(2)
+#define GRND_SECURE_UNBOUNDED_INITIAL_WAIT	BIT(3)

 #endif /* _UAPI_LINUX_RANDOM_H */
--
2.23.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-23 18:33                               ` Andy Lutomirski
@ 2019-09-26 21:11                                 ` Ahmed S. Darwish
  0 siblings, 0 replies; 30+ messages in thread
From: Ahmed S. Darwish @ 2019-09-26 21:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Florian Weimer, Linus Torvalds, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

On Mon, Sep 23, 2019 at 11:33:21AM -0700, Andy Lutomirski wrote:
> On Fri, Sep 20, 2019 at 11:07 PM Florian Weimer <fweimer@redhat.com> wrote:
> >
> > * Linus Torvalds:
> >
> > > Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> > > aiming for.
> > >
> > > However, it's worth noting that nobody should ever use GRND_EXPLICIT
> > > directly. That's just the name for the bit. The actual users would use
> > > GRND_INSECURE or GRND_SECURE.
> >
> > Should we switch glibc's getentropy to GRND_EXPLICIT?  Or something
> > else?
> >
> > I don't think we want to print a kernel warning for this function.
> >
> 
> Contemplating this question, I think the answer is that we should just
> not introduce GRND_EXPLICIT or anything like it.  glibc is going to
> have to do *something*, and getentropy() is unlikely to just go away.
> The explicitly documented semantics are that it blocks if the RNG
> isn't seeded.
> 
> Similarly, FreeBSD has getrandom():
> 
> https://www.freebsd.org/cgi/man.cgi?query=getrandom&sektion=2&manpath=freebsd-release-ports
> 
> and if we make getrandom(..., 0) warn, then we have a situation where
> the *correct* (if regrettable) way to use the function on FreeBSD
> causes a warning on Linux.
> 
> Let's just add GRND_INSECURE, make the blocking mode work better, and,
> if we're feeling a bit more adventurous, add GRND_SECURE_BLOCKING as a
> better replacement for 0, ...

This is what's now done in the just-submitted V5, except the "make the
blocking mode work better" part:

    https://lkml.kernel.org/r/20190926204217.GA1366@pc

It's a very conservative patch so far IMHO (minus the loud warning).

Thanks,
--
Ahmed Darwish

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 1/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
  2019-09-26 20:44                     ` [PATCH v5 1/1] " Ahmed S. Darwish
@ 2019-09-26 21:39                       ` Andy Lutomirski
  2019-09-28  9:30                         ` Ahmed S. Darwish
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2019-09-26 21:39 UTC (permalink / raw)
  To: Ahmed S. Darwish, Linus Torvalds, Theodore Y. Ts'o
  Cc: Florian Weimer, Willy Tarreau, Matthew Garrett,
	Lennart Poettering, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, lkml, linux-ext4, linux-api, linux-man

On 9/26/19 1:44 PM, Ahmed S. Darwish wrote:
> Since Linux v3.17, getrandom(2) has been created as a new and more
> secure interface for pseudorandom data requests.  It attempted to
> solve three problems, as compared to /dev/urandom:
> 
>    1. the need to access filesystem paths, which can fail, e.g. under a
>       chroot
> 
>    2. the need to open a file descriptor, which can fail under file
>       descriptor exhaustion attacks
> 
>    3. the possibility of getting not-so-random data from /dev/urandom,
>       due to an incompletely initialized kernel entropy pool
> 
> To solve the third point, getrandom(2) was made to block until a
> proper amount of entropy has been accumulated to initialize the CRNG
> ChaCha20 cipher.  This made the system call have no guaranteed
> upper-bound for its initial waiting time.
> 
> Thus when it was introduced at c6e9d6f38894 ("random: introduce
> getrandom(2) system call"), it came with a clear warning: "Any
> userspace program which uses this new functionality must take care to
> assure that if it is used during the boot process, that it will not
> cause the init scripts or other portions of the system startup to hang
> indefinitely."
> 
> Unfortunately, due to multiple factors, including not having this
> warning written in a scary-enough language in the manpages, and due to
> glibc since v2.25 implementing a BSD-like getentropy(3) in terms of
> getrandom(2), modern user-space is calling getrandom(2) in the boot
> path everywhere (e.g. Qt, GDM, etc.)
> 
> Embedded Linux systems were first hit by this, and reports of embedded
> systems "getting stuck at boot" began to be common.  Over time, the
> issue began to even creep into consumer-level x86 laptops: mainstream
> distributions, like Debian Buster, began to recommend installing
> haveged as a duct-tape workaround... just to let the system boot.
> 
> Moreover, filesystem optimizations in EXT4 and XFS, e.g. b03755ad6f33
> ("ext4: make __ext4_get_inode_loc plug"), which merged directory
> lookup code inode table IO, and very fast systemd boots, further
> exaggerated the problem by limiting interrupt-based entropy sources.
> This led to large delays until the kernel's cryptographic random
> number generator (CRNG) got initialized.
> 
> On a Thinkpad E480 x86 laptop and an ArchLinux user-space, the ext4
> commit earlier mentioned reliably blocked the system on GDM boot.
> Mitigate the problem, as a first step, in two ways:
> 
>    1. Issue a big WARN_ON when any process gets stuck on getrandom(2)
>       for more than CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC seconds.
> 
>    2. Introduce new getrandom(2) flags, with clear semantics that can
>       hopefully guide user-space in doing the right thing.
> 
> Set CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC to a heuristic 30-second
> default value. System integrators and distribution builders are deeply
> encouraged not to increase it much: during system boot, you either
> have entropy, or you don't. And if you didn't have entropy, it will
> stay like this forever, because if you had, you wouldn't have blocked
> in the first place. It's an atomic "either/or" situation, with no
> middle ground. Please think twice.

So what do we expect glibc's getentropy() to do?  If it just adds the 
new flag to shut up the warning, we haven't really accomplished much.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 1/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
  2019-09-26 21:39                       ` Andy Lutomirski
@ 2019-09-28  9:30                         ` Ahmed S. Darwish
  0 siblings, 0 replies; 30+ messages in thread
From: Ahmed S. Darwish @ 2019-09-28  9:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Theodore Y. Ts'o, Florian Weimer,
	Willy Tarreau, Matthew Garrett, Lennart Poettering,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk, lkml,
	linux-ext4, linux-api, linux-man

On Thu, Sep 26, 2019 at 02:39:44PM -0700, Andy Lutomirski wrote:
> On 9/26/19 1:44 PM, Ahmed S. Darwish wrote:
> > Since Linux v3.17, getrandom(2) has been created as a new and more
> > secure interface for pseudorandom data requests.  It attempted to
> > solve three problems, as compared to /dev/urandom:
> > 
> >    1. the need to access filesystem paths, which can fail, e.g. under a
> >       chroot
> > 
> >    2. the need to open a file descriptor, which can fail under file
> >       descriptor exhaustion attacks
> > 
> >    3. the possibility of getting not-so-random data from /dev/urandom,
> >       due to an incompletely initialized kernel entropy pool
> > 
> > To solve the third point, getrandom(2) was made to block until a
> > proper amount of entropy has been accumulated to initialize the CRNG
> > ChaCha20 cipher.  This made the system call have no guaranteed
> > upper-bound for its initial waiting time.
> > 
> > Thus when it was introduced at c6e9d6f38894 ("random: introduce
> > getrandom(2) system call"), it came with a clear warning: "Any
> > userspace program which uses this new functionality must take care to
> > assure that if it is used during the boot process, that it will not
> > cause the init scripts or other portions of the system startup to hang
> > indefinitely."
> > 
> > Unfortunately, due to multiple factors, including not having this
> > warning written in a scary-enough language in the manpages, and due to
> > glibc since v2.25 implementing a BSD-like getentropy(3) in terms of
> > getrandom(2), modern user-space is calling getrandom(2) in the boot
> > path everywhere (e.g. Qt, GDM, etc.)
> > 
> > Embedded Linux systems were first hit by this, and reports of embedded
> > systems "getting stuck at boot" began to be common.  Over time, the
> > issue began to even creep into consumer-level x86 laptops: mainstream
> > distributions, like Debian Buster, began to recommend installing
> > haveged as a duct-tape workaround... just to let the system boot.
> > 
> > Moreover, filesystem optimizations in EXT4 and XFS, e.g. b03755ad6f33
> > ("ext4: make __ext4_get_inode_loc plug"), which merged directory
> > lookup code inode table IO, and very fast systemd boots, further
> > exaggerated the problem by limiting interrupt-based entropy sources.
> > This led to large delays until the kernel's cryptographic random
> > number generator (CRNG) got initialized.
> > 
> > On a Thinkpad E480 x86 laptop and an ArchLinux user-space, the ext4
> > commit earlier mentioned reliably blocked the system on GDM boot.
> > Mitigate the problem, as a first step, in two ways:
> > 
> >    1. Issue a big WARN_ON when any process gets stuck on getrandom(2)
> >       for more than CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC seconds.
> > 
> >    2. Introduce new getrandom(2) flags, with clear semantics that can
> >       hopefully guide user-space in doing the right thing.
> > 
> > Set CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC to a heuristic 30-second
> > default value. System integrators and distribution builders are deeply
> > encouraged not to increase it much: during system boot, you either
> > have entropy, or you don't. And if you didn't have entropy, it will
> > stay like this forever, because if you had, you wouldn't have blocked
> > in the first place. It's an atomic "either/or" situation, with no
> > middle ground. Please think twice.
> 
> So what do we expect glibc's getentropy() to do?  If it just adds the new
> flag to shut up the warning, we haven't really accomplished much.

Yes, if glibc adds GRND_SECURE_UNBOUNDED_INITIAL_WAIT to gentropy(3),
then this exercise would indeed be invalidated. Hopefully,
coordination with glibc will be done so it won't happen... @Florian?

Afterwards, a sane approach would be for gentropy(3) to be deprecated,
and to add getentropy_secure_unbounded_initial_wait(3) and
getentropy_insecure(3).

Note that this V5 patch does not claim to fully solve the problem, but
it will:

  1. Pinpoint to the processes causing system boots to block
  
  2. Tell people what correct alternative to use when facing problem
     #1 above, through the proposed getrandom_wait(7) manpage. That
     manpage page will fully describe the problem, and advise
     user-space to either use the new getrandom flags, or the new
     glibc gentropy_*() variants.

thanks,

--
Ahmed Darwish

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-09-28  9:30 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190912034421.GA2085@darwi-home-pc>
     [not found] ` <20190912082530.GA27365@mit.edu>
     [not found]   ` <CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com>
     [not found]     ` <20190914122500.GA1425@darwi-home-pc>
     [not found]       ` <008f17bc-102b-e762-a17c-e2766d48f515@gmail.com>
     [not found]         ` <20190915052242.GG19710@mit.edu>
     [not found]           ` <CAHk-=wgg2T=3KxrO-BY3nHJgMEyApjnO3cwbQb_0vxsn9qKN8Q@mail.gmail.com>
     [not found]             ` <20190918211503.GA1808@darwi-home-pc>
     [not found]               ` <20190918211713.GA2225@darwi-home-pc>
     [not found]                 ` <CAHk-=wiCqDiU7SE3FLn2W26MS_voUAuqj5XFa1V_tiGTrrW-zQ@mail.gmail.com>
2019-09-20 13:46                   ` [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2() Ahmed S. Darwish
2019-09-20 14:33                     ` Andy Lutomirski
2019-09-20 16:29                       ` Linus Torvalds
2019-09-20 17:52                         ` Andy Lutomirski
2019-09-20 18:09                           ` Linus Torvalds
2019-09-20 18:16                             ` Willy Tarreau
2019-09-20 19:12                             ` Andy Lutomirski
2019-09-20 19:51                               ` Linus Torvalds
2019-09-20 20:11                                 ` Alexander E. Patrakov
2019-09-20 20:17                                 ` Matthew Garrett
2019-09-20 20:51                                 ` Andy Lutomirski
2019-09-20 22:44                                   ` Linus Torvalds
2019-09-20 23:30                                     ` Andy Lutomirski
2019-09-21  3:05                                       ` Willy Tarreau
2019-09-21  6:07                             ` Florian Weimer
2019-09-23 18:33                               ` Andy Lutomirski
2019-09-26 21:11                                 ` Ahmed S. Darwish
2019-09-20 18:12                           ` Willy Tarreau
2019-09-20 19:22                             ` Andy Lutomirski
2019-09-20 19:37                               ` Willy Tarreau
2019-09-20 19:52                                 ` Andy Lutomirski
2019-09-20 20:02                               ` Linus Torvalds
2019-09-20 18:15                           ` Alexander E. Patrakov
2019-09-20 18:29                             ` Andy Lutomirski
2019-09-20 17:26                     ` Willy Tarreau
2019-09-20 17:56                       ` Ahmed S. Darwish
2019-09-26 20:42                   ` [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags Ahmed S. Darwish
2019-09-26 20:44                     ` [PATCH v5 1/1] " Ahmed S. Darwish
2019-09-26 21:39                       ` Andy Lutomirski
2019-09-28  9:30                         ` Ahmed S. Darwish

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).