16.09.2019 22:21, Theodore Y. Ts'o wrote:
> On Mon, Sep 16, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
>> So the semantics that getrandom() should have had are:
>>
>>   getrandom(0) - just give me reasonable random numbers for any of a
>> million non-strict-long-term-security use (ie the old urandom)
>>
>>      - the nonblocking flag makes no sense here and would be a no-op
> 
> That change is what I consider highly problematic.  There are a *huge*
> number of applications which use cryptography which assumes that
> getrandom(0) means, "I'm guaranteed to get something safe for
> cryptographic use".  Changing this now would expose a very large number
> of applications to be insecure.  Part of the problem here is that
> there are many different actors.  There is the application or
> cryptographic library developer, who may want to be sure they have
> cryptographically secure random numbers.  They are the ones who will
> select getrandom(0).
> 
> Then you have the distribution or consumer-grade electronics
> developers who may choose to run them too early in some init script or
> systemd unit files.  And some of these people may do something stupid,
> like run things too early, or omit a hardware random number
> generator in their design, even though it's for a security critical
> purpose (say, a digital wallet for bitcoin).  Because some of these
> people might do something stupid, one argument (not mine) is that we
> must therefore not let getrandom() block.  But doing this penalizes
> the security of all the users of the application, not just the stupid
> ones.

On Linux, there is no such thing as "too early", that's the problem.

First, we already had one lesson about this, regarding applications that 
require libraries from /usr. There, it was due to various programs that 
run from udev rules, and dynamic/unpredictable dependencies. See 
https://freedesktop.org/wiki/Software/systemd/separate-usr-is-broken/, 
almost all arguments from there apply 1:1 here.

Second, people/distributions put unexpected stuff into their initramfs 
images, and we cannot say that they have no right to do so. E.g., on my 
system that's "cryptsetup" that unlocks the root partition, but manages 
to read a few bytes of uninitialized urandom before that. A warning here 
is almost unavoidable, and thus will be treated as SPAM.

No such considerations apply to OpenBSD (initramfs does not exist, and 
there is no equivalent of udev that reacts to cold-plug events by 
running programs), that's why the getentropy() design works there.

If we were to fix it, we should focus on making true entropy available 
unconditionally, even before /init in the initramfs starts, and warn not 
on the first access to urandom, but on the exec of /init. Look - 
distributions are already running "haveged" which harvests entropy from 
clock jitter. And they still manage to do it wrong (regardless of whether 
the "haveged" idea is wrong by itself), by running it too late (at least 
I don't know any kind of stock initramfs with either it or rngd 
included). So it's too complex, and needs to be simplified.

The kernel already has jitterentropy-rng, which uses the same idea as 
"haveged" but, alas, is exposed as a crypto rng algorithm, not as a 
hwrng. I think that is a bug: cryptoapi rng algorithms are for things 
that get a seed and generate random numbers by rehashing it over and 
over, while jitterentropy-rng requires no seed. Would a patch be 
accepted to convert it to hwrng? (This is essentially the reverse of 
what commit c46ea13 did for exynos-rng.)

> 
>>   getrandom(GRND_RANDOM) - get me actual _secure_ random numbers with
>> blocking until entropy pool fills (but not the completely invalid
>> entropy decrease accounting)
>>
>>      - the nonblocking flag is useful for bootup and for "I will
>> actually try to generate entropy".
>>
>> and both of those are very very sensible actions. That would actually
>> have _fixed_ the problems we had with /dev/[u]random, both from a
>> performance standpoint and for a filesystem access standpoint.
>>
>> But that is sadly not what we have right now.
>>
>> And I suspect we can't fix it, since people have grown to depend on
>> the old behavior, and already know to avoid GRND_RANDOM because it's
>> useless with old kernels even if we fixed it with new ones.
> 
> I don't think we can fix it, because it's the changing of
> getrandom(0)'s behavior which is the problem, not GRND_RANDOM.  People
> *expect* getrandom(0) to always return secure results.  I don't think
> we can make it sometimes return not-necessarily secure results
> depending on when the systems integrator or distribution decides to
> run the application, and depending on the hardware platform (yes,
> traditional x86 systems are probably fine, and fortunately x86
> embedded CPU are too expensive and have lousy power management, so no
> one really uses x86 for embedded yet, despite Intel's best efforts).
> That would just be a purely irresponsible thing to do, IMO.
> 
>> Does anybody really seriously debate the above? Ted? Are you seriously
>> trying to claim that the existing GRND_RANDOM has any sensible use?
>> Are you seriously trying to claim that the fact that we don't have a
>> sane urandom source is a "feature"?
> 
> There are people who can debate that GRND_RANDOM has any sensible use
> cases.  GPG uses /dev/random, and that was a fully informed choice.
> I'm not convinced, because I think that at least for now the CRNG is
> perfectly fine for 99.999% of the use cases.  Yes, in a post-quantum
> cryptography world, the CRNG might be screwed --- but so will most of
> the other cryptographic algorithms in the kernel.  So if anyone ever
> gets post-quantum cryptoanalytic attacks working, the use of the CRNG
> is going to be least of our problems.
> 
> As I mentioned to you in Lisbon, I've been going back and forth about
> whether or not to rip out the entire /dev/random infrastructure,
> mainly for code maintainability reasons.  The only reason why I've
> been holding back is because there are (very few) non-insane people
> who do want to use it.  There are also a much larger number of rational
> people who use it because they want some insane PCI compliance labs to
> go away.  What I suspect most of them are actually doing in practice
> is they use /dev/random, but they also use a hardware random number
> generator so /dev/random never actually blocks in practice.  The use
> of /dev/random is enough to make the PCI compliance lab go away, and
> the hardware random number generator (or virtio-rng on a VM) makes
> /dev/random useable.

Please don't forget about people who run Linux on Hyper-V, not on KVM, 
and thus have no access to virtio-rng ;)

> 
> But I don't think we can reuse GRND_RANDOM for that reason.
> 
> We could create a new flag, GRND_INSECURE, which never blocks.  And
> that allows us to solve the problem for silly applications that
> are using getrandom(2) for non-cryptographic use cases.  Use cases
> might include Python dictionary seeds, gdm for MIT Magic Cookie, UUID
> generation where best efforts probably is good enough, etc.  The
> answer today is they should just use /dev/urandom, since that exists
> today, and we have to support it for backwards compatibility anyway.
> It sounds like gdm recently switched to getrandom(2), and I suspect
> that it's going to get caught on some hardware configs anyway, even
> without the ext4 optimization patch.  So I suspect gdm will switch
> back to /dev/urandom, and this particular pain point will probably go
> away.
> 
> 						- Ted
> 

Well, at this point, I see that there is a lot of disagreement about how 
getrandom() should behave, aggravated by the baggage of existing 
applications and libraries with contradictory requirements regarding 
getrandom(0) (so not really solvable). I am almost convinced that we 
might want to return -ENOSYS unconditionally, and create a different 
system call with sane flags.

-- 
Alexander E. Patrakov