linux-kernel.vger.kernel.org archive mirror
From: Willy Tarreau <w@1wt.eu>
To: Jann Horn <jannh@google.com>
Cc: Colm MacCarthaigh <colmmacc@amazon.com>,
	"Catangiu, Adrian Costin" <acatan@amazon.com>,
	Andy Lutomirski <luto@kernel.org>,
	Jason Donenfeld <Jason@zx2c4.com>,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	Eric Biggers <ebiggers@kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	"open list:VIRTIO GPU DRIVER" 
	<virtualization@lists.linux-foundation.org>, "Graf (AWS),
	Alexander" <graf@amazon.de>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	bonzini@gnu.org, "Singh, Balbir" <sblbir@amazon.com>,
	"Weiss, Radu" <raduweis@amazon.com>,
	oridgar@gmail.com, ghammer@redhat.com,
	Jonathan Corbet <corbet@lwn.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	KVM list <kvm@vger.kernel.org>, Michal Hocko <mhocko@kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Pavel Machek <pavel@ucw.cz>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver
Date: Sat, 17 Oct 2020 08:44:42 +0200
Message-ID: <20201017064442.GA14117@1wt.eu>
In-Reply-To: <CAG48ez1h0ynXfGap_KiHiPVTfcB8NBQJ-2dnj08ZNfuhrW0jWA@mail.gmail.com>

On Sat, Oct 17, 2020 at 07:52:48AM +0200, Jann Horn wrote:
> On Sat, Oct 17, 2020 at 7:37 AM Willy Tarreau <w@1wt.eu> wrote:
> > On Sat, Oct 17, 2020 at 07:01:31AM +0200, Jann Horn wrote:
> > > Microsoft's documentation
> > > (http://go.microsoft.com/fwlink/?LinkId=260709) says that the VM
> > > Generation ID that we get after a fork "is a 128-bit,
> > > cryptographically random integer value". If multiple people use the
> > > same image, it guarantees that each use of the image gets its own,
> > > fresh ID:
> >
> > No. It cannot be more unique than the source that feeds that cryptographic
> > transformation. All it guarantees is that the entropy source is protected
> > from being guessed based on the output. Applying cryptography on a simple
> > counter provides apparently random numbers that will be unique for a long
> > period for the same source, but as soon as you duplicate that code between
> > users and they start from the same counter they'll get the same IDs.
> >
> > This is why I think that using a counter is better if you really need something
> > unique. Randoms only reduce predictability which helps avoiding collisions.
> 
> Microsoft's spec tells us that they're giving us cryptographically
> random numbers. Where they're getting those from is not our problem.
> (And if even the hypervisor is not able to collect enough entropy to
> securely generate random numbers, worrying about RNG reseeding in the
> guest would be kinda pointless, we'd be fairly screwed anyway.)

Sorry if I sound annoying, but it's a matter of terminology and needs.

Cryptographically random means safe for use with cryptography, in that it
is unguessable enough that you can use it for encryption keys that
nobody will be able to guess. It in no way guarantees uniqueness, just
like you don't really care if the symmetric crypto key of your VPN has
already been used once somewhere else as long as there's no way to know.
However, with the good enough distribution that a CSPRNG provides,
collisions within the *same* generator are bounded to a very low,
predictable rate which is generally considered acceptable for all use
cases.
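
To put a number on that "very low, predictable rate", here is a rough
sketch using the usual birthday approximation. It assumes the values are
drawn uniformly and independently from a single generator; the function
name and the sample sizes are only illustrative.

```python
import math


def collision_probability(draws: int, bits: int = 128) -> float:
    """Approximate chance that at least two of `draws` uniform values collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    space = 2 ** bits
    exponent = -draws * (draws - 1) / (2 * space)
    # expm1 keeps precision when the resulting probability is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # 2^32 draws of 128-bit values: the probability is tiny but not zero.
    print(collision_probability(2 ** 32, bits=128))
    # 2^16 draws of 32-bit values: collisions are already quite likely.
    print(collision_probability(2 ** 16, bits=32))
```

The result is small for 128-bit values but never exactly zero once there
is more than one draw, which is the difference between "unlikely to
collide" and "unique".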

Something random (cryptographically or not) *cannot* be unique by
definition, otherwise it's not random anymore, since each draw has an
influence on the remaining list of possible draws, which is contrary to
randomness. And conversely something unique cannot be completely random
because if you know it's unique, you can already rule out all other known
values from the candidates, thus it's more predictable than random.

With this in mind, picking random numbers from the same RNG is often
quite sufficient to consider them highly likely to be unique within a
long period. But it's not a guarantee. And it's even less of one between
two RNGs (e.g. if uniqueness is required between multiple hypervisors in
case VMs are migrated or centrally managed, which I don't know).
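
As a toy illustration of why two generators give a weaker guarantee than
one, the sketch below hashes a counter into 128-bit IDs and then
duplicates the generator state, as a restored snapshot might. The class
`CounterGenerator` is hypothetical and is not taken from the patch or
from any hypervisor.

```python
import hashlib


class CounterGenerator:
    """Hypothetical ID generator: hashes an internal counter into 128 bits."""

    def __init__(self, counter: int = 0) -> None:
        self.counter = counter

    def next_id(self) -> str:
        digest = hashlib.sha256(self.counter.to_bytes(8, "big")).hexdigest()
        self.counter += 1
        return digest[:32]  # keep 128 bits, hex encoded


original = CounterGenerator()
first = [original.next_id() for _ in range(3)]

# Simulate the same snapshot being restored twice: both clones start
# from exactly the same internal state.
clone_a = CounterGenerator(original.counter)
clone_b = CounterGenerator(original.counter)
ids_a = [clone_a.next_id() for _ in range(3)]
ids_b = [clone_b.next_id() for _ in range(3)]

# Each generator looks random and never repeats itself, yet the two
# clones emit identical sequences.
print(ids_a == ids_b)
```

The output of each instance looks random and is unique within that
instance, but the hashing adds nothing once the underlying state has
been duplicated.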

If what is sought here is a strong guarantee of uniqueness, using a
counter as you first suggested is better. If what is sought is pure
randomness (in the sense that it's unpredictable, which I don't think
is needed here), then random numbers are better. If both are required, just
concatenate a counter and a random number. And if you need them to be spatially
unique, just include a node identifier.
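
A minimal sketch of that concatenation, assuming a 32-bit node
identifier, a 64-bit counter and 128 random bits; the field widths and
the `IdAllocator` name are arbitrary choices for illustration and are
not part of any proposed interface.

```python
import itertools
import os


class IdAllocator:
    """Hypothetical allocator: node identifier + counter + random suffix."""

    def __init__(self, node_id: int) -> None:
        if not 0 <= node_id < 2 ** 32:
            raise ValueError("node_id must fit in 32 bits")
        self.node_id = node_id
        self._counter = itertools.count()

    def next_id(self) -> bytes:
        counter = next(self._counter)
        return (
            self.node_id.to_bytes(4, "big")   # spatial uniqueness
            + counter.to_bytes(8, "big")      # uniqueness within the node
            + os.urandom(16)                  # unpredictability
        )


if __name__ == "__main__":
    allocator = IdAllocator(node_id=1)
    print(allocator.next_id().hex())
    print(allocator.next_id().hex())
```

Uniqueness here comes only from the node identifier and the counter, so
it holds only as long as no two allocators ever share the same
(node_id, counter) state; the random suffix adds unpredictability, not
a guarantee.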

Now the initial needs in the forwarded message are not entirely clear
to me, but I wanted to point out the apparent mismatch between the expressed
needs for uniqueness and the proposed solutions based solely on randomness.

Cheers,
Willy
