All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	fes@google.com, aarcange@redhat.com, riel@redhat.com,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	mikew@google.com, yinghan@google.com,
	virtualization@lists.linux-foundation.org, yvugenfi@redhat.com,
	vrozenfe@redhat.com
Subject: Re: [PATCH] virtio-balloon spec: provide a version of the "silent deflate" feature that works
Date: Fri, 7 Sep 2012 15:17:12 +0300	[thread overview]
Message-ID: <20120907121712.GA17397@redhat.com> (raw)
In-Reply-To: <5049D899.60705@redhat.com>

On Fri, Sep 07, 2012 at 01:20:57PM +0200, Paolo Bonzini wrote:
> Il 07/09/2012 12:53, Michael S. Tsirkin ha scritto:
> > Let us start with what is broken currently. Looking at
> > it very closely, I think the answer is nothing.
> > Even migration in qemu is not broken as you claimed initially.
> 
> Correct, migration would be broken as soon as QEMU starts using
> MUST_TELL_HOST.  I'm trying to think ahead, since we have many ideas
> floating around on the implementation of ballooning.
> 
> If you implement the mlock/munlock trick, you must start using
> MUST_TELL_HOST in QEMU to advertise it to guests, and migration breaks.

Migration does not break.

Since I wrote this code in qemu let me explain what is going on.

qemu requires that local and remote side are started with
same feature bits.
To support cross version migration, code in hw/pc_piix.c
disables features if you require migration from/to old qemu.

At some point I added a sanity check:
if we get guest features we know that any bit
set there must be set in host features.
Yes, this catches some user mistakes.

This was never intended as a compatibility guarantee.
User is still required to start qemu such
that host features match exactly, anything else
can lead to failures some of them hard to debug.

Here is a simple example:

1. guest reads host features
2. guest is migrated - check passes since no features are acked
3. guest acks features -> failure

This applies to any feature. Nothing special with this one.

Yes, we can if we want to make this more robust
against user errors, e.g. by migrating host feature
bits. Patches welcome. If we do it will help all
features, not just this one.


> > Next, consider the interface proposed here. You defacto declare
> > all existing drivers buggy.
> 
> No, only Windows (and it is buggy, it calls tell_host last).

It is not buggy. It does not ack MUST_TELL_HOST. So it is free to tell
host at any point, it behaves exactly
to spec. You can not retroactively declare drivers buggy like that.

>  Linux and
> BSD drivers do negotiate MUST_TELL_HOST, and are not buggy.
> 
> > This is a wrong thing to do.
> > You also use two feature bits for a single simple thing,
> > this is inelegant.
> 
> True, but the choice is:
> 
> 1) add a once-only hack to QEMU that fixes migration of
> VIRTIO_BALLOON_F_MUST_TELL_HOST;
> 
> 2) always advertise VIRTIO_BALLOON_F_MUST_TELL_HOST.  If you do this,
> guests cannot use anymore silent deflate, which is a regression.
> 
> 3) use two bits.  One tells the device that the driver supports chatty
> deflate; one tells the driver that the device supports silent deflate.

The right thing to do is either
4. realize we can not address all user errors, so no real issue
5. address this class of user errors by migrating host features

> So in the end you do use two feature bits for two different things.
> Plus, both feature bits are "positive" and I'm happy.

I am not happy.
We lose compatibility with all existing drivers
so it will take years until the feature is actually
useful.

> > Last, let us consider how existing feature can be used
> > in the hypervisor. If driver did not ack
> > MUST_TELL_HOST, it is *not* buggy but it means we can not
> > do munlock. This applies to current windows drivers.
> > If driver *did* ack MUST_TELL_HOST, we can munlock
> > and mlock back on leak.
> > Seems useful, driver support is already there,
> > so removing the MUST_TELL_HOST bit seems like a bad idea.
> 
> Indeed, repurposing MUST_TELL_HOST to WILL_TELL_HOST is better than
> killing it.
> 
> Paolo

Is this just a matter of naming? Same functionality:
driver that acks this bit will tell host first,
driver that does not will not?

If yes that is fine.

-- 
MST

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: fes@google.com, aarcange@redhat.com, riel@redhat.com,
	yvugenfi@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mikew@google.com,
	yinghan@google.com, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] virtio-balloon spec: provide a version of the "silent deflate" feature that works
Date: Fri, 7 Sep 2012 15:17:12 +0300	[thread overview]
Message-ID: <20120907121712.GA17397@redhat.com> (raw)
In-Reply-To: <5049D899.60705@redhat.com>

On Fri, Sep 07, 2012 at 01:20:57PM +0200, Paolo Bonzini wrote:
> Il 07/09/2012 12:53, Michael S. Tsirkin ha scritto:
> > Let us start with what is broken currently. Looking at
> > it very closely, I think the answer is nothing.
> > Even migration in qemu is not broken as you claimed initially.
> 
> Correct, migration would be broken as soon as QEMU starts using
> MUST_TELL_HOST.  I'm trying to think ahead, since we have many ideas
> floating around on the implementation of ballooning.
> 
> If you implement the mlock/munlock trick, you must start using
> MUST_TELL_HOST in QEMU to advertise it to guests, and migration breaks.

Migration does not break.

Since I wrote this code in qemu let me explain what is going on.

qemu requires that local and remote side are started with
same feature bits.
To support cross version migration, code in hw/pc_piix.c
disables features if you require migration from/to old qemu.

At some point I added a sanity check:
if we get guest features we know that any bit
set there must be set in host features.
Yes, this catches some user mistakes.

This was never intended as a compatibility guarantee.
User is still required to start qemu such
that host features match exactly, anything else
can lead to failures some of them hard to debug.

Here is a simple example:

1. guest reads host features
2. guest is migrated - check passes since no features are acked
3. guest acks features -> failure

This applies to any feature. Nothing special with this one.

Yes, we can if we want to make this more robust
against user errors, e.g. by migrating host feature
bits. Patches welcome. If we do it will help all
features, not just this one.


> > Next, consider the interface proposed here. You defacto declare
> > all existing drivers buggy.
> 
> No, only Windows (and it is buggy, it calls tell_host last).

It is not buggy. It does not ack MUST_TELL_HOST. So it is free to tell
host at any point, it behaves exactly
to spec. You can not retroactively declare drivers buggy like that.

>  Linux and
> BSD drivers do negotiate MUST_TELL_HOST, and are not buggy.
> 
> > This is a wrong thing to do.
> > You also use two feature bits for a single simple thing,
> > this is inelegant.
> 
> True, but the choice is:
> 
> 1) add a once-only hack to QEMU that fixes migration of
> VIRTIO_BALLOON_F_MUST_TELL_HOST;
> 
> 2) always advertise VIRTIO_BALLOON_F_MUST_TELL_HOST.  If you do this,
> guests cannot use anymore silent deflate, which is a regression.
> 
> 3) use two bits.  One tells the device that the driver supports chatty
> deflate; one tells the driver that the device supports silent deflate.

The right thing to do is either
4. realize we can not address all user errors, so no real issue
5. address this class of user errors by migrating host features

> So in the end you do use two feature bits for two different things.
> Plus, both feature bits are "positive" and I'm happy.

I am not happy.
We lose compatibility with all existing drivers
so it will take years until the feature is actually
useful.

> > Last, let us consider how existing feature can be used
> > in the hypervisor. If driver did not ack
> > MUST_TELL_HOST, it is *not* buggy but it means we can not
> > do munlock. This applies to current windows drivers.
> > If driver *did* ack MUST_TELL_HOST, we can munlock
> > and mlock back on leak.
> > Seems useful, driver support is already there,
> > so removing the MUST_TELL_HOST bit seems like a bad idea.
> 
> Indeed, repurposing MUST_TELL_HOST to WILL_TELL_HOST is better than
> killing it.
> 
> Paolo

Is this just a matter of naming? Same functionality:
driver that acks this bit will tell host first,
driver that does not will not?

If yes that is fine.

-- 
MST

  reply	other threads:[~2012-09-07 12:16 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-06  7:46 [PATCH] virtio-balloon spec: provide a version of the "silent deflate" feature that works Paolo Bonzini
2012-09-06  7:46 ` Paolo Bonzini
2012-09-06  8:47 ` Michael S. Tsirkin
2012-09-06  8:47   ` Michael S. Tsirkin
2012-09-06  9:24   ` Paolo Bonzini
2012-09-06  9:24     ` Paolo Bonzini
2012-09-06  9:44     ` Michael S. Tsirkin
2012-09-06  9:44       ` Michael S. Tsirkin
2012-09-06  9:57       ` Paolo Bonzini
2012-09-06  9:57         ` Paolo Bonzini
2012-09-06 10:53         ` Michael S. Tsirkin
2012-09-06 10:53           ` Michael S. Tsirkin
2012-09-06 12:13           ` Paolo Bonzini
2012-09-06 12:13             ` Paolo Bonzini
2012-09-06 12:51             ` Michael S. Tsirkin
2012-09-06 12:51               ` Michael S. Tsirkin
2012-09-06 13:12               ` Paolo Bonzini
2012-09-06 13:12                 ` Paolo Bonzini
2012-09-06 23:45             ` Rusty Russell
2012-09-07  5:42               ` Michael S. Tsirkin
2012-09-07  5:42                 ` Michael S. Tsirkin
2012-09-07  6:39                 ` Rusty Russell
2012-09-07  6:39                   ` Rusty Russell
2012-09-07  9:27                   ` Paolo Bonzini
2012-09-07  9:27                     ` Paolo Bonzini
2012-09-07 10:53                     ` Michael S. Tsirkin
2012-09-07 10:53                       ` Michael S. Tsirkin
2012-09-07 11:20                       ` Paolo Bonzini
2012-09-07 11:20                         ` Paolo Bonzini
2012-09-07 12:17                         ` Michael S. Tsirkin [this message]
2012-09-07 12:17                           ` Michael S. Tsirkin
2012-09-07 12:22                           ` Paolo Bonzini
2012-09-07 12:22                             ` Paolo Bonzini
2012-09-07 12:44                             ` Michael S. Tsirkin
2012-09-07 12:44                               ` Michael S. Tsirkin
2012-09-07 14:04                               ` Paolo Bonzini
2012-09-07 14:04                                 ` Paolo Bonzini
2012-09-07 14:25                                 ` Michael S. Tsirkin
2012-09-07 14:25                                   ` Michael S. Tsirkin
2012-09-07 14:44                                   ` Paolo Bonzini
2012-09-07 14:44                                     ` Paolo Bonzini
2012-09-08 22:22                                     ` Michael S. Tsirkin
2012-09-08 22:22                                       ` Michael S. Tsirkin
2012-09-10  5:50                                       ` Paolo Bonzini
2012-09-10  5:50                                       ` Paolo Bonzini
2012-09-10  6:03                                         ` Michael S. Tsirkin
2012-09-10  6:03                                           ` Michael S. Tsirkin
2012-09-10  6:38                                           ` Paolo Bonzini
2012-09-10  6:38                                             ` Paolo Bonzini
2012-09-10  6:47                                             ` Michael S. Tsirkin
2012-09-10  6:47                                               ` Michael S. Tsirkin
2012-09-10  6:55                                               ` Paolo Bonzini
2012-09-10  6:55                                                 ` Paolo Bonzini
2012-09-07 10:43                   ` Michael S. Tsirkin
2012-09-07 10:43                     ` Michael S. Tsirkin
2012-09-08  5:06                     ` Rusty Russell
2012-09-08  5:06                       ` Rusty Russell
2012-09-08 10:32                       ` Paolo Bonzini
2012-09-08 10:32                       ` Paolo Bonzini
2012-09-08 22:37                       ` Michael S. Tsirkin
2012-09-08 22:37                         ` Michael S. Tsirkin
2012-09-10  1:43                         ` Rusty Russell
2012-09-10  1:43                           ` Rusty Russell
2012-09-10  5:51                           ` Paolo Bonzini
2012-09-10  5:51                             ` Paolo Bonzini
2012-09-10  5:51                           ` Michael S. Tsirkin
2012-09-10  5:51                             ` Michael S. Tsirkin
2012-09-12  6:24                             ` Rusty Russell
2012-09-12  6:24                               ` Rusty Russell
2012-09-06 23:41 ` Michael S. Tsirkin
2012-09-06 23:41   ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120907121712.GA17397@redhat.com \
    --to=mst@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=fes@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikew@google.com \
    --cc=pbonzini@redhat.com \
    --cc=riel@redhat.com \
    --cc=rusty@rustcorp.com.au \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=vrozenfe@redhat.com \
    --cc=yinghan@google.com \
    --cc=yvugenfi@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.