All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: ehabkost@redhat.com, kvm@vger.kernel.org, mtosatti@redhat.com,
	qemu-devel@nongnu.org, dgilbert@redhat.com,
	Liran Alon <liran.alon@oracle.com>,
	rth@twiddle.net, jmattson@google.com
Subject: Re: [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state
Date: Fri, 2 Nov 2018 16:54:09 +0000	[thread overview]
Message-ID: <20181102165409.GF21191@redhat.com> (raw)
In-Reply-To: <12c26c34-8dd1-a442-7826-86b93ff978f8@redhat.com>

On Fri, Nov 02, 2018 at 10:40:35AM +0100, Paolo Bonzini wrote:
> On 02/11/2018 04:46, Liran Alon wrote:
> >> On Thu, Nov1, 2018 at 09:45 AM, Jim Mattson <jmattson@google.com> wrote:
> > 
> >>> On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> >>> So if I have matching host kernels it should always work?
> >>> What happens if I upgrade the source kernel to increase it's maximum
> >>> nested size, can I force it to keep things small for some VMs?
> > 
> >> Any change to the format of the nested state should be gated by a
> >> KVM_CAP set by userspace. (Unlike, say, how the
> >> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
> >> in commit f077825a8758d.) KVM has traditionally been quite bad about
> >> maintaining backwards compatibility, but I hope the community is more
> >> cognizant of the issues now.
> > 
> >> As a cloud provider, one would only enable the new capability from
> >> userspace once all hosts in the pool have a kernel that supports it.
> >> During the transition, the capability would not be enabled on the
> >> hosts with a new kernel, and these hosts would continue to provide
> >> nested state that could be consumed by hosts running the older kernel.
> > 
> > Hmm this makes sense.
> > 
> > This means though that the patch I have submitted here isn't good enough.
> > My patch currently assumes that when it attempts to get nested state from KVM,
> > QEMU should always set nested_state->size to max size supported by KVM as received
> > from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
> > (See kvm_get_nested_state() introduced on my patch).
> > This indeed won't allow migration from host with new KVM to host with old KVM if
> > nested_state size was enlarged between these KVM versions.
> > Which is obviously an issue.
> 
> Actually I think this is okay, because unlike the "new" capability was
> enabled, KVM would always reduce nested_state->size to a value that is
> compatible with current kernels.
> 
> > But on second thought, I'm not sure that this is the right approach as-well.
> > We don't really want the used version of nested_state to be determined on kvm_init().
> > * On source QEMU, we actually want to determine it when preparing for migration based
> > on to the support given by our destination host. If it's an old host, we would like to
> > save an old version nested_state and if it's a new host, we will like to save our newest
> > supported nested_state.
> 
> No, that's wrong because it would lead to losing state.  If the source
> QEMU supports more state than the destination QEMU, and the current VM
> state needs to transmit it for migration to be _correct_, then migration
> to that destination QEMU must fail.
> 
> In particular, enabling the new KVM capability needs to be gated by a
> new machine type and/or -cpu flag, if migration compatibility is needed.
>  (In particular, this is one reason why I haven't considered this series
> for 3.1.  Right now, migration of nested hypervisors is completely
> busted but if we make it "almost" work, pre-3.1 machine types would not
> ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
> it's better for users if we wait for one release more, and add support
> for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).
> 
> Personally, I would like to say that, starting from QEMU 3.2, enabling
> nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
> good way to keep some sanity.  Any opinions on that?

We have usually followed a rule that new machine types must not
affect runability of a VM on a host. IOW new machine types should
not introduce dependancies on specific kernels, or hardware features
such as CPU flags.

Anything that requires a new kernel feature thus ought to be an
opt-in config tunable on the CLI, separate from machine type
choice.  Alternatively in this case, it could potentially be a
migration parameter settable via QMP. QEMU on each side could
advertize whether the migration parameter is available, and
the mgmt app (which can see both sides of the migration) can
then decide whether to enable it.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

WARNING: multiple messages have this Message-ID (diff)
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>,
	jmattson@google.com, ehabkost@redhat.com, kvm@vger.kernel.org,
	mtosatti@redhat.com, qemu-devel@nongnu.org, dgilbert@redhat.com,
	rth@twiddle.net
Subject: Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state
Date: Fri, 2 Nov 2018 16:54:09 +0000	[thread overview]
Message-ID: <20181102165409.GF21191@redhat.com> (raw)
In-Reply-To: <12c26c34-8dd1-a442-7826-86b93ff978f8@redhat.com>

On Fri, Nov 02, 2018 at 10:40:35AM +0100, Paolo Bonzini wrote:
> On 02/11/2018 04:46, Liran Alon wrote:
> >> On Thu, Nov1, 2018 at 09:45 AM, Jim Mattson <jmattson@google.com> wrote:
> > 
> >>> On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> >>> So if I have matching host kernels it should always work?
> >>> What happens if I upgrade the source kernel to increase it's maximum
> >>> nested size, can I force it to keep things small for some VMs?
> > 
> >> Any change to the format of the nested state should be gated by a
> >> KVM_CAP set by userspace. (Unlike, say, how the
> >> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
> >> in commit f077825a8758d.) KVM has traditionally been quite bad about
> >> maintaining backwards compatibility, but I hope the community is more
> >> cognizant of the issues now.
> > 
> >> As a cloud provider, one would only enable the new capability from
> >> userspace once all hosts in the pool have a kernel that supports it.
> >> During the transition, the capability would not be enabled on the
> >> hosts with a new kernel, and these hosts would continue to provide
> >> nested state that could be consumed by hosts running the older kernel.
> > 
> > Hmm this makes sense.
> > 
> > This means though that the patch I have submitted here isn't good enough.
> > My patch currently assumes that when it attempts to get nested state from KVM,
> > QEMU should always set nested_state->size to max size supported by KVM as received
> > from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
> > (See kvm_get_nested_state() introduced on my patch).
> > This indeed won't allow migration from host with new KVM to host with old KVM if
> > nested_state size was enlarged between these KVM versions.
> > Which is obviously an issue.
> 
> Actually I think this is okay, because unlike the "new" capability was
> enabled, KVM would always reduce nested_state->size to a value that is
> compatible with current kernels.
> 
> > But on second thought, I'm not sure that this is the right approach as-well.
> > We don't really want the used version of nested_state to be determined on kvm_init().
> > * On source QEMU, we actually want to determine it when preparing for migration based
> > on to the support given by our destination host. If it's an old host, we would like to
> > save an old version nested_state and if it's a new host, we will like to save our newest
> > supported nested_state.
> 
> No, that's wrong because it would lead to losing state.  If the source
> QEMU supports more state than the destination QEMU, and the current VM
> state needs to transmit it for migration to be _correct_, then migration
> to that destination QEMU must fail.
> 
> In particular, enabling the new KVM capability needs to be gated by a
> new machine type and/or -cpu flag, if migration compatibility is needed.
>  (In particular, this is one reason why I haven't considered this series
> for 3.1.  Right now, migration of nested hypervisors is completely
> busted but if we make it "almost" work, pre-3.1 machine types would not
> ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
> it's better for users if we wait for one release more, and add support
> for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).
> 
> Personally, I would like to say that, starting from QEMU 3.2, enabling
> nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
> good way to keep some sanity.  Any opinions on that?

We have usually followed a rule that new machine types must not
affect runability of a VM on a host. IOW new machine types should
not introduce dependancies on specific kernels, or hardware features
such as CPU flags.

Anything that requires a new kernel feature thus ought to be an
opt-in config tunable on the CLI, separate from machine type
choice.  Alternatively in this case, it could potentially be a
migration parameter settable via QMP. QEMU on each side could
advertize whether the migration parameter is available, and
the mgmt app (which can see both sides of the migration) can
then decide whether to enable it.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

  parent reply	other threads:[~2018-11-02 16:54 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-16 12:46 [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state Liran Alon
2018-09-16 12:46 ` [Qemu-devel] " Liran Alon
2018-09-16 12:46 ` [QEMU PATCH v2 1/2] i386: Compile CPUX86State xsave_buf only when support KVM or HVF Liran Alon
2018-09-16 12:46   ` [Qemu-devel] " Liran Alon
2018-09-16 12:46 ` [QEMU PATCH v2 2/2] KVM: i386: Add support for save and restore nested state Liran Alon
2018-09-16 12:46   ` [Qemu-devel] " Liran Alon
2018-10-08 17:21 ` [QEMU PATCH v2 0/2]: " Liran Alon
2018-10-08 17:21   ` [Qemu-devel] " Liran Alon
2018-10-15 18:10   ` Liran Alon
2018-10-15 18:10     ` [Qemu-devel] " Liran Alon
2018-10-31  1:03     ` Liran Alon
2018-10-31  1:03       ` [Qemu-devel] " Liran Alon
2018-10-31 18:17       ` Eduardo Habkost
2018-10-31 18:17         ` [Qemu-devel] " Eduardo Habkost
2018-10-31 18:19         ` Paolo Bonzini
2018-10-31 18:19           ` [Qemu-devel] " Paolo Bonzini
2018-10-31 18:50           ` Liran Alon
2018-10-31 18:50             ` [Qemu-devel] " Liran Alon
2018-10-31 18:59             ` Dr. David Alan Gilbert
2018-10-31 18:59               ` [Qemu-devel] " Dr. David Alan Gilbert
2018-10-31 23:17               ` Liran Alon
2018-10-31 23:17                 ` [Qemu-devel] " Liran Alon
2018-11-01 13:10                 ` Dr. David Alan Gilbert
2018-11-01 13:10                   ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-01 15:23                   ` Liran Alon
2018-11-01 15:23                     ` [Qemu-devel] " Liran Alon
2018-11-01 15:56                     ` Dr. David Alan Gilbert
2018-11-01 15:56                       ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-01 16:45                       ` Jim Mattson via Qemu-devel
2018-11-01 16:45                         ` [Qemu-devel] " Jim Mattson
2018-11-02  3:46                         ` Liran Alon
2018-11-02  3:46                           ` [Qemu-devel] " Liran Alon
2018-11-02  9:40                           ` Paolo Bonzini
2018-11-02  9:40                             ` [Qemu-devel] " Paolo Bonzini
2018-11-02 12:35                             ` Dr. David Alan Gilbert
2018-11-02 12:35                               ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-02 12:40                               ` Daniel P. Berrangé
2018-11-02 12:40                                 ` [Qemu-devel] " Daniel P. Berrangé
2018-11-04 22:12                               ` Paolo Bonzini
2018-11-04 22:12                                 ` [Qemu-devel] " Paolo Bonzini
2018-11-02 12:59                             ` Liran Alon
2018-11-02 12:59                               ` [Qemu-devel] " Liran Alon
2018-11-02 16:44                               ` Jim Mattson via Qemu-devel
2018-11-02 16:44                                 ` [Qemu-devel] " Jim Mattson
2018-11-02 16:58                                 ` Daniel P. Berrangé
2018-11-02 16:58                                   ` [Qemu-devel] " Daniel P. Berrangé
2018-11-02 17:01                                   ` Jim Mattson via Qemu-devel
2018-11-02 17:01                                     ` [Qemu-devel] " Jim Mattson
2018-11-02 16:54                             ` Daniel P. Berrangé [this message]
2018-11-02 16:54                               ` Daniel P. Berrangé
2018-11-02 16:58                               ` Dr. David Alan Gilbert
2018-11-02 16:58                                 ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-04 22:19                               ` Paolo Bonzini
2018-11-04 22:19                                 ` [Qemu-devel] " Paolo Bonzini
2018-11-12 16:18                                 ` Daniel P. Berrangé
2018-11-12 16:18                                   ` [Qemu-devel] " Daniel P. Berrangé
2018-11-12 16:50                                   ` Dr. David Alan Gilbert
2018-11-12 16:50                                     ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-12 16:53                                     ` Paolo Bonzini
2018-11-12 16:53                                       ` [Qemu-devel] " Paolo Bonzini
2018-11-12 16:54                                     ` Daniel P. Berrangé
2018-11-12 16:54                                       ` [Qemu-devel] " Daniel P. Berrangé
2018-11-13  0:00                                       ` Liran Alon
2018-11-13  0:00                                         ` [Qemu-devel] " Liran Alon
2018-11-13  0:07                                         ` Jim Mattson via Qemu-devel
2018-11-13  0:07                                           ` [Qemu-devel] " Jim Mattson
2018-11-13  0:09                                           ` Liran Alon
2018-11-13  0:09                                             ` [Qemu-devel] " Liran Alon
2018-11-12 23:58                                     ` Liran Alon
2018-11-12 23:58                                       ` [Qemu-devel] " Liran Alon
2018-11-02 16:39                           ` Jim Mattson via Qemu-devel
2018-11-02 16:39                             ` [Qemu-devel] " Jim Mattson
2018-11-03  2:02                             ` Liran Alon
2018-11-03  2:02                               ` [Qemu-devel] " Liran Alon
2018-11-08  0:13                               ` Liran Alon
2018-11-08  0:13                                 ` [Qemu-devel] " Liran Alon
2018-11-08  0:45                                 ` Jim Mattson via Qemu-devel
2018-11-08  0:45                                   ` [Qemu-devel] " Jim Mattson
2018-11-08  9:50                                   ` Paolo Bonzini
2018-11-08  9:50                                     ` [Qemu-devel] " Paolo Bonzini
2018-11-08  9:57                                     ` Liran Alon
2018-11-08  9:57                                       ` [Qemu-devel] " Liran Alon
2018-11-08 17:02                                       ` Paolo Bonzini
2018-11-08 17:02                                         ` [Qemu-devel] " Paolo Bonzini
2018-11-08 18:41                                         ` Liran Alon
2018-11-08 18:41                                           ` [Qemu-devel] " Liran Alon
2018-11-08 20:34                                           ` Paolo Bonzini
2018-11-08 20:34                                             ` [Qemu-devel] " Paolo Bonzini
2018-11-12 14:51                                           ` Dr. David Alan Gilbert
2018-11-12 14:51                                             ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-01 19:03                       ` Liran Alon
2018-11-01 19:03                         ` [Qemu-devel] " Liran Alon
2018-11-01 19:07                         ` Dr. David Alan Gilbert
2018-11-01 19:07                           ` [Qemu-devel] " Dr. David Alan Gilbert
2018-11-01 19:41                           ` Jim Mattson via Qemu-devel
2018-11-01 19:41                             ` [Qemu-devel] " Jim Mattson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181102165409.GF21191@redhat.com \
    --to=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=liran.alon@oracle.com \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.