* [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
@ 2021-09-07 12:49 Stefano Garzarella
  2021-09-07 13:22 ` Daniel P. Berrangé
  0 siblings, 1 reply; 12+ messages in thread
From: Stefano Garzarella @ 2021-09-07 12:49 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Jiang Wang, Michael S. Tsirkin, qemu-stable,
	Stefan Hajnoczi, Arseny Krasnov

Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
enabled the SEQPACKET feature bit.
This commit is released with QEMU 6.1, so if we try to migrate a VM where
the host kernel supports SEQPACKET but machine type version is less than
6.1, we get the following errors:

    Features 0x130000002 unsupported. Allowed features: 0x179000000
    Failed to load virtio-vhost_vsock:virtio
    error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
    load of migration failed: Operation not permitted

Let's disable the feature bit for machine types < 6.1, adding a
`features` field to VHostVSock to simplify the handling of upcoming
features we will support.

Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
Cc: qemu-stable@nongnu.org
Reported-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 include/hw/virtio/vhost-vsock.h | 1 +
 hw/core/machine.c               | 1 +
 hw/virtio/vhost-vsock.c         | 6 +++++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/vhost-vsock.h b/include/hw/virtio/vhost-vsock.h
index 84f4e727c7..7da92a8883 100644
--- a/include/hw/virtio/vhost-vsock.h
+++ b/include/hw/virtio/vhost-vsock.h
@@ -29,6 +29,7 @@ struct VHostVSock {
     /*< private >*/
     VHostVSockCommon parent;
     VHostVSockConf conf;
+    uint64_t features;
 
     /*< public >*/
 };
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 067f42b528..7e2851feb9 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -46,6 +46,7 @@ GlobalProperty hw_compat_6_0[] = {
     { "nvme-ns", "eui64-default", "off"},
     { "e1000", "init-vet", "off" },
     { "e1000e", "init-vet", "off" },
+    { "vhost-vsock-device", "seqpacket", "off" },
 };
 const size_t hw_compat_6_0_len = G_N_ELEMENTS(hw_compat_6_0);
 
diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index 1b1a5c70ed..9458d4eeb4 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -114,8 +114,10 @@ static uint64_t vhost_vsock_get_features(VirtIODevice *vdev,
                                          Error **errp)
 {
     VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
+    VHostVSock *vsock = VHOST_VSOCK(vdev);
+
+    requested_features |= vsock->features;
 
-    virtio_add_feature(&requested_features, VIRTIO_VSOCK_F_SEQPACKET);
     return vhost_get_features(&vvc->vhost_dev, feature_bits,
                                 requested_features);
 }
@@ -218,6 +220,8 @@ static void vhost_vsock_device_unrealize(DeviceState *dev)
 static Property vhost_vsock_properties[] = {
     DEFINE_PROP_UINT64("guest-cid", VHostVSock, conf.guest_cid, 0),
     DEFINE_PROP_STRING("vhostfd", VHostVSock, conf.vhostfd),
+    DEFINE_PROP_BIT64("seqpacket", VHostVSock, features,
+                      VIRTIO_VSOCK_F_SEQPACKET, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.31.1
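
For anyone decoding the error quoted in the commit message: the two masks in
the "Features ... unsupported" line can be compared bit by bit. The sketch
below is only an illustration, not QEMU code. The helper names
unsupported_bits() and print_unsupported() are made up here, and the value 1
for VIRTIO_VSOCK_F_SEQPACKET is assumed from the Linux virtio_vsock.h UAPI
header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit number of the SEQPACKET feature (assumed from the Linux UAPI header). */
#define VIRTIO_VSOCK_F_SEQPACKET 1

/* Hypothetical helper: bits present in the migrated state but not allowed. */
uint64_t unsupported_bits(uint64_t migrated, uint64_t allowed)
{
    return migrated & ~allowed;
}

/* Print each offending bit number, one per line. */
void print_unsupported(uint64_t migrated, uint64_t allowed)
{
    uint64_t bad = unsupported_bits(migrated, allowed);

    for (int bit = 0; bit < 64; bit++) {
        if (bad & (UINT64_C(1) << bit)) {
            printf("bit %d is not allowed on the destination\n", bit);
        }
    }
}
```

With the values from the log, unsupported_bits(0x130000002, 0x179000000)
leaves only bit 1 set, which matches VIRTIO_VSOCK_F_SEQPACKET.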




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-07 12:49 [PATCH] vhost-vsock: fix migration issue when seqpacket is supported Stefano Garzarella
@ 2021-09-07 13:22 ` Daniel P. Berrangé
  2021-09-07 13:47   ` Stefano Garzarella
  2021-09-09  8:47   ` Michael S. Tsirkin
  0 siblings, 2 replies; 12+ messages in thread
From: Daniel P. Berrangé @ 2021-09-07 13:22 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Eduardo Habkost, Jiang Wang, Michael S. Tsirkin, qemu-stable,
	qemu-devel, Stefan Hajnoczi, Arseny Krasnov

On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> enabled the SEQPACKET feature bit.
> This commit is released with QEMU 6.1, so if we try to migrate a VM where
> the host kernel supports SEQPACKET but machine type version is less than
> 6.1, we get the following errors:
> 
>     Features 0x130000002 unsupported. Allowed features: 0x179000000
>     Failed to load virtio-vhost_vsock:virtio
>     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
>     load of migration failed: Operation not permitted
> 
> Let's disable the feature bit for machine types < 6.1, adding a
> `features` field to VHostVSock to simplify the handling of upcoming
> features we will support.

IIUC, this will still leave migration broken for anyone migrating
a >= 6.1 machine type between a kernel that supports SEQPACKET and
a kernel lacking that, or vice versa.

If a feature is dependent on a host kernel feature we can't turn
that on automatically as part of the machine type, as we need
ABI stability across migration independent of kernel version.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-07 13:22 ` Daniel P. Berrangé
@ 2021-09-07 13:47   ` Stefano Garzarella
  2021-09-08 13:41     ` Stefano Garzarella
  2021-09-09  8:47   ` Michael S. Tsirkin
  1 sibling, 1 reply; 12+ messages in thread
From: Stefano Garzarella @ 2021-09-07 13:47 UTC (permalink / raw)
  To: Daniel P. Berrangé, Michael S. Tsirkin
  Cc: Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov

On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
>On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
>> Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
>> enabled the SEQPACKET feature bit.
>> This commit is released with QEMU 6.1, so if we try to migrate a VM where
>> the host kernel supports SEQPACKET but machine type version is less than
>> 6.1, we get the following errors:
>>
>>     Features 0x130000002 unsupported. Allowed features: 0x179000000
>>     Failed to load virtio-vhost_vsock:virtio
>>     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
>>     load of migration failed: Operation not permitted
>>
>> Let's disable the feature bit for machine types < 6.1, adding a
>> `features` field to VHostVSock to simplify the handling of upcoming
>> features we will support.
>
>IIUC, this will still leave migration broken for anyone migrating
>a >= 6.1 machine type between a kernel that supports SEQPACKET and
>a kernel lacking that, or vice versa.

This should be true for migrating from a kernel that supports SEQPACKET
to a kernel lacking that.

For vice-versa I'm not sure, since vhost_get_features() will disable 
that feature if the host kernel doesn't support it, and the guest will 
not have acked it.

>
>If a feature is dependent on a host kernel feature we can't turn
>that on automatically as part of the machine type, as we need
>ABI stability across migration independent of kernel version.
>

How do we typically handle this?

I wrongly thought it was an expected behavior that migrating a guest 
using a vhost device from a new kernel to an old one can fail if not all 
features are supported.

I need to take a look at the other vhost devices.

Thanks,
Stefano




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-07 13:47   ` Stefano Garzarella
@ 2021-09-08 13:41     ` Stefano Garzarella
  2021-09-08 13:48       ` Daniel P. Berrangé
  0 siblings, 1 reply; 12+ messages in thread
From: Stefano Garzarella @ 2021-09-08 13:41 UTC (permalink / raw)
  To: Daniel P. Berrangé, Michael S. Tsirkin, Jason Wang
  Cc: Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov

On Tue, Sep 07, 2021 at 03:47:56PM +0200, Stefano Garzarella wrote:
>On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
>>On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
>>>Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
>>>enabled the SEQPACKET feature bit.
>>>This commit is released with QEMU 6.1, so if we try to migrate a VM where
>>>the host kernel supports SEQPACKET but machine type version is less than
>>>6.1, we get the following errors:
>>>
>>>    Features 0x130000002 unsupported. Allowed features: 0x179000000
>>>    Failed to load virtio-vhost_vsock:virtio
>>>    error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
>>>    load of migration failed: Operation not permitted
>>>
>>>Let's disable the feature bit for machine types < 6.1, adding a
>>>`features` field to VHostVSock to simplify the handling of upcoming
>>>features we will support.
>>
>>IIUC, this will still leave migration broken for anyone migrating
>>a >= 6.1 machine type between a kernel that supports SEQPACKET and
>>a kernel lacking that, or vice versa.
>
>This should be true for migrating from kernel that supports SEQPACKET 
>to a kernel lacking that.
>
>For vice-versa I'm not sure, since vhost_get_features() will disable 
>that feature if the host kernel doesn't support it, and the guest will 
>not have acked it.

I did some testing and the migration is only broken in the case of
kernel 5.14+ (SEQPACKET supported) -> kernel 5.13 (SEQPACKET not 
supported).

Vice-versa works well because the feature is not acked.
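
The asymmetry can be modelled in a few lines. This is a simplified sketch,
not the QEMU implementation: offered_features(), guest_ack() and can_load()
are hypothetical names, the host kernel is reduced to a single support mask,
and the destination check is assumed to be a plain subset test of the acked
bits against what the destination offers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_VSOCK_F_SEQPACKET 1
#define SEQPACKET_BIT (UINT64_C(1) << VIRTIO_VSOCK_F_SEQPACKET)

/* Features the device offers: what QEMU requests, masked by the host kernel. */
uint64_t offered_features(uint64_t qemu_requested, uint64_t kernel_supported)
{
    return qemu_requested & kernel_supported;
}

/* A guest can only acknowledge features that were offered to it. */
uint64_t guest_ack(uint64_t guest_wants, uint64_t offered)
{
    return guest_wants & offered;
}

/* Assumed destination check: every acked bit must be offered there too. */
bool can_load(uint64_t acked_on_source, uint64_t offered_on_destination)
{
    return (acked_on_source & ~offered_on_destination) == 0;
}
```

In this model a guest started on a 5.14+ host acks the SEQPACKET bit, so
can_load() fails against a 5.13 destination. A guest started on 5.13 never
acks the bit, so loading on a 5.14+ destination succeeds.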

>
>>
>>If a feature is dependent on a host kernel feature we can't turn
>>that on automatically as part of the machine type, as we need
>>ABI stability across migration independent of kernel version.
>>
>
>How do we typically handle this?
>
>I wrongly thought it was an expected behavior that migrating a guest 
>using a vhost device from a new kernel to an old one can fail if not 
>all features are supported.
>
>I need to take a look at the other vhost devices.

I took a look at vhost-net and vhost-scsi and we don't seem to handle 
this case. Maybe I'm missing something...

So following your advice, the best thing would be to have this feature 
disabled by default and require the user to enable it explicitly so we 
are sure it is needed. At this point a migration to a kernel that 
doesn't support it is rightly broken.

Or is there something better we can do?

@Michael @Jason any thoughts?

Thanks,
Stefano




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-08 13:41     ` Stefano Garzarella
@ 2021-09-08 13:48       ` Daniel P. Berrangé
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel P. Berrangé @ 2021-09-08 13:48 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Eduardo Habkost, Jiang Wang, Michael S. Tsirkin, Jason Wang,
	qemu-stable, qemu-devel, Stefan Hajnoczi, Arseny Krasnov

On Wed, Sep 08, 2021 at 03:41:35PM +0200, Stefano Garzarella wrote:
> On Tue, Sep 07, 2021 at 03:47:56PM +0200, Stefano Garzarella wrote:
> > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > > > enabled the SEQPACKET feature bit.
> > > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
> > > > the host kernel supports SEQPACKET but machine type version is less than
> > > > 6.1, we get the following errors:
> > > > 
> > > >    Features 0x130000002 unsupported. Allowed features: 0x179000000
> > > >    Failed to load virtio-vhost_vsock:virtio
> > > >    error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> > > >    load of migration failed: Operation not permitted
> > > > 
> > > > Let's disable the feature bit for machine types < 6.1, adding a
> > > > `features` field to VHostVSock to simplify the handling of upcoming
> > > > features we will support.
> > > 
> > > IIUC, this will still leave migration broken for anyone migrating
> > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
> > > a kernel lacking that, or vice versa.
> > 
> > This should be true for migrating from kernel that supports SEQPACKET to
> > a kernel lacking that.
> > 
> > For vice-versa I'm not sure, since vhost_get_features() will disable
> > that feature if the host kernel doesn't support it, and the guest will
> > not have acked it.
> 
> I did some testing and the migration is only broken in the case of
> kernel 5.14+ (SEQPACKET supported) -> kernel 5.13 (SEQPACKET not supported).
> 
> Vice-versa works well because the feature is not acked.
> 
> > 
> > > 
> > > If a feature is dependent on a host kernel feature we can't turn
> > > that on automatically as part of the machine type, as we need
> > > ABI stability across migration independent of kernel version.
> > > 
> > 
> > How do we typically handle this?
> > 
> > I wrongly thought it was an expected behavior that migrating a guest
> > using a vhost device from a new kernel to an old one can fail if not all
> > features are supported.
> > 
> > I need to take a look at the other vhost devices.
> 
> I took a look at vhost-net and vhost-scsi and we don't seem to handle this
> case. Maybe I'm missing something...

We've never done very well at having a consistent story wrt deps
on kernel features. So I wouldn't be surprised to see differences
or omissions anywhere and people not notice the issue.

> So following your advice, the best thing would be to have this feature
> disabled by default and require the user to enable it explicitly so we are
> sure it is needed. At this point a migration to a kernel that doesn't
> support it is rightly broken.
> 
> Or is there something better we can do?
> 
> @Michael @Jason any thoughts?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-07 13:22 ` Daniel P. Berrangé
  2021-09-07 13:47   ` Stefano Garzarella
@ 2021-09-09  8:47   ` Michael S. Tsirkin
  2021-09-09  9:02     ` Daniel P. Berrangé
  1 sibling, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-09-09  8:47 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov, Stefano Garzarella

On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > enabled the SEQPACKET feature bit.
> > This commit is released with QEMU 6.1, so if we try to migrate a VM where
> > the host kernel supports SEQPACKET but machine type version is less than
> > 6.1, we get the following errors:
> > 
> >     Features 0x130000002 unsupported. Allowed features: 0x179000000
> >     Failed to load virtio-vhost_vsock:virtio
> >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> >     load of migration failed: Operation not permitted
> > 
> > Let's disable the feature bit for machine types < 6.1, adding a
> > `features` field to VHostVSock to simplify the handling of upcoming
> > features we will support.
> 
> IIUC, this will still leave migration broken for anyone migrating
> a >= 6.1 machine type between a kernel that supports SEQPACKET and
> a kernel lacking that, or vice versa.
> 
> If a feature is dependent on a host kernel feature we can't turn
> that on automatically as part of the machine type, as we need
> ABI stability across migration independent of kernel version.
> 
> 
> Regards,
> Daniel

This is a fundamental problem we have with kernel accelerators.
A higher-level solution at the management level is needed.
For now, yes, we do turn features on by default;
consistent kernels on source and destination are assumed.
For downstreams this is not a problem at all, as they update
userspace and kernel in concert.


> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-09  8:47   ` Michael S. Tsirkin
@ 2021-09-09  9:02     ` Daniel P. Berrangé
  2021-09-10  6:35       ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel P. Berrangé @ 2021-09-09  9:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov, Stefano Garzarella

On Thu, Sep 09, 2021 at 04:47:42AM -0400, Michael S. Tsirkin wrote:
> On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > > enabled the SEQPACKET feature bit.
> > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
> > > the host kernel supports SEQPACKET but machine type version is less than
> > > 6.1, we get the following errors:
> > > 
> > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
> > >     Failed to load virtio-vhost_vsock:virtio
> > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> > >     load of migration failed: Operation not permitted
> > > 
> > > Let's disable the feature bit for machine types < 6.1, adding a
> > > `features` field to VHostVSock to simplify the handling of upcoming
> > > features we will support.
> > 
> > IIUC, this will still leave migration broken for anyone migrating
> > a >= 6.1 machine type between a kernel that supports SEQPACKET and
> > a kernel lacking that, or vice versa.
> > 
> > If a feature is dependent on a host kernel feature we can't turn
> > that on automatically as part of the machine type, as we need
> > ABI stability across migration independent of kernel version.
> > 
> > 
> > Regards,
> > Daniel
> 
> This is a fundamental problem we have with kernel accelerators.
> A higher level solution at management level is needed.
> For now yes, we do turn features on by default,
> consistent kernels on source and destination are assumed.
> For downstreams not a problem at all as they update
> userspace and kernel in concert.

Even downstream in RHEL that is not actually valid anymore. Container-based
deployment has killed any assumptions that can be made in this
respect. Even if the userspace and kernel are updated in lockstep in
a particular RHEL release, you cannot assume the running environment
will have a matched pair.

Users can be running QEMU userspace from RHEL-8.5 inside a container
that has been deployed on a host using an 8.3 kernel. We've even had
cases of running QEMU from RHEL-8 on a RHEL-7 host.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-09  9:02     ` Daniel P. Berrangé
@ 2021-09-10  6:35       ` Michael S. Tsirkin
  2021-09-13 12:51         ` Stefano Garzarella
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-09-10  6:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov, Stefano Garzarella

On Thu, Sep 09, 2021 at 10:02:12AM +0100, Daniel P. Berrangé wrote:
> On Thu, Sep 09, 2021 at 04:47:42AM -0400, Michael S. Tsirkin wrote:
> > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > > > enabled the SEQPACKET feature bit.
> > > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
> > > > the host kernel supports SEQPACKET but machine type version is less than
> > > > 6.1, we get the following errors:
> > > > 
> > > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
> > > >     Failed to load virtio-vhost_vsock:virtio
> > > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> > > >     load of migration failed: Operation not permitted
> > > > 
> > > > Let's disable the feature bit for machine types < 6.1, adding a
> > > > `features` field to VHostVSock to simplify the handling of upcoming
> > > > features we will support.
> > > 
> > > IIUC, this will still leave migration broken for anyone migrating
> > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
> > > a kernel lacking that, or vice versa.
> > > 
> > > If a feature is dependent on a host kernel feature we can't turn
> > > that on automatically as part of the machine type, as we need
> > > ABI stability across migration independent of kernel version.
> > > 
> > > 
> > > Regards,
> > > Daniel
> > 
> > This is a fundamental problem we have with kernel accelerators.
> > A higher level solution at management level is needed.
> > For now yes, we do turn features on by default,
> > consistent kernels on source and destination are assumed.
> > For downstreams not a problem at all as they update
> > userspace and kernel in concert.
> 
> Even downstream in RHEL that is not actually valid anymore. Container
> based deployment has killed any assumptions that can be made in this
> respect. Even if the userspace and kernel are updated in lockstep in
> a particular RHEL release, you cannot assume the running environment
> will have a matched pair.
> 
> Users can be running QEMU userspace from RHEL-8.5 inside a container
> that has been deployed on a host using a 8.3 kernel. We've even had
> cases of running QEMU from RHEL-8, on a RHEL-7 host.
> 
> Regards,
> Daniel

Is there finally an interest in addressing this then?  This would
involve collecting host features across a cluster and for each host
figuring out a configuration that works for migration. IIRC a tool was
proposed for the task (to live alongside e.g. qemu-img).

As long as we just stick to the machine type the best we can do is
probably to keep doing what we do now (hope that the two host kernels
are more or less consistent) as otherwise we'd have to never enable any
new features in vsock.
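
To make the cluster idea a bit more concrete, the core computation such a
tool would need is probably just an intersection of per-host feature masks.
The sketch below is not an existing tool or QEMU API: common_features() and
the idea of one 64-bit mask per host are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: intersect the feature masks reported by every host.
 * Returns 0 for an empty cluster, since nothing is known to be safe.
 */
uint64_t common_features(const uint64_t *host_masks, size_t n_hosts)
{
    if (n_hosts == 0) {
        return 0;
    }

    uint64_t common = host_masks[0];

    for (size_t i = 1; i < n_hosts; i++) {
        common &= host_masks[i];
    }
    return common;
}
```

A management layer could then start every guest with only the bits in the
returned mask enabled, so that any host in the cluster is a valid migration
target.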

> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-10  6:35       ` Michael S. Tsirkin
@ 2021-09-13 12:51         ` Stefano Garzarella
  2021-09-13 13:46           ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Stefano Garzarella @ 2021-09-13 12:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov

On Fri, Sep 10, 2021 at 02:35:53AM -0400, Michael S. Tsirkin wrote:
>On Thu, Sep 09, 2021 at 10:02:12AM +0100, Daniel P. Berrangé wrote:
>> On Thu, Sep 09, 2021 at 04:47:42AM -0400, Michael S. Tsirkin wrote:
>> > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
>> > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
>> > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
>> > > > enabled the SEQPACKET feature bit.
>> > > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
>> > > > the host kernel supports SEQPACKET but machine type version is less than
>> > > > 6.1, we get the following errors:
>> > > >
>> > > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
>> > > >     Failed to load virtio-vhost_vsock:virtio
>> > > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
>> > > >     load of migration failed: Operation not permitted
>> > > >
>> > > > Let's disable the feature bit for machine types < 6.1, adding a
>> > > > `features` field to VHostVSock to simplify the handling of upcoming
>> > > > features we will support.
>> > >
>> > > IIUC, this will still leave migration broken for anyone migrating
>> > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
>> > > a kernel lacking that, or vice versa.
>> > >
>> > > If a feature is dependent on a host kernel feature we can't turn
>> > > that on automatically as part of the machine type, as we need
>> > > ABI stability across migration independent of kernel version.
>> > >
>> > >
>> > > Regards,
>> > > Daniel
>> >
>> > This is a fundamental problem we have with kernel accelerators.
>> > A higher level solution at management level is needed.
>> > For now yes, we do turn features on by default,
>> > consistent kernels on source and destination are assumed.
>> > For downstreams not a problem at all as they update
>> > userspace and kernel in concert.
>>
>> Even downstream in RHEL that is not actually valid anymore. Container
>> based deployment has killed any assumptions that can be made in this
>> respect. Even if the userspace and kernel are updated in lockstep in
>> a particular RHEL release, you cannot assume the running environment
>> will have a matched pair.
>>
>> Users can be running QEMU userspace from RHEL-8.5 inside a container
>> that has been deployed on a host using a 8.3 kernel. We've even had
>> cases of running QEMU from RHEL-8, on a RHEL-7 host.
>>
>> Regards,
>> Daniel
>
>Is there finally an interest in addressing this then?  This would
>involve collecting host features across a cluster and for each host
>figuring out a configuration that works for migration. IIRC a tool was
>proposed for the task (to live alongside e.g. qemu-img).

Apart from the tool, what if we provide a mechanism for adding/removing 
device features at run-time?
After migration we could tell the guest that a feature is no longer 
available.

Maybe it's too complicated, but it would allow us to solve the problem 
of migrating between different kernels or, with vDPA, between different 
devices that don't support all features.

>
>As long as we just stick to the machine type the best we can do is
>probably to keep doing what we do now (hope that the two host kernels
>are more or less consistent) as otherwise we'd have to never enable any
>new features in vsock.

Should we at least merge this patch to allow migrating a VM between a
new and an old QEMU even if the kernel is the same?
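
For reference, the effect of the patch on the offered features can be
modelled roughly as follows. This is a simplified model and not QEMU code:
struct vsock_model, apply_seqpacket_prop() and get_features() are
hypothetical stand-ins for the new `features` field, the "seqpacket" bit
property and vhost_vsock_get_features() respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_VSOCK_F_SEQPACKET 1

/* Stand-in for the new `features` field in VHostVSock. */
struct vsock_model {
    uint64_t features;
};

/* Models a bit property: "on" sets the bit, "off" clears it. */
void apply_seqpacket_prop(struct vsock_model *dev, bool on)
{
    uint64_t bit = UINT64_C(1) << VIRTIO_VSOCK_F_SEQPACKET;

    if (on) {
        dev->features |= bit;
    } else {
        dev->features &= ~bit;
    }
}

/* Models the patched get_features: OR in the device's own feature bits. */
uint64_t get_features(const struct vsock_model *dev, uint64_t requested)
{
    return requested | dev->features;
}
```

With the property left at its default (on), the SEQPACKET bit is requested
as before. With the hw_compat_6_0 entry forcing it off, machine types older
than 6.1 never request the bit, whatever the host kernel supports.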

Thanks,
Stefano




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-13 12:51         ` Stefano Garzarella
@ 2021-09-13 13:46           ` Michael S. Tsirkin
  2021-09-14 10:42             ` Stefano Garzarella
  0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-09-13 13:46 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov

On Mon, Sep 13, 2021 at 02:51:42PM +0200, Stefano Garzarella wrote:
> On Fri, Sep 10, 2021 at 02:35:53AM -0400, Michael S. Tsirkin wrote:
> > On Thu, Sep 09, 2021 at 10:02:12AM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Sep 09, 2021 at 04:47:42AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> > > > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > > > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > > > > > enabled the SEQPACKET feature bit.
> > > > > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
> > > > > > the host kernel supports SEQPACKET but machine type version is less than
> > > > > > 6.1, we get the following errors:
> > > > > >
> > > > > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
> > > > > >     Failed to load virtio-vhost_vsock:virtio
> > > > > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> > > > > >     load of migration failed: Operation not permitted
> > > > > >
> > > > > > Let's disable the feature bit for machine types < 6.1, adding a
> > > > > > `features` field to VHostVSock to simplify the handling of upcoming
> > > > > > features we will support.
> > > > >
> > > > > IIUC, this will still leave migration broken for anyone migrating
> > > > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
> > > > > a kernel lacking that, or vice versa.
> > > > >
> > > > > If a feature is dependent on a host kernel feature we can't turn
> > > > > that on automatically as part of the machine type, as we need
> > > > > ABI stability across migration independent of kernel version.
> > > > >
> > > > >
> > > > > Regards,
> > > > > Daniel
> > > >
> > > > This is a fundamental problem we have with kernel accelerators.
> > > > A higher level solution at management level is needed.
> > > > For now yes, we do turn features on by default,
> > > > consistent kernels on source and destination are assumed.
> > > > For downstreams not a problem at all as they update
> > > > userspace and kernel in concert.
> > > 
> > > Even downstream in RHEL that is not actually valid anymore. Container
> > > based deployment has killed any assumptions that can be made in this
> > > respect. Even if the userspace and kernel are updated in lockstep in
> > > a particular RHEL release, you cannot assume the running environment
> > > will have a matched pair.
> > > 
> > > Users can be running QEMU userspace from RHEL-8.5 inside a container
> > > that has been deployed on a host using a 8.3 kernel. We've even had
> > > cases of running QEMU from RHEL-8, on a RHEL-7 host.
> > > 
> > > Regards,
> > > Daniel
> > 
> > Is there finally an interest in addressing this then?  This would
> > involve collecting host features across a cluster and for each host
> > figuring out a configuration that works for migration. IIRC a tool was
> > proposed for the task (to live alongside e.g. qemu-img).
> 
> Apart from the tool, what if we provide a mechanism for adding/removing
> device features at run-time?
> After migration we could tell the guest that a feature is no longer
> available.
> 
> Maybe it's too complicated, but it would allow us to solve the problem of
> migrating between different kernels or, with vDPA, between different devices
> that don't support all features.

Possible going forward but not supported by the spec at this point,
and tricky to do generally.
It's possible to do it in a vsock specific way since sockets
are currently closed across migration.


> > 
> > As long as we just stick to the machine type the best we can do is
> > probably to keep doing what we do now (hope that the two host kernels
> > are more or less consistent) as otherwise we'd have to never enable any
> > new features in vsock.
> 
> Should we at least merge this patch to allow migrating a VM between a new
> and an old qemu even if the kernel is the same?
> 
> Thanks,
> Stefano

I'm inclined to do this, yes.

-- 
MST




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-13 13:46           ` Michael S. Tsirkin
@ 2021-09-14 10:42             ` Stefano Garzarella
  2021-09-14 11:49               ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Stefano Garzarella @ 2021-09-14 10:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov

On Mon, Sep 13, 2021 at 09:46:48AM -0400, Michael S. Tsirkin wrote:
>On Mon, Sep 13, 2021 at 02:51:42PM +0200, Stefano Garzarella wrote:
>> On Fri, Sep 10, 2021 at 02:35:53AM -0400, Michael S. Tsirkin wrote:
>> > On Thu, Sep 09, 2021 at 10:02:12AM +0100, Daniel P. Berrangé wrote:
>> > > On Thu, Sep 09, 2021 at 04:47:42AM -0400, Michael S. Tsirkin wrote:
>> > > > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
>> > > > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
>> > > > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
>> > > > > > enabled the SEQPACKET feature bit.
>> > > > > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
>> > > > > > the host kernel supports SEQPACKET but machine type version is less than
>> > > > > > 6.1, we get the following errors:
>> > > > > >
>> > > > > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
>> > > > > >     Failed to load virtio-vhost_vsock:virtio
>> > > > > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
>> > > > > >     load of migration failed: Operation not permitted
>> > > > > >
>> > > > > > Let's disable the feature bit for machine types < 6.1, adding a
>> > > > > > `features` field to VHostVSock to simplify the handling of upcoming
>> > > > > > features we will support.
>> > > > >
>> > > > > IIUC, this will still leave migration broken for anyone migrating
>> > > > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
>> > > > > a kernel lacking that, or vice versa.
>> > > > >
>> > > > > If a feature is dependent on a host kernel feature we can't turn
>> > > > > that on automatically as part of the machine type, as we need
>> > > > > ABI stability across migration independent of kernel version.
>> > > > >
>> > > > >
>> > > > > Regards,
>> > > > > Daniel
>> > > >
>> > > > This is a fundamental problem we have with kernel accelerators.
>> > > > A higher level solution at management level is needed.
>> > > > For now yes, we do turn features on by default,
>> > > > consistent kernels on source and destination are assumed.
>> > > > For downstreams not a problem at all as they update
>> > > > userspace and kernel in concert.
>> > >
>> > > Even downstream in RHEL that is not actually valid anymore. Container
>> > > based deployment has killed any assumptions that can be made in this
>> > > respect. Even if the userspace and kernel are updated in lockstep in
>> > > a particular RHEL release, you cannot assume the running environment
>> > > will have a matched pair.
>> > >
>> > > Users can be running QEMU userspace from RHEL-8.5 inside a container
>> > > that has been deployed on a host using a 8.3 kernel. We've even had
>> > > cases of running QEMU from RHEL-8, on a RHEL-7 host.
>> > >
>> > > Regards,
>> > > Daniel
>> >
>> > Is there finally an interest in addressing this then?  This would
>> > involve collecting host features across a cluster and for each host
>> > figuring out a configuration that works for migration. IIRC a tool was
>> > proposed for the task (to live alongside e.g. qemu-img).
>>
>> Apart from the tool, what if we provide a mechanism for adding/removing
>> device features at run-time?
>> After migration we could tell the guest that a feature is no longer
>> available.
>>
>> Maybe it's too complicated, but it would allow us to solve the problem of
>> migrating between different kernels or, with vDPA, between different devices
>> that don't support all features.
>
>Possible going forward but not supported by the spec at this point,
>and tricky to do generally.
>It's possible to do it in a vsock specific way since sockets
>are currently closed across migration.

Yep, I see.

>
>
>> >
>> > As long as we just stick to the machine type the best we can do is
>> > probably to keep doing what we do now (hope that the two host kernels
>> > are more or less consistent) as otherwise we'd have to never enable any
>> > new features in vsock.
>>
>> Should we at least merge this patch to allow migrating a VM between a new
>> and an old qemu even if the kernel is the same?
>>
>> Thanks,
>> Stefano
>
>I'm inclined to do this, yes.
>

If you haven't queued it yet, I'd like to send a v2 using an
`on,off,auto` property: `auto` would be the current behavior when
enabled, while `on` would require that the kernel supports the feature
and fail with an error otherwise.
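
A minimal sketch of how such a tri-state property could resolve the
feature bit. This is not the v2 patch: the type and function names are
hypothetical, SEQPACKET is assumed to be feature bit 1 (the 0x2 that
differs in the error log above), and `auto` is assumed to mean "enable
only if the host kernel offers the bit".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tri-state, mirroring the proposed on/off/auto values. */
typedef enum {
    SEQPACKET_AUTO,
    SEQPACKET_ON,
    SEQPACKET_OFF
} SeqpacketMode;

/* Assumption: SEQPACKET is feature bit 1 (mask 0x2). */
#define SEQPACKET_FEATURE_BIT 1

/*
 * Compute the feature mask to offer to the guest.
 * Returns 0 and stores the mask in *out on success.
 * Returns -1 if `on` was requested but the kernel lacks the feature;
 * *out is left untouched in that case.
 */
int resolve_seqpacket(SeqpacketMode mode, uint64_t kernel_features,
                      uint64_t *out)
{
    const uint64_t bit = UINT64_C(1) << SEQPACKET_FEATURE_BIT;
    bool supported = (kernel_features & bit) != 0;
    uint64_t features = kernel_features & ~bit; /* start with the bit cleared */

    switch (mode) {
    case SEQPACKET_OFF:
        /* Never offer the feature, whatever the kernel supports. */
        break;
    case SEQPACKET_AUTO:
        /* Offer the feature only when the kernel supports it. */
        if (supported) {
            features |= bit;
        }
        break;
    case SEQPACKET_ON:
        /* Explicit request: missing kernel support is an error. */
        if (!supported) {
            return -1;
        }
        features |= bit;
        break;
    }

    *out = features;
    return 0;
}
```

With this shape, a compat entry for machine types < 6.1 would only need
to select the `off` value, while `on` turns a silent kernel mismatch
into an explicit start-up failure.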

Thanks,
Stefano




* Re: [PATCH] vhost-vsock: fix migration issue when seqpacket is supported
  2021-09-14 10:42             ` Stefano Garzarella
@ 2021-09-14 11:49               ` Michael S. Tsirkin
  0 siblings, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-09-14 11:49 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, Jiang Wang, qemu-stable, qemu-devel,
	Stefan Hajnoczi, Arseny Krasnov

On Tue, Sep 14, 2021 at 12:42:09PM +0200, Stefano Garzarella wrote:
> On Mon, Sep 13, 2021 at 09:46:48AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Sep 13, 2021 at 02:51:42PM +0200, Stefano Garzarella wrote:
> > > On Fri, Sep 10, 2021 at 02:35:53AM -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Sep 09, 2021 at 10:02:12AM +0100, Daniel P. Berrangé wrote:
> > > > > On Thu, Sep 09, 2021 at 04:47:42AM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> > > > > > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > > > > > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > > > > > > > enabled the SEQPACKET feature bit.
> > > > > > > > This commit is released with QEMU 6.1, so if we try to migrate a VM where
> > > > > > > > the host kernel supports SEQPACKET but machine type version is less than
> > > > > > > > 6.1, we get the following errors:
> > > > > > > >
> > > > > > > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
> > > > > > > >     Failed to load virtio-vhost_vsock:virtio
> > > > > > > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> > > > > > > >     load of migration failed: Operation not permitted
> > > > > > > >
> > > > > > > > Let's disable the feature bit for machine types < 6.1, adding a
> > > > > > > > `features` field to VHostVSock to simplify the handling of upcoming
> > > > > > > > features we will support.
> > > > > > >
> > > > > > > IIUC, this will still leave migration broken for anyone migrating
> > > > > > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
> > > > > > > > a kernel lacking that, or vice versa.
> > > > > > > >
> > > > > > > > If a feature is dependent on a host kernel feature we can't turn
> > > > > > > > that on automatically as part of the machine type, as we need
> > > > > > > > ABI stability across migration independent of kernel version.
> > > > > > >
> > > > > > >
> > > > > > > Regards,
> > > > > > > Daniel
> > > > > >
> > > > > > This is a fundamental problem we have with kernel accelerators.
> > > > > > A higher level solution at management level is needed.
> > > > > > For now yes, we do turn features on by default,
> > > > > > consistent kernels on source and destination are assumed.
> > > > > > For downstreams not a problem at all as they update
> > > > > > userspace and kernel in concert.
> > > > >
> > > > > Even downstream in RHEL that is not actually valid anymore. Container
> > > > > based deployment has killed any assumptions that can be made in this
> > > > > respect. Even if the userspace and kernel are updated in lockstep in
> > > > > a particular RHEL release, you cannot assume the running environment
> > > > > will have a matched pair.
> > > > >
> > > > > Users can be running QEMU userspace from RHEL-8.5 inside a container
> > > > > that has been deployed on a host using a 8.3 kernel. We've even had
> > > > > cases of running QEMU from RHEL-8, on a RHEL-7 host.
> > > > >
> > > > > Regards,
> > > > > Daniel
> > > >
> > > > Is there finally an interest in addressing this then?  This would
> > > > involve collecting host features across a cluster and for each host
> > > > figuring out a configuration that works for migration. IIRC a tool was
> > > > proposed for the task (to live alongside e.g. qemu-img).
> > > 
> > > Apart from the tool, what if we provide a mechanism for adding/removing
> > > device features at run-time?
> > > After migration we could tell the guest that a feature is no longer
> > > available.
> > > 
> > > Maybe it's too complicated, but it would allow us to solve the problem of
> > > migrating between different kernels or, with vDPA, between different devices
> > > that don't support all features.
> > 
> > Possible going forward but not supported by the spec at this point,
> > and tricky to do generally.
> > It's possible to do it in a vsock specific way since sockets
> > are currently closed across migration.
> 
> Yep, I see.
> 
> > 
> > 
> > > >
> > > > As long as we just stick to the machine type the best we can do is
> > > > probably to keep doing what we do now (hope that the two host kernels
> > > > are more or less consistent) as otherwise we'd have to never enable any
> > > > new features in vsock.
> > > 
> > > > Should we at least merge this patch to allow migrating a VM between a new
> > > > and an old qemu even if the kernel is the same?
> > > 
> > > Thanks,
> > > Stefano
> > 
> > I'm inclined to do this, yes.
> > 
> 
> If you haven't queued it yet, I'd like to send a v2 using an `on,off,auto`
> property: `auto` would be the current behavior when enabled, while `on`
> would require that the kernel supports the feature and fail with an error
> otherwise.
> 
> Thanks,
> Stefano

Go ahead, please.





Thread overview: 12+ messages
2021-09-07 12:49 [PATCH] vhost-vsock: fix migration issue when seqpacket is supported Stefano Garzarella
2021-09-07 13:22 ` Daniel P. Berrangé
2021-09-07 13:47   ` Stefano Garzarella
2021-09-08 13:41     ` Stefano Garzarella
2021-09-08 13:48       ` Daniel P. Berrangé
2021-09-09  8:47   ` Michael S. Tsirkin
2021-09-09  9:02     ` Daniel P. Berrangé
2021-09-10  6:35       ` Michael S. Tsirkin
2021-09-13 12:51         ` Stefano Garzarella
2021-09-13 13:46           ` Michael S. Tsirkin
2021-09-14 10:42             ` Stefano Garzarella
2021-09-14 11:49               ` Michael S. Tsirkin
