* dpdk/vpp and cross-version migration for vhost
@ 2016-10-13 17:50 ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2016-10-13 17:50 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: dev, Stephen Hemminger, Maxime Coquelin, qemu-devel, libvir-list,
	vpp-dev

Hi!
So it looks like we face a problem with cross-version
migration when using vhost. It's not new but became more
acute with the advent of vhost user.

For users to be able to migrate between different versions
of the hypervisor, the interface exposed to guests
by the hypervisor must stay unchanged.

The problem is that a qemu device is connected
to a backend in another process, so the interface
exposed to guests depends on the capabilities of that
process.

Specifically, for the vhost-user interface based on virtio, this includes
the "host features" bitmap that defines the interface, as well as other
host values such as the max ring size.  Adding new features or changing
values in this interface is required to make progress, but on the other
hand we need the ability to get the old host features to stay compatible.

To solve this problem within qemu, qemu has a versioning system based on
the machine type concept, which fundamentally is a version string: by
specifying that string one can get hardware compatible with a previous
qemu version. QEMU also reports the latest version and the list of versions
supported, so libvirt records the version at VM creation and then is
careful to use this machine version whenever it migrates a VM.

One might wonder how this is solved with a kernel vhost backend. The
answer is that it mostly isn't. Instead an assumption is made that
qemu versions are deployed together with the kernel, which is generally
true for downstreams.  Thus whenever qemu gains a new feature, it is
already supported by the kernel as well.  However, if one attempts
migration with a new qemu from a system with a new kernel to one with an
old kernel, one would get a failure.

In a world where we have multiple userspace backends, with some of
these supplied by ISVs, this assumption seems unrealistic.

IMO we need to support vhost backend versioning, ideally
in a way that will also work for vhost kernel backends.

So I'd like to get some input from both backend and management
developers on what a good solution would look like.

If we want to emulate the qemu solution, this involves adding the
concept of interface versions to dpdk.  For example, dpdk could supply a
file (or a utility that prints it?) with the list of versions: the latest
one and all versions supported. libvirt could read that and
- store latest version at vm creation
- pass it around with the vm
- pass it to qemu

From here, qemu could pass this over the vhost-user channel,
thus making sure it's initialized with the correct
compatible interface.

As version here is an opaque string for libvirt and qemu,
anything can be used - but I suggest either a list
of values defining the interface, e.g.
any_layout=on,max_ring=256
or a version including the name and vendor of the backend,
e.g. "org.dpdk.v4.5.6".
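
A minimal sketch of how the first option (a list of values) could be
consumed, purely as an illustration. The function names and the
"supported values" table are hypothetical; no such format exists in
dpdk, libvirt or qemu today. Only the example string itself comes from
the proposal above.

```python
def parse_interface(version):
    """Split a string such as 'any_layout=on,max_ring=256' into a dict."""
    fields = {}
    for item in version.split(","):
        key, sep, value = item.strip().partition("=")
        if not key or not sep:
            raise ValueError(f"malformed interface item: {item!r}")
        fields[key] = value
    return fields


def backend_can_provide(requested, supported):
    """True if every requested key=value is one the backend can expose.

    `supported` maps each key to the set of values the backend accepts
    (a hypothetical description a backend could ship in a file).
    """
    return all(value in supported.get(key, set())
               for key, value in requested.items())


requested = parse_interface("any_layout=on,max_ring=256")

# Hypothetical backends: the new one knows both knobs, the old one only max_ring.
new_backend = {"any_layout": {"on", "off"}, "max_ring": {"256", "1024"}}
old_backend = {"max_ring": {"256"}}

print(backend_can_provide(requested, new_backend))  # True
print(backend_can_provide(requested, old_backend))  # False
```

Since the string stays opaque to libvirt and qemu, only the backend (or
a helper shipped with it) would need logic like this.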

Note that typically the list of supported versions can only be
extended, not shrunk. Also, if the host/guest interface
does not change, don't change the current version as
this just creates work for everyone.

Thoughts? Would this work well for management? dpdk? vpp?

Thanks!

-- 
MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-10-13 17:50 ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-11-16 20:43   ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-11-16 20:43 UTC (permalink / raw)
  To: Michael S. Tsirkin, Yuanhan Liu
  Cc: dev, Stephen Hemminger, qemu-devel, libvir-list, vpp-dev

Hi Michael,

On 10/13/2016 07:50 PM, Michael S. Tsirkin wrote:
> Hi!
> So it looks like we face a problem with cross-version
> migration when using vhost. It's not new but became more
> acute with the advent of vhost user.
>
> For users to be able to migrate between different versions
> of the hypervisor the interface exposed to guests
> by hypervisor must stay unchanged.
>
> The problem is that a qemu device is connected
> to a backend in another process, so the interface
> exposed to guests depends on the capabilities of that
> process.
>
> Specifically, for vhost user interface based on virtio, this includes
> the "host features" bitmap that defines the interface, as well as more
> host values such as the max ring size.  Adding new features/changing
> values to this interface is required to make progress, but on the other
> hand we need ability to get the old host features to be compatible.
>
> To solve this problem within qemu, qemu has a versioning system based on
> a machine type concept which fundamentally is a version string, by
> specifying that string one can get hardware compatible with a previous
> qemu version. QEMU also reports the latest version and list of versions
> supported so libvirt records the version at VM creation and then is
> careful to use this machine version whenever it migrates a VM.
>
> One might wonder how is this solved with a kernel vhost backend. The
> answer is that it mostly isn't - instead an assumption is made, that
> qemu versions are deployed together with the kernel - this is generally
> true for downstreams.  Thus whenever qemu gains a new feature, it is
> already supported by the kernel as well.  However, if one attempts
> migration with a new qemu from a system with a new kernel to one with an
> old kernel, one would get a failure.
>
> In the world where we have multiple userspace backends, with some of
> these supplied by ISVs, this seems non-realistic.
>
> IMO we need to support vhost backend versioning, ideally
> in a way that will also work for vhost kernel backends.
>
> So I'd like to get some input from both backend and management
> developers on what a good solution would look like.
>
> If we want to emulate the qemu solution, this involves adding the
> concept of interface versions to dpdk.  For example, dpdk could supply a
> file (or utility printing?) with list of versions: latest and versions
> supported. libvirt could read that and

So if I understand correctly, it would be generated at build time?
One problem I see is that DPDK's vhost-user lib API provides a way
to disable features:
"
rte_vhost_feature_disable/rte_vhost_feature_enable(feature_mask)

This function disables/enables some features. For example, it can be 
used to disable mergeable buffers and TSO features, which both are 
enabled by default.
"

I think we should not have this capability on the host side: it should be
the guest's decision whether to use a feature or not, and if it has to be
done on the host, QEMU already provides a way to disable features (moreover
per-device, which is not the case with rte_vhost_feature_disable).
IMHO, we should consider deprecating this API in v17.02.

That said, the API is here, and it would break migration if the version
file advertises some features the vSwitch has disabled at runtime.
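
To illustrate the mismatch, here is a small sketch in plain Python. It
does not call any DPDK function; the masks merely model what a
build-time version file would advertise versus what remains after a
vSwitch disables a feature at runtime. The bit numbers are the virtio
net feature bits for host TSO4 (11) and mergeable RX buffers (15); the
helper name is hypothetical.

```python
VIRTIO_NET_F_HOST_TSO4 = 11
VIRTIO_NET_F_MRG_RXBUF = 15


def bit(n):
    return 1 << n


# What a version file generated at build time would claim.
advertised = bit(VIRTIO_NET_F_HOST_TSO4) | bit(VIRTIO_NET_F_MRG_RXBUF)

# Models a vSwitch that disabled mergeable buffers at runtime.
disabled_at_runtime = bit(VIRTIO_NET_F_MRG_RXBUF)
effective = advertised & ~disabled_at_runtime


def missing_features(promised, offered):
    """Bits that were promised but are not actually offered."""
    return promised & ~offered


missing = missing_features(advertised, effective)
print(hex(missing))  # 0x8000, i.e. bit 15 (mergeable RX buffers)
```

A non-zero result means the file promises something the running backend
no longer offers, which is the case that would break migration.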

> - store latest version at vm creation
> - pass it around with the vm
> - pass it to qemu
> From here, qemu could pass this over the vhost-user channel,
> thus making sure it's initialized with the correct
> compatible interface.

Using vhost-user protocol features I guess?

> As version here is an opaque string for libvirt and qemu,
> anything can be used - but I suggest either a list
> of values defining the interface, e.g.
> any_layout=on,max_ring=256
> or a version including the name and vendor of the backend,
> e.g. "org.dpdk.v4.5.6".

I think the first option provides more flexibility.
For example, we could imagine migrating from a process using DPDK's
vhost-user lib, to another process using its own implementation (VPP
has its own implementation currently if I'm not mistaken).
Maybe this scenario does not make sense, but in this case, exposing
values directly would avoid the need for synchronization between
vhost-user implementations.

>
> Note that typically the list of supported versions can only be
> extended, not shrunk. Also, if the host/guest interface
> does not change, don't change the current version as
> this just creates work for everyone.
>
> Thoughts? Would this work well for management? dpdk? vpp?

One thing I'm not clear on is how it will work for the MTU feature, if the
process the VM is migrated to exposes a larger MTU that the guest doesn't
support (if it has sized its receive buffers to the pre-migration MTU, for
example).

Thanks,
Maxime

* Re: dpdk/vpp and cross-version migration for vhost
  2016-10-13 17:50 ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-11-17  8:29   ` Yuanhan Liu
  -1 siblings, 0 replies; 44+ messages in thread
From: Yuanhan Liu @ 2016-11-17  8:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: dev, Stephen Hemminger, Maxime Coquelin, qemu-devel, libvir-list,
	vpp-dev, Marc-André Lureau

As usual, sorry for the late response :/

On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> Hi!
> So it looks like we face a problem with cross-version
> migration when using vhost. It's not new but became more
> acute with the advent of vhost user.
> 
> For users to be able to migrate between different versions
> of the hypervisor the interface exposed to guests
> by hypervisor must stay unchanged.
> 
> The problem is that a qemu device is connected
> to a backend in another process, so the interface
> exposed to guests depends on the capabilities of that
> process.
> 
> Specifically, for vhost user interface based on virtio, this includes
> the "host features" bitmap that defines the interface, as well as more
> host values such as the max ring size.  Adding new features/changing
> values to this interface is required to make progress, but on the other
> hand we need ability to get the old host features to be compatible.

This looks like the same issue as vhost-user reconnect to me. For example,

- start dpdk 16.07 & qemu 2.5
- kill dpdk
- start dpdk 16.11

Though DPDK 16.11 has more features than DPDK 16.07 (say, indirect
descriptors), the above should work, because qemu saves the negotiated
features before the disconnect and restores them after the reconnection.

    commit a463215b087c41d7ca94e51aa347cde523831873
    Author: Marc-André Lureau <marcandre.lureau@redhat.com>
    Date:   Mon Jun 6 18:45:05 2016 +0200
    
        vhost-net: save & restore vhost-user acked features
    
        The initial vhost-user connection sets the features to be negotiated
        with the driver. Renegotiation isn't possible without device reset.
    
        To handle reconnection of vhost-user backend, ensure the same set of
        features are provided, and reuse already acked features.
    
        Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>


So could we do something similar to vhost-user reconnect? I mean, save
the acked features before migration and restore them after it. This should
be able to keep the compatibility. If the user downgrades the DPDK version,
that could also be easily detected, and we could then exit with an error to
the user: migration failed due to incompatible vhost features.
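
A rough sketch of that check, as a model only: this is not qemu code,
and the function and exception names are hypothetical. It treats the
acked features as a bitmask carried with the migration and compares it
with what the destination backend offers. Bits 15 and 28 are the virtio
feature bits for mergeable RX buffers and indirect descriptors.

```python
VIRTIO_NET_F_MRG_RXBUF = 15
VIRTIO_RING_F_INDIRECT_DESC = 28


class IncompatibleBackend(Exception):
    """The destination backend lacks a feature the guest already acked."""


def restore_acked_features(acked, backend_features):
    """Return the feature set to restore, or fail if the backend lacks any."""
    missing = acked & ~backend_features
    if missing:
        raise IncompatibleBackend(
            "migration failed due to incompatible vhost features: "
            f"missing bits {missing:#x}")
    return acked


dpdk_old = 1 << VIRTIO_NET_F_MRG_RXBUF
dpdk_new = dpdk_old | (1 << VIRTIO_RING_F_INDIRECT_DESC)

# Upgrade: the guest acked only what the old backend offered, so the
# extra feature on the new backend simply stays unused.
print(hex(restore_acked_features(dpdk_old, dpdk_new)))  # 0x8000

# Downgrade: the guest acked indirect descriptors, which the old backend lacks.
try:
    restore_acked_features(dpdk_new, dpdk_old)
except IncompatibleBackend as err:
    print(err)
```

Note that the failure in the downgrade case is only detected once the
destination backend has been queried.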

Just some rough thoughts. Does that make any sense?

	--yliu
> 
> To solve this problem within qemu, qemu has a versioning system based on
> a machine type concept which fundamentally is a version string, by
> specifying that string one can get hardware compatible with a previous
> qemu version. QEMU also reports the latest version and list of versions
> supported so libvirt records the version at VM creation and then is
> careful to use this machine version whenever it migrates a VM.
> 
> One might wonder how is this solved with a kernel vhost backend. The
> answer is that it mostly isn't - instead an assumption is made, that
> qemu versions are deployed together with the kernel - this is generally
> true for downstreams.  Thus whenever qemu gains a new feature, it is
> already supported by the kernel as well.  However, if one attempts
> migration with a new qemu from a system with a new kernel to one with an
> old kernel, one would get a failure.
> 
> In the world where we have multiple userspace backends, with some of
> these supplied by ISVs, this seems non-realistic.
> 
> IMO we need to support vhost backend versioning, ideally
> in a way that will also work for vhost kernel backends.
> 
> So I'd like to get some input from both backend and management
> developers on what a good solution would look like.
> 
> If we want to emulate the qemu solution, this involves adding the
> concept of interface versions to dpdk.  For example, dpdk could supply a
> file (or utility printing?) with list of versions: latest and versions
> supported. libvirt could read that and
> - store latest version at vm creation
> - pass it around with the vm
> - pass it to qemu
> 
> From here, qemu could pass this over the vhost-user channel,
> thus making sure it's initialized with the correct
> compatible interface.
> 
> As version here is an opaque string for libvirt and qemu,
> anything can be used - but I suggest either a list
> of values defining the interface, e.g.
> any_layout=on,max_ring=256
> or a version including the name and vendor of the backend,
> e.g. "org.dpdk.v4.5.6".
> 
> Note that typically the list of supported versions can only be
> extended, not shrunk. Also, if the host/guest interface
> does not change, don't change the current version as
> this just creates work for everyone.
> 
> Thoughts? Would this work well for management? dpdk? vpp?
> 
> Thanks!
> 
> -- 
> MST

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-17  8:29   ` [Qemu-devel] " Yuanhan Liu
@ 2016-11-17  8:47     ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-11-17  8:47 UTC (permalink / raw)
  To: Yuanhan Liu, Michael S. Tsirkin
  Cc: dev, Stephen Hemminger, qemu-devel, libvir-list, vpp-dev,
	Marc-André Lureau



On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> As usual, sorry for the late response :/
>
> On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
>> Hi!
>> So it looks like we face a problem with cross-version
>> migration when using vhost. It's not new but became more
>> acute with the advent of vhost user.
>>
>> For users to be able to migrate between different versions
>> of the hypervisor the interface exposed to guests
>> by hypervisor must stay unchanged.
>>
>> The problem is that a qemu device is connected
>> to a backend in another process, so the interface
>> exposed to guests depends on the capabilities of that
>> process.
>>
>> Specifically, for vhost user interface based on virtio, this includes
>> the "host features" bitmap that defines the interface, as well as more
>> host values such as the max ring size.  Adding new features/changing
>> values to this interface is required to make progress, but on the other
>> hand we need ability to get the old host features to be compatible.
>
> It looks like to the same issue of vhost-user reconnect to me. For example,
>
> - start dpdk 16.07 & qemu 2.5
> - kill dpdk
> - start dpdk 16.11
>
> Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
> above should work. Because qemu saves the negotiated features before the
> disconnect and stores it back after the reconnection.
>
>     commit a463215b087c41d7ca94e51aa347cde523831873
>     Author: Marc-André Lureau <marcandre.lureau@redhat.com>
>     Date:   Mon Jun 6 18:45:05 2016 +0200
>
>         vhost-net: save & restore vhost-user acked features
>
>         The initial vhost-user connection sets the features to be negotiated
>         with the driver. Renegotiation isn't possible without device reset.
>
>         To handle reconnection of vhost-user backend, ensure the same set of
>         features are provided, and reuse already acked features.
>
>         Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>
>
> So we could do similar to vhost-user? I mean, save the acked features
> before migration and store it back after it. This should be able to
> keep the compatibility. If user downgrades DPDK version, it also could
> be easily detected, and then exit with an error to user: migration
> failed due to un-compatible vhost features.
>
> Just some rough thoughts. Makes tiny sense?

My understanding is that the management tool has to know whether
versions are compatible before initiating the migration:
  1. The downtime could be unpredictable if a VM has to move from host
     to host multiple times, which is problematic, especially for NFV.
  2. If migration is not possible, maybe the management tool would
     prefer not to interrupt the VM on the current host.
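
As an illustration of such a pre-flight check, the sketch below filters
candidate hosts before any migration is started. It assumes the
management tool can obtain each host's offered feature mask up front,
which nothing provides today; the host names and the function are
hypothetical.

```python
def compatible_destinations(acked, candidates):
    """Return the hosts whose backend offers every feature the guest acked."""
    return [host for host, offered in candidates.items()
            if (acked & ~offered) == 0]


# Guest acked mergeable RX buffers (bit 15) and indirect descriptors (bit 28).
acked = (1 << 15) | (1 << 28)

candidates = {
    "host-a": (1 << 15),                          # older backend, no indirect
    "host-b": (1 << 15) | (1 << 28),
    "host-c": (1 << 11) | (1 << 15) | (1 << 28),  # superset is fine
}

print(compatible_destinations(acked, candidates))  # ['host-b', 'host-c']
```

With a check like this the VM is never paused for a migration that
cannot succeed.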

I have little experience with migration though, so I could be mistaken.

Thanks,
Maxime

>
> 	--yliu
>>
>> To solve this problem within qemu, qemu has a versioning system based on
>> a machine type concept which fundamentally is a version string, by
>> specifying that string one can get hardware compatible with a previous
>> qemu version. QEMU also reports the latest version and list of versions
>> supported so libvirt records the version at VM creation and then is
>> careful to use this machine version whenever it migrates a VM.
>>
>> One might wonder how is this solved with a kernel vhost backend. The
>> answer is that it mostly isn't - instead an assumption is made, that
>> qemu versions are deployed together with the kernel - this is generally
>> true for downstreams.  Thus whenever qemu gains a new feature, it is
>> already supported by the kernel as well.  However, if one attempts
>> migration with a new qemu from a system with a new to old kernel, one
>> would get a failure.
>>
>> In the world where we have multiple userspace backends, with some of
>> these supplied by ISVs, this seems non-realistic.
>>
>> IMO we need to support vhost backend versioning, ideally
>> in a way that will also work for vhost kernel backends.
>>
>> So I'd like to get some input from both backend and management
>> developers on what a good solution would look like.
>>
>> If we want to emulate the qemu solution, this involves adding the
>> concept of interface versions to dpdk.  For example, dpdk could supply a
>> file (or utility printing?) with list of versions: latest and versions
>> supported. libvirt could read that and
>> - store latest version at vm creation
>> - pass it around with the vm
>> - pass it to qemu
>>
>> >From here, qemu could pass this over the vhost-user channel,
>> thus making sure it's initialized with the correct
>> compatible interface.
>>
>> As version here is an opaque string for libvirt and qemu,
>> anything can be used - but I suggest either a list
>> of values defining the interface, e.g.
>> any_layout=on,max_ring=256
>> or a version including the name and vendor of the backend,
>> e.g. "org.dpdk.v4.5.6".
>>
>> Note that typically the list of supported versions can only be
>> extended, not shrunk. Also, if the host/guest interface
>> does not change, don't change the current version as
>> this just creates work for everyone.
>>
>> Thoughts? Would this work well for management? dpdk? vpp?
>>
>> Thanks!
>>
>> --
>> MST

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-17  8:47     ` [Qemu-devel] " Maxime Coquelin
@ 2016-11-17  9:49       ` Yuanhan Liu
  -1 siblings, 0 replies; 44+ messages in thread
From: Yuanhan Liu @ 2016-11-17  9:49 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Michael S. Tsirkin, dev, Stephen Hemminger, qemu-devel,
	libvir-list, vpp-dev, Marc-André Lureau

On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> 
> 
> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> >As usaual, sorry for late response :/
> >
> >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> >>Hi!
> >>So it looks like we face a problem with cross-version
> >>migration when using vhost. It's not new but became more
> >>acute with the advent of vhost user.
> >>
> >>For users to be able to migrate between different versions
> >>of the hypervisor the interface exposed to guests
> >>by hypervisor must stay unchanged.
> >>
> >>The problem is that a qemu device is connected
> >>to a backend in another process, so the interface
> >>exposed to guests depends on the capabilities of that
> >>process.
> >>
> >>Specifically, for vhost user interface based on virtio, this includes
> >>the "host features" bitmap that defines the interface, as well as more
> >>host values such as the max ring size.  Adding new features/changing
> >>values to this interface is required to make progress, but on the other
> >>hand we need ability to get the old host features to be compatible.
> >
> >It looks like to the same issue of vhost-user reconnect to me. For example,
> >
> >- start dpdk 16.07 & qemu 2.5
> >- kill dpdk
> >- start dpdk 16.11
> >
> >Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
> >above should work. Because qemu saves the negotiated features before the
> >disconnect and stores it back after the reconnection.
> >
> >    commit a463215b087c41d7ca94e51aa347cde523831873
> >    Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> >    Date:   Mon Jun 6 18:45:05 2016 +0200
> >
> >        vhost-net: save & restore vhost-user acked features
> >
> >        The initial vhost-user connection sets the features to be negotiated
> >        with the driver. Renegotiation isn't possible without device reset.
> >
> >        To handle reconnection of vhost-user backend, ensure the same set of
> >        features are provided, and reuse already acked features.
> >
> >        Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >
> >
> >So we could do similar to vhost-user? I mean, save the acked features
> >before migration and store it back after it. This should be able to
> >keep the compatibility. If user downgrades DPDK version, it also could
> >be easily detected, and then exit with an error to user: migration
> >failed due to un-compatible vhost features.
> >
> >Just some rough thoughts. Makes tiny sense?
> 
> My understanding is that the management tool has to know whether
> versions are compatible before initiating the migration:

Makes sense. How about getting and restoring the acked features through
QEMU itself then: read them through the monitor interface on the source
and pass them back on the command line on the destination?

With that, it would be something like:

- start vhost-user backend (DPDK, VPP, or whatever) & qemu on the source host

- read the acked features (through monitor interface)

- start vhost-user backend on the destination host

- start qemu on the destination host with the just queried acked features

  QEMU is then expected to use this feature set for the later vhost-user
  feature negotiation, and to exit if feature compatibility is broken.
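
For illustration only, the compatibility check in the last step might be
sketched like this in Python. The feature bitmaps are treated as plain
integers; how they are read from the monitor or passed to QEMU is left
open, since no such command exists yet. The two bit numbers are the
standard virtio ones; the function names are hypothetical.

```python
# Virtio feature bit numbers (as in the Linux virtio headers).
VIRTIO_F_ANY_LAYOUT = 27
VIRTIO_RING_F_INDIRECT_DESC = 28

def missing_features(acked, offered):
    """Bits the guest already acked that the destination does not offer."""
    return acked & ~offered

def check_destination(acked, offered):
    """Raise if the destination backend cannot honour the acked features."""
    missing = missing_features(acked, offered)
    if missing:
        raise RuntimeError(f"incompatible vhost features: missing {missing:#x}")

# Source guest acked both features; an older destination backend only
# offers any_layout, so the indirect-descriptor bit is reported missing.
acked = (1 << VIRTIO_F_ANY_LAYOUT) | (1 << VIRTIO_RING_F_INDIRECT_DESC)
older_backend = 1 << VIRTIO_F_ANY_LAYOUT
print(hex(missing_features(acked, older_backend)))  # 0x10000000
```

A destination that offers a superset of the acked bits passes the check,
which matches the upgrade case (dpdk 16.07 to 16.11) mentioned earlier.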

Thoughts?

	--yliu

>  1. The downtime could be unpredictable if a VM has to move from hosts
>     to hosts multiple times, which is problematic, especially for NFV.
>  2. If migration is not possible, maybe the management tool would
>     prefer not to interrupt the VM on current host.
> 
> I have little experience with migration though, so I could be mistaken.
> 
> Thanks,
> Maxime
> 
> >
> >	--yliu
> >>
> >>To solve this problem within qemu, qemu has a versioning system based on
> >>a machine type concept which fundamentally is a version string, by
> >>specifying that string one can get hardware compatible with a previous
> >>qemu version. QEMU also reports the latest version and list of versions
> >>supported so libvirt records the version at VM creation and then is
> >>careful to use this machine version whenever it migrates a VM.
> >>
> >>One might wonder how is this solved with a kernel vhost backend. The
> >>answer is that it mostly isn't - instead an assumption is made, that
> >>qemu versions are deployed together with the kernel - this is generally
> >>true for downstreams.  Thus whenever qemu gains a new feature, it is
> >>already supported by the kernel as well.  However, if one attempts
> >>migration with a new qemu from a system with a new to old kernel, one
> >>would get a failure.
> >>
> >>In the world where we have multiple userspace backends, with some of
> >>these supplied by ISVs, this seems non-realistic.
> >>
> >>IMO we need to support vhost backend versioning, ideally
> >>in a way that will also work for vhost kernel backends.
> >>
> >>So I'd like to get some input from both backend and management
> >>developers on what a good solution would look like.
> >>
> >>If we want to emulate the qemu solution, this involves adding the
> >>concept of interface versions to dpdk.  For example, dpdk could supply a
> >>file (or utility printing?) with list of versions: latest and versions
> >>supported. libvirt could read that and
> >>- store latest version at vm creation
> >>- pass it around with the vm
> >>- pass it to qemu
> >>
> >>>From here, qemu could pass this over the vhost-user channel,
> >>thus making sure it's initialized with the correct
> >>compatible interface.
> >>
> >>As version here is an opaque string for libvirt and qemu,
> >>anything can be used - but I suggest either a list
> >>of values defining the interface, e.g.
> >>any_layout=on,max_ring=256
> >>or a version including the name and vendor of the backend,
> >>e.g. "org.dpdk.v4.5.6".
> >>
> >>Note that typically the list of supported versions can only be
> >>extended, not shrunk. Also, if the host/guest interface
> >>does not change, don't change the current version as
> >>this just creates work for everyone.
> >>
> >>Thoughts? Would this work well for management? dpdk? vpp?
> >>
> >>Thanks!
> >>
> >>--
> >>MST

* Re: [vpp-dev] dpdk/vpp and cross-version migration for vhost
  2016-11-17  9:49       ` [Qemu-devel] " Yuanhan Liu
@ 2016-11-17 15:25         ` Thomas F Herbert
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas F Herbert @ 2016-11-17 15:25 UTC (permalink / raw)
  To: Yuanhan Liu, Maxime Coquelin
  Cc: Michael S. Tsirkin, dev, qemu-devel, libvir-list, vpp-dev,
	Marc-André Lureau, Billy McFall

+Billy McFall


On 11/17/2016 04:49 AM, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
>>
>> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
>>> As usaual, sorry for late response :/
>>>
>>> On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
>>>> Hi!
>>>> So it looks like we face a problem with cross-version
>>>> migration when using vhost. It's not new but became more
>>>> acute with the advent of vhost user.
>>>>
>>>> For users to be able to migrate between different versions
>>>> of the hypervisor the interface exposed to guests
>>>> by hypervisor must stay unchanged.
>>>>
>>>> The problem is that a qemu device is connected
>>>> to a backend in another process, so the interface
>>>> exposed to guests depends on the capabilities of that
>>>> process.
>>>>
>>>> Specifically, for vhost user interface based on virtio, this includes
>>>> the "host features" bitmap that defines the interface, as well as more
>>>> host values such as the max ring size.  Adding new features/changing
>>>> values to this interface is required to make progress, but on the other
>>>> hand we need ability to get the old host features to be compatible.
>>> It looks like to the same issue of vhost-user reconnect to me. For example,
>>>
>>> - start dpdk 16.07 & qemu 2.5
>>> - kill dpdk
>>> - start dpdk 16.11
>>>
>>> Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
>>> above should work. Because qemu saves the negotiated features before the
>>> disconnect and stores it back after the reconnection.
>>>
>>>     commit a463215b087c41d7ca94e51aa347cde523831873
>>>     Author: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>     Date:   Mon Jun 6 18:45:05 2016 +0200
>>>
>>>         vhost-net: save & restore vhost-user acked features
>>>
>>>         The initial vhost-user connection sets the features to be negotiated
>>>         with the driver. Renegotiation isn't possible without device reset.
>>>
>>>         To handle reconnection of vhost-user backend, ensure the same set of
>>>         features are provided, and reuse already acked features.
>>>
>>>         Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>
>>>
>>> So we could do similar to vhost-user? I mean, save the acked features
>>> before migration and store it back after it. This should be able to
>>> keep the compatibility. If user downgrades DPDK version, it also could
>>> be easily detected, and then exit with an error to user: migration
>>> failed due to un-compatible vhost features.
>>>
>>> Just some rough thoughts. Makes tiny sense?
>> My understanding is that the management tool has to know whether
>> versions are compatible before initiating the migration:
> Makes sense. How about getting and restoring the acked features through
> qemu command lines then, say, through the monitor interface?
>
> With that, it would be something like:
>
> - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
>
> - read the acked features (through monitor interface)
>
> - start vhost-user backend in the dst host
>
> - start qemu in the dst host with the just queried acked features
>
>    QEMU then is expected to use this feature set for the later vhost-user
>    feature negotitation. Exit if features compatibility is broken.
>
> Thoughts?
>
> 	--yliu
>
>>   1. The downtime could be unpredictable if a VM has to move from hosts
>>      to hosts multiple times, which is problematic, especially for NFV.
>>   2. If migration is not possible, maybe the management tool would
>>      prefer not to interrupt the VM on current host.
>>
>> I have little experience with migration though, so I could be mistaken.
>>
>> Thanks,
>> Maxime
>>
>>> 	--yliu
>>>> To solve this problem within qemu, qemu has a versioning system based on
>>>> a machine type concept which fundamentally is a version string, by
>>>> specifying that string one can get hardware compatible with a previous
>>>> qemu version. QEMU also reports the latest version and list of versions
>>>> supported so libvirt records the version at VM creation and then is
>>>> careful to use this machine version whenever it migrates a VM.
>>>>
>>>> One might wonder how is this solved with a kernel vhost backend. The
>>>> answer is that it mostly isn't - instead an assumption is made, that
>>>> qemu versions are deployed together with the kernel - this is generally
>>>> true for downstreams.  Thus whenever qemu gains a new feature, it is
>>>> already supported by the kernel as well.  However, if one attempts
>>>> migration with a new qemu from a system with a new to old kernel, one
>>>> would get a failure.
>>>>
>>>> In the world where we have multiple userspace backends, with some of
>>>> these supplied by ISVs, this seems non-realistic.
>>>>
>>>> IMO we need to support vhost backend versioning, ideally
>>>> in a way that will also work for vhost kernel backends.
>>>>
>>>> So I'd like to get some input from both backend and management
>>>> developers on what a good solution would look like.
>>>>
>>>> If we want to emulate the qemu solution, this involves adding the
>>>> concept of interface versions to dpdk.  For example, dpdk could supply a
>>>> file (or utility printing?) with list of versions: latest and versions
>>>> supported. libvirt could read that and
>>>> - store latest version at vm creation
>>>> - pass it around with the vm
>>>> - pass it to qemu
>>>>
>>>> >From here, qemu could pass this over the vhost-user channel,
>>>> thus making sure it's initialized with the correct
>>>> compatible interface.
>>>>
>>>> As version here is an opaque string for libvirt and qemu,
>>>> anything can be used - but I suggest either a list
>>>> of values defining the interface, e.g.
>>>> any_layout=on,max_ring=256
>>>> or a version including the name and vendor of the backend,
>>>> e.g. "org.dpdk.v4.5.6".
>>>>
>>>> Note that typically the list of supported versions can only be
>>>> extended, not shrunk. Also, if the host/guest interface
>>>> does not change, don't change the current version as
>>>> this just creates work for everyone.
>>>>
>>>> Thoughts? Would this work well for management? dpdk? vpp?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> MST

-- 
*Thomas F Herbert*
SDN Group
Office of Technology
*Red Hat*

* Re: [Qemu-devel] [vpp-dev] dpdk/vpp and cross-version migration for vhost
@ 2016-11-17 15:25         ` Thomas F Herbert
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas F Herbert @ 2016-11-17 15:25 UTC (permalink / raw)
  To: Yuanhan Liu, Maxime Coquelin
  Cc: Michael S. Tsirkin, dev, qemu-devel, libvir-list, vpp-dev,
	Marc-André Lureau, Billy McFall

+Billy McFall


On 11/17/2016 04:49 AM, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
>>
>> On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
>>> As usaual, sorry for late response :/
>>>
>>> On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
>>>> Hi!
>>>> So it looks like we face a problem with cross-version
>>>> migration when using vhost. It's not new but became more
>>>> acute with the advent of vhost user.
>>>>
>>>> For users to be able to migrate between different versions
>>>> of the hypervisor the interface exposed to guests
>>>> by hypervisor must stay unchanged.
>>>>
>>>> The problem is that a qemu device is connected
>>>> to a backend in another process, so the interface
>>>> exposed to guests depends on the capabilities of that
>>>> process.
>>>>
>>>> Specifically, for vhost user interface based on virtio, this includes
>>>> the "host features" bitmap that defines the interface, as well as more
>>>> host values such as the max ring size.  Adding new features/changing
>>>> values to this interface is required to make progress, but on the other
>>>> hand we need ability to get the old host features to be compatible.
>>> It looks like to the same issue of vhost-user reconnect to me. For example,
>>>
>>> - start dpdk 16.07 & qemu 2.5
>>> - kill dpdk
>>> - start dpdk 16.11
>>>
>>> Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
>>> above should work. Because qemu saves the negotiated features before the
>>> disconnect and stores it back after the reconnection.
>>>
>>>     commit a463215b087c41d7ca94e51aa347cde523831873
>>>     Author: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>     Date:   Mon Jun 6 18:45:05 2016 +0200
>>>
>>>         vhost-net: save & restore vhost-user acked features
>>>
>>>         The initial vhost-user connection sets the features to be negotiated
>>>         with the driver. Renegotiation isn't possible without device reset.
>>>
>>>         To handle reconnection of vhost-user backend, ensure the same set of
>>>         features are provided, and reuse already acked features.
>>>
>>>         Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>
>>>
>>> So we could do similar to vhost-user? I mean, save the acked features
>>> before migration and store it back after it. This should be able to
>>> keep the compatibility. If user downgrades DPDK version, it also could
>>> be easily detected, and then exit with an error to user: migration
>>> failed due to un-compatible vhost features.
>>>
>>> Just some rough thoughts. Makes tiny sense?
>> My understanding is that the management tool has to know whether
>> versions are compatible before initiating the migration:
> Makes sense. How about getting and restoring the acked features through
> qemu command lines then, say, through the monitor interface?
>
> With that, it would be something like:
>
> - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
>
> - read the acked features (through monitor interface)
>
> - start vhost-user backend in the dst host
>
> - start qemu in the dst host with the just queried acked features
>
>    QEMU then is expected to use this feature set for the later vhost-user
>    feature negotitation. Exit if features compatibility is broken.
>
> Thoughts?
>
> 	--yliu
>
>>   1. The downtime could be unpredictable if a VM has to move from hosts
>>      to hosts multiple times, which is problematic, especially for NFV.
>>   2. If migration is not possible, maybe the management tool would
>>      prefer not to interrupt the VM on current host.
>>
>> I have little experience with migration though, so I could be mistaken.
>>
>> Thanks,
>> Maxime
>>
>>> 	--yliu
>>>> To solve this problem within qemu, qemu has a versioning system based on
>>>> a machine type concept which fundamentally is a version string, by
>>>> specifying that string one can get hardware compatible with a previous
>>>> qemu version. QEMU also reports the latest version and list of versions
>>>> supported so libvirt records the version at VM creation and then is
>>>> careful to use this machine version whenever it migrates a VM.
>>>>
>>>> One might wonder how is this solved with a kernel vhost backend. The
>>>> answer is that it mostly isn't - instead an assumption is made, that
>>>> qemu versions are deployed together with the kernel - this is generally
>>>> true for downstreams.  Thus whenever qemu gains a new feature, it is
>>>> already supported by the kernel as well.  However, if one attempts
>>>> migration with a new qemu from a system with a new to old kernel, one
>>>> would get a failure.
>>>>
>>>> In the world where we have multiple userspace backends, with some of
>>>> these supplied by ISVs, this seems non-realistic.
>>>>
>>>> IMO we need to support vhost backend versioning, ideally
>>>> in a way that will also work for vhost kernel backends.
>>>>
>>>> So I'd like to get some input from both backend and management
>>>> developers on what a good solution would look like.
>>>>
>>>> If we want to emulate the qemu solution, this involves adding the
>>>> concept of interface versions to dpdk.  For example, dpdk could supply a
>>>> file (or utility printing?) with list of versions: latest and versions
>>>> supported. libvirt could read that and
>>>> - store latest version at vm creation
>>>> - pass it around with the vm
>>>> - pass it to qemu
>>>>
>>>> From here, qemu could pass this over the vhost-user channel,
>>>> thus making sure it's initialized with the correct
>>>> compatible interface.
>>>>
>>>> As version here is an opaque string for libvirt and qemu,
>>>> anything can be used - but I suggest either a list
>>>> of values defining the interface, e.g.
>>>> any_layout=on,max_ring=256
>>>> or a version including the name and vendor of the backend,
>>>> e.g. "org.dpdk.v4.5.6".
>>>>
>>>> Note that typically the list of supported versions can only be
>>>> extended, not shrunk. Also, if the host/guest interface
>>>> does not change, don't change the current version as
>>>> this just creates work for everyone.
>>>>
>>>> Thoughts? Would this work well for management? dpdk? vpp?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> MST

-- 
*Thomas F Herbert*
SDN Group
Office of Technology
*Red Hat*

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-17  9:49       ` [Qemu-devel] " Yuanhan Liu
@ 2016-11-17 17:37         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2016-11-17 17:37 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: Maxime Coquelin, dev, Stephen Hemminger, qemu-devel, libvir-list,
	vpp-dev, Marc-André Lureau

On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> > 
> > 
> > On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> > >As usaual, sorry for late response :/
> > >
> > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> > >>Hi!
> > >>So it looks like we face a problem with cross-version
> > >>migration when using vhost. It's not new but became more
> > >>acute with the advent of vhost user.
> > >>
> > >>For users to be able to migrate between different versions
> > >>of the hypervisor the interface exposed to guests
> > >>by hypervisor must stay unchanged.
> > >>
> > >>The problem is that a qemu device is connected
> > >>to a backend in another process, so the interface
> > >>exposed to guests depends on the capabilities of that
> > >>process.
> > >>
> > >>Specifically, for vhost user interface based on virtio, this includes
> > >>the "host features" bitmap that defines the interface, as well as more
> > >>host values such as the max ring size.  Adding new features/changing
> > >>values to this interface is required to make progress, but on the other
> > >>hand we need ability to get the old host features to be compatible.
> > >
> > >It looks like to the same issue of vhost-user reconnect to me. For example,
> > >
> > >- start dpdk 16.07 & qemu 2.5
> > >- kill dpdk
> > >- start dpdk 16.11
> > >
> > >Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
> > >above should work. Because qemu saves the negotiated features before the
> > >disconnect and stores it back after the reconnection.
> > >
> > >    commit a463215b087c41d7ca94e51aa347cde523831873
> > >    Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> > >    Date:   Mon Jun 6 18:45:05 2016 +0200
> > >
> > >        vhost-net: save & restore vhost-user acked features
> > >
> > >        The initial vhost-user connection sets the features to be negotiated
> > >        with the driver. Renegotiation isn't possible without device reset.
> > >
> > >        To handle reconnection of vhost-user backend, ensure the same set of
> > >        features are provided, and reuse already acked features.
> > >
> > >        Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > >
> > >
> > >So we could do similar to vhost-user? I mean, save the acked features
> > >before migration and store it back after it. This should be able to
> > >keep the compatibility. If user downgrades DPDK version, it also could
> > >be easily detected, and then exit with an error to user: migration
> > >failed due to un-compatible vhost features.
> > >
> > >Just some rough thoughts. Makes tiny sense?
> > 
> > My understanding is that the management tool has to know whether
> > versions are compatible before initiating the migration:
> 
> Makes sense. How about getting and restoring the acked features through
> qemu command lines then, say, through the monitor interface?
> 
> With that, it would be something like:
> 
> - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
> 
> - read the acked features (through monitor interface)
> 
> - start vhost-user backend in the dst host
> 
> - start qemu in the dst host with the just queried acked features
> 
>   QEMU then is expected to use this feature set for the later vhost-user
>   feature negotitation. Exit if features compatibility is broken.
> 
> Thoughts?
> 
> 	--yliu


You keep assuming that you have the VM started first and
figure things out afterwards, but this does not work.

Think about a cluster of machines. You want to start a VM in
a way that will ensure compatibility with all hosts
in the cluster.

If you don't, the guest-visible interface will change
and you won't be able to migrate.

It does not make sense to discuss feature bits specifically
since they are not the only part of the interface.
For example, the max ring size supported might change.


Let me describe how it works in qemu/libvirt.
When you install a VM, you can specify a compatibility
level (aka "machine type"), and you can query the supported compatibility
levels. Management uses that to find the supported compatibility
and stores the compatibility in XML that is migrated with the VM.
There's also a way to find the latest level, which is the
default unless overridden by the user. Again, this level
is recorded, and then
- management can make sure the migration destination is compatible
- management can avoid migration to hosts without that support


We absolutely can let QEMU be in control here, but what
is missing is the ability to query compatibility as above.
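
To make the management side concrete, here is a rough sketch of the
selection logic described above. It is only an illustration: the
function names, the host names and the "v1"/"v2"/"v3" level strings are
all hypothetical, and it assumes each host reports its supported levels
as a list ordered from oldest to newest.

```python
# Hypothetical sketch: none of these names exist in QEMU, libvirt or DPDK.

def common_levels(hosts):
    """Return the levels supported by every host, in the first host's order."""
    if not hosts:
        return []
    names = list(hosts)
    first = hosts[names[0]]
    return [lvl for lvl in first if all(lvl in hosts[h] for h in names[1:])]


def pick_level(hosts):
    """Pick the newest level shared by all hosts (lists assumed oldest first)."""
    levels = common_levels(hosts)
    if not levels:
        raise ValueError("no compatibility level is shared by all hosts")
    return levels[-1]


def can_migrate(vm_level, dst_levels):
    """A destination is acceptable only if it supports the VM's recorded level."""
    return vm_level in dst_levels


# Assumed example data: what each host in the cluster reports.
cluster = {
    "host-a": ["v1", "v2", "v3"],
    "host-b": ["v1", "v2"],
    "host-c": ["v1", "v2", "v3"],
}

vm_level = pick_level(cluster)          # recorded at VM creation
print(vm_level, can_migrate(vm_level, cluster["host-b"]))
```

The point is that the level is chosen before the VM starts and is then
carried around with it, so the destination check is a simple lookup.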



> >  1. The downtime could be unpredictable if a VM has to move from hosts
> >     to hosts multiple times, which is problematic, especially for NFV.
> >  2. If migration is not possible, maybe the management tool would
> >     prefer not to interrupt the VM on current host.
> > 
> > I have little experience with migration though, so I could be mistaken.
> > 
> > Thanks,
> > Maxime
> > 
> > >
> > >	--yliu
> > >>
> > >>To solve this problem within qemu, qemu has a versioning system based on
> > >>a machine type concept which fundamentally is a version string, by
> > >>specifying that string one can get hardware compatible with a previous
> > >>qemu version. QEMU also reports the latest version and list of versions
> > >>supported so libvirt records the version at VM creation and then is
> > >>careful to use this machine version whenever it migrates a VM.
> > >>
> > >>One might wonder how is this solved with a kernel vhost backend. The
> > >>answer is that it mostly isn't - instead an assumption is made, that
> > >>qemu versions are deployed together with the kernel - this is generally
> > >>true for downstreams.  Thus whenever qemu gains a new feature, it is
> > >>already supported by the kernel as well.  However, if one attempts
> > >>migration with a new qemu from a system with a new to old kernel, one
> > >>would get a failure.
> > >>
> > >>In the world where we have multiple userspace backends, with some of
> > >>these supplied by ISVs, this seems non-realistic.
> > >>
> > >>IMO we need to support vhost backend versioning, ideally
> > >>in a way that will also work for vhost kernel backends.
> > >>
> > >>So I'd like to get some input from both backend and management
> > >>developers on what a good solution would look like.
> > >>
> > >>If we want to emulate the qemu solution, this involves adding the
> > >>concept of interface versions to dpdk.  For example, dpdk could supply a
> > >>file (or utility printing?) with list of versions: latest and versions
> > >>supported. libvirt could read that and
> > >>- store latest version at vm creation
> > >>- pass it around with the vm
> > >>- pass it to qemu
> > >>
> > > >>From here, qemu could pass this over the vhost-user channel,
> > >>thus making sure it's initialized with the correct
> > >>compatible interface.
> > >>
> > >>As version here is an opaque string for libvirt and qemu,
> > >>anything can be used - but I suggest either a list
> > >>of values defining the interface, e.g.
> > >>any_layout=on,max_ring=256
> > >>or a version including the name and vendor of the backend,
> > >>e.g. "org.dpdk.v4.5.6".
> > >>
> > >>Note that typically the list of supported versions can only be
> > >>extended, not shrunk. Also, if the host/guest interface
> > >>does not change, don't change the current version as
> > >>this just creates work for everyone.
> > >>
> > >>Thoughts? Would this work well for management? dpdk? vpp?
> > >>
> > >>Thanks!
> > >>
> > >>--
> > >>MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-17 17:37         ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-11-22 13:02           ` Yuanhan Liu
  -1 siblings, 0 replies; 44+ messages in thread
From: Yuanhan Liu @ 2016-11-22 13:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Maxime Coquelin, dev, Stephen Hemminger, qemu-devel, libvir-list,
	vpp-dev, Marc-André Lureau

On Thu, Nov 17, 2016 at 07:37:09PM +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote:
> > On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> > > 
> > > 
> > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> > > >As usaual, sorry for late response :/
> > > >
> > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> > > >>Hi!
> > > >>So it looks like we face a problem with cross-version
> > > >>migration when using vhost. It's not new but became more
> > > >>acute with the advent of vhost user.
> > > >>
> > > >>For users to be able to migrate between different versions
> > > >>of the hypervisor the interface exposed to guests
> > > >>by hypervisor must stay unchanged.
> > > >>
> > > >>The problem is that a qemu device is connected
> > > >>to a backend in another process, so the interface
> > > >>exposed to guests depends on the capabilities of that
> > > >>process.
> > > >>
> > > >>Specifically, for vhost user interface based on virtio, this includes
> > > >>the "host features" bitmap that defines the interface, as well as more
> > > >>host values such as the max ring size.  Adding new features/changing
> > > >>values to this interface is required to make progress, but on the other
> > > >>hand we need ability to get the old host features to be compatible.
> > > >
> > > >It looks like to the same issue of vhost-user reconnect to me. For example,
> > > >
> > > >- start dpdk 16.07 & qemu 2.5
> > > >- kill dpdk
> > > >- start dpdk 16.11
> > > >
> > > >Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
> > > >above should work. Because qemu saves the negotiated features before the
> > > >disconnect and stores it back after the reconnection.
> > > >
> > > >    commit a463215b087c41d7ca94e51aa347cde523831873
> > > >    Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > >    Date:   Mon Jun 6 18:45:05 2016 +0200
> > > >
> > > >        vhost-net: save & restore vhost-user acked features
> > > >
> > > >        The initial vhost-user connection sets the features to be negotiated
> > > >        with the driver. Renegotiation isn't possible without device reset.
> > > >
> > > >        To handle reconnection of vhost-user backend, ensure the same set of
> > > >        features are provided, and reuse already acked features.
> > > >
> > > >        Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > >
> > > >
> > > >So we could do similar to vhost-user? I mean, save the acked features
> > > >before migration and store it back after it. This should be able to
> > > >keep the compatibility. If user downgrades DPDK version, it also could
> > > >be easily detected, and then exit with an error to user: migration
> > > >failed due to un-compatible vhost features.
> > > >
> > > >Just some rough thoughts. Makes tiny sense?
> > > 
> > > My understanding is that the management tool has to know whether
> > > versions are compatible before initiating the migration:
> > 
> > Makes sense. How about getting and restoring the acked features through
> > qemu command lines then, say, through the monitor interface?
> > 
> > With that, it would be something like:
> > 
> > - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
> > 
> > - read the acked features (through monitor interface)
> > 
> > - start vhost-user backend in the dst host
> > 
> > - start qemu in the dst host with the just queried acked features
> > 
> >   QEMU then is expected to use this feature set for the later vhost-user
> >   feature negotitation. Exit if features compatibility is broken.
> > 
> > Thoughts?
> > 
> > 	--yliu
> 
> 
> You keep assuming that you have the VM started first and
> figure out things afterwards, but this does not work.
> 
> Think about a cluster of machines. You want to start a VM in
> a way that will ensure compatibility with all hosts
> in a cluster.

I see. I was thinking more about the case where the dst
host (including the qemu and dpdk combo) is given, and
we then determine whether the migration will succeed
or not.

And you are saying that we need to know which hosts could
be good candidates before starting the migration. In that
case, we indeed need some input from both qemu and the
vhost-user backend.

For DPDK, I think it could be simple: just as you said, it
could be either a tiny script, or even a macro defined in
a source file (we extend it every time we add a
new feature) for libvirt to read. Or something
else.
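
As a rough sketch of what reading such a string could look like, the
snippet below parses the "any_layout=on,max_ring=256" style of
interface description suggested earlier in the thread. The function
names are hypothetical and the exact format is an assumption; nothing
like this exists in DPDK or libvirt today.

```python
# Hypothetical sketch: the "key=value,key=value" format follows the example
# given earlier in this thread; the helper names are made up for illustration.

def parse_interface(text):
    """Parse 'key=value,key=value' into a dict of strings."""
    result = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or empty input
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % item)
        result[key.strip()] = value.strip()
    return result


def same_interface(a, b):
    """Two descriptions match if they define the same keys with the same values."""
    return parse_interface(a) == parse_interface(b)


print(parse_interface("any_layout=on,max_ring=256"))
```

Since the string stays opaque to libvirt and qemu, only the backend (or
a helper shipped with it) would need to understand the keys.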

> If you don't, guest visible interface will change
> and you won't be able to migrate.
> 
> It does not make sense to discuss feature bits specifically
> since that is not the only part of interface.
> For example, max ring size supported might change.

I don't quite understand why we have to consider the max ring
size here. Isn't it a virtio device attribute, for which QEMU could
provide such compatibility information?

I mean, DPDK is supposed to support varying vring sizes; it's up to QEMU
to give a specific value.

> Let me describe how it works in qemu/libvirt.
> When you install a VM, you can specify compatibility
> level (aka "machine type"), and you can query the supported compatibility
> levels. Management uses that to find the supported compatibility
> and stores the compatibility in XML that is migrated with the VM.
> There's also a way to find the latest level which is the
> default unless overridden by user, again this level
> is recorded and then
> - management can make sure migration destination is compatible
> - management can avoid migration to hosts without that support

Thanks for the info, it helps.

...
> > > >>As version here is an opaque string for libvirt and qemu,
> > > >>anything can be used - but I suggest either a list
> > > >>of values defining the interface, e.g.
> > > >>any_layout=on,max_ring=256
> > > >>or a version including the name and vendor of the backend,
> > > >>e.g. "org.dpdk.v4.5.6".

The version scheme may not be ideal here. Assume a QEMU is supposed
to work with a specific DPDK version. However, the user may disable some
newer features through the qemu command line, so that it could also work
with an older DPDK version. Using the version scheme would not allow us to
do such a migration to an older DPDK version. MTU is a good example
here: when the MTU feature is provided by QEMU but is actually disabled
by the user, it could also work with an older DPDK without MTU support.
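
A minimal sketch of the feature-based check I have in mind, as opposed
to comparing version strings. The helper names are hypothetical, and
the bit positions are the ones I recall from the virtio spec, so treat
them as illustrative. The feature sets assigned to the "old" and "new"
DPDK are assumptions made up for the example.

```python
# Hypothetical sketch: bit positions are illustrative, and the feature sets
# attributed to each DPDK build are invented for this example.
F_MTU = 1 << 3
F_ANY_LAYOUT = 1 << 27
F_INDIRECT_DESC = 1 << 28


def missing_features(required, backend):
    """Bits the VM needs that the backend does not offer."""
    return required & ~backend


def compatible(required, backend):
    """The backend is usable if it offers every feature the VM was started with."""
    return missing_features(required, backend) == 0


old_dpdk = F_ANY_LAYOUT | F_INDIRECT_DESC   # assumed: no MTU support
new_dpdk = old_dpdk | F_MTU                 # assumed: adds MTU

vm_mtu_off = F_ANY_LAYOUT | F_INDIRECT_DESC  # user disabled MTU on the qemu side
vm_mtu_on = vm_mtu_off | F_MTU

print(compatible(vm_mtu_off, old_dpdk), compatible(vm_mtu_on, old_dpdk))
```

With a plain version string, the VM with MTU disabled would still be
tied to the newer DPDK, even though the older one offers everything it
actually uses.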

	--yliu

> > > >>
> > > >>Note that typically the list of supported versions can only be
> > > >>extended, not shrunk. Also, if the host/guest interface
> > > >>does not change, don't change the current version as
> > > >>this just creates work for everyone.
> > > >>
> > > >>Thoughts? Would this work well for management? dpdk? vpp?
> > > >>
> > > >>Thanks!
> > > >>
> > > >>--
> > > >>MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-22 13:02           ` [Qemu-devel] " Yuanhan Liu
@ 2016-11-22 14:53             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2016-11-22 14:53 UTC (permalink / raw)
  To: Yuanhan Liu
  Cc: Maxime Coquelin, dev, Stephen Hemminger, qemu-devel, libvir-list,
	vpp-dev, Marc-André Lureau

On Tue, Nov 22, 2016 at 09:02:23PM +0800, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 07:37:09PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote:
> > > On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> > > > 
> > > > 
> > > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> > > > >As usaual, sorry for late response :/
> > > > >
> > > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> > > > >>Hi!
> > > > >>So it looks like we face a problem with cross-version
> > > > >>migration when using vhost. It's not new but became more
> > > > >>acute with the advent of vhost user.
> > > > >>
> > > > >>For users to be able to migrate between different versions
> > > > >>of the hypervisor the interface exposed to guests
> > > > >>by hypervisor must stay unchanged.
> > > > >>
> > > > >>The problem is that a qemu device is connected
> > > > >>to a backend in another process, so the interface
> > > > >>exposed to guests depends on the capabilities of that
> > > > >>process.
> > > > >>
> > > > >>Specifically, for vhost user interface based on virtio, this includes
> > > > >>the "host features" bitmap that defines the interface, as well as more
> > > > >>host values such as the max ring size.  Adding new features/changing
> > > > >>values to this interface is required to make progress, but on the other
> > > > >>hand we need ability to get the old host features to be compatible.
> > > > >
> > > > >It looks like to the same issue of vhost-user reconnect to me. For example,
> > > > >
> > > > >- start dpdk 16.07 & qemu 2.5
> > > > >- kill dpdk
> > > > >- start dpdk 16.11
> > > > >
> > > > >Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect),
> > > > >above should work. Because qemu saves the negotiated features before the
> > > > >disconnect and stores it back after the reconnection.
> > > > >
> > > > >    commit a463215b087c41d7ca94e51aa347cde523831873
> > > > >    Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > > >    Date:   Mon Jun 6 18:45:05 2016 +0200
> > > > >
> > > > >        vhost-net: save & restore vhost-user acked features
> > > > >
> > > > >        The initial vhost-user connection sets the features to be negotiated
> > > > >        with the driver. Renegotiation isn't possible without device reset.
> > > > >
> > > > >        To handle reconnection of vhost-user backend, ensure the same set of
> > > > >        features are provided, and reuse already acked features.
> > > > >
> > > > >        Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > > >
> > > > >
> > > > >So we could do similar to vhost-user? I mean, save the acked features
> > > > >before migration and store it back after it. This should be able to
> > > > >keep the compatibility. If user downgrades DPDK version, it also could
> > > > >be easily detected, and then exit with an error to user: migration
> > > > >failed due to un-compatible vhost features.
> > > > >
> > > > >Just some rough thoughts. Makes tiny sense?
> > > > 
> > > > My understanding is that the management tool has to know whether
> > > > versions are compatible before initiating the migration:
> > > 
> > > Makes sense. How about getting and restoring the acked features through
> > > qemu command lines then, say, through the monitor interface?
> > > 
> > > With that, it would be something like:
> > > 
> > > - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
> > > 
> > > - read the acked features (through monitor interface)
> > > 
> > > - start vhost-user backend in the dst host
> > > 
> > > - start qemu in the dst host with the just queried acked features
> > > 
> > >   QEMU then is expected to use this feature set for the later vhost-user
> > >   feature negotitation. Exit if features compatibility is broken.
> > > 
> > > Thoughts?
> > > 
> > > 	--yliu
> > 
> > 
> > You keep assuming that you have the VM started first and
> > figure out things afterwards, but this does not work.
> > 
> > Think about a cluster of machines. You want to start a VM in
> > a way that will ensure compatibility with all hosts
> > in a cluster.
> 
> I see. I was more considering about the case when the dst
> host (including the qemu and dpdk combo) is given, and
> then determine whether it will be a successfull migration
> or not.
> 
> And you are asking that we need to know which host could
> be a good candidate before starting the migration. In such
> case, we indeed need some inputs from both the qemu and
> vhost-user backend.
> 
> For DPDK, I think it could be simple, just as you said, it
> could be either a tiny script, or even a macro defined in
> the source code file (we extend it every time we add a
> new feature) to let the libvirt to read it. Or something
> else.

There's the issue of APIs that tweak features, as Maxime
suggested. Maybe the only thing to do is to deprecate them,
but I feel some way for the application to pass info into
the guest might be beneficial.


> > If you don't, guest visible interface will change
> > and you won't be able to migrate.
> > 
> > It does not make sense to discuss feature bits specifically
> > since that is not the only part of interface.
> > For example, max ring size supported might change.
> 
> I don't quite understand why we have to consider the max ring
> size here? Isn't it a virtio device attribute, that QEMU could
> provide such compatibility information?
>
> I mean, DPDK is supposed to support vary vring size, it's QEMU
> to give a specifc value.

If the backend supports s/g lists of any size up to 2^16, there's no issue.

ATM some backends might be assuming up to 1K s/g entries since
QEMU never supported bigger ones. We might classify this
as a bug, or we might not and add a feature flag instead.

But it's just an example. There might be more values at issue
in the future.

> > Let me describe how it works in qemu/libvirt.
> > When you install a VM, you can specify compatibility
> > level (aka "machine type"), and you can query the supported compatibility
> > levels. Management uses that to find the supported compatibility
> > and stores the compatibility in XML that is migrated with the VM.
> > There's also a way to find the latest level which is the
> > default unless overridden by user, again this level
> > is recorded and then
> > - management can make sure migration destination is compatible
> > - management can avoid migration to hosts without that support
> 
> Thanks for the info, it helps.
> 
> ...
> > > > >>As version here is an opaque string for libvirt and qemu,
> > > > >>anything can be used - but I suggest either a list
> > > > >>of values defining the interface, e.g.
> > > > >>any_layout=on,max_ring=256
> > > > >>or a version including the name and vendor of the backend,
> > > > >>e.g. "org.dpdk.v4.5.6".
> 
> The version scheme may not be ideal here. Assume a QEMU is supposed
> to work with a specific DPDK version, however, user may disable some
> newer features through qemu command line, that it also could work with
> an elder DPDK version. Using the version scheme will not allow us doing
> such migration to an elder DPDK version. The MTU is a lively example
> here? (when MTU feature is provided by QEMU but is actually disabled
> by user, that it could also work with an elder DPDK without MTU support).
> 
> 	--yliu

OK, so does a list of values look better to you then?



> > > > >>
> > > > >>Note that typically the list of supported versions can only be
> > > > >>extended, not shrunk. Also, if the host/guest interface
> > > > >>does not change, don't change the current version as
> > > > >>this just creates work for everyone.
> > > > >>
> > > > >>Thoughts? Would this work well for management? dpdk? vpp?
> > > > >>
> > > > >>Thanks!
> > > > >>
> > > > >>--
> > > > >>MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-22 14:53             ` [Qemu-devel] " Michael S. Tsirkin
@ 2016-11-24  6:31               ` Yuanhan Liu
  -1 siblings, 0 replies; 44+ messages in thread
From: Yuanhan Liu @ 2016-11-24  6:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Maxime Coquelin, dev, Stephen Hemminger, qemu-devel, libvir-list,
	vpp-dev, Marc-André Lureau

On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
> > > You keep assuming that you have the VM started first and
> > > figure out things afterwards, but this does not work.
> > > 
> > > Think about a cluster of machines. You want to start a VM in
> > > a way that will ensure compatibility with all hosts
> > > in a cluster.
> > 
> > I see. I was more considering about the case when the dst
> > host (including the qemu and dpdk combo) is given, and
> > then determine whether it will be a successfull migration
> > or not.
> > 
> > And you are asking that we need to know which host could
> > be a good candidate before starting the migration. In such
> > case, we indeed need some inputs from both the qemu and
> > vhost-user backend.
> > 
> > For DPDK, I think it could be simple, just as you said, it
> > could be either a tiny script, or even a macro defined in
> > the source code file (we extend it every time we add a
> > new feature) to let the libvirt to read it. Or something
> > else.
> 
> There's the issue of APIs that tweak features as Maxime
> suggested.

Yes, it's a good point.

> Maybe the only thing to do is to deprecate it,

Looks like so.

> but I feel some way for application to pass info into
> guest might be benefitial.

The two APIs are just for tweaking the feature bits DPDK supports before
any device gets connected. It's another way to disable some features
(the other obvious way is through the QEMU command line).

IMO, it's only a bit handy in a case like this: we have a bunch of VMs.
Instead of disabling something through qemu one by one, we could disable
it once in DPDK.

But I doubt its usefulness. It's only used in DPDK's vhost example
after all. It is used neither in the vhost pmd nor in OVS.

> > > If you don't, guest visible interface will change
> > > and you won't be able to migrate.
> > > 
> > > It does not make sense to discuss feature bits specifically
> > > since that is not the only part of interface.
> > > For example, max ring size supported might change.
> > 
> > I don't quite understand why we have to consider the max ring
> > size here? Isn't it a virtio device attribute, that QEMU could
> > provide such compatibility information?
> >
> > I mean, DPDK is supposed to support vary vring size, it's QEMU
> > to give a specifc value.
> 
> If backend supports s/g of any size up to 2^16, there's no issue.

I don't know about others, but I see no issues in DPDK.

> ATM some backends might be assuming up to 1K s/g since
> QEMU never supported bigger ones. We might classify this
> as a bug, or not and add a feature flag.
> 
> But it's just an example. There might be more values at issue
> in the future.

Yeah, maybe. But we could analyze them one by one.

> > > Let me describe how it works in qemu/libvirt.
> > > When you install a VM, you can specify compatibility
> > > level (aka "machine type"), and you can query the supported compatibility
> > > levels. Management uses that to find the supported compatibility
> > > and stores the compatibility in XML that is migrated with the VM.
> > > There's also a way to find the latest level which is the
> > > default unless overridden by user, again this level
> > > is recorded and then
> > > - management can make sure migration destination is compatible
> > > - management can avoid migration to hosts without that support
> > 
> > Thanks for the info, it helps.
> > 
> > ...
> > > > > >>As version here is an opaque string for libvirt and qemu,
> > > > > >>anything can be used - but I suggest either a list
> > > > > >>of values defining the interface, e.g.
> > > > > >>any_layout=on,max_ring=256
> > > > > >>or a version including the name and vendor of the backend,
> > > > > >>e.g. "org.dpdk.v4.5.6".
> > 
> > The version scheme may not be ideal here. Assume a QEMU is supposed
> > to work with a specific DPDK version, however, user may disable some
> > newer features through qemu command line, that it also could work with
> > an elder DPDK version. Using the version scheme will not allow us doing
> > such migration to an elder DPDK version. The MTU is a lively example
> > here? (when MTU feature is provided by QEMU but is actually disabled
> > by user, that it could also work with an elder DPDK without MTU support).
> > 
> > 	--yliu
> 
> OK, so does a list of values look better to you then?

Yes, if there is no better way.

And I think it may be better not to list all those features literally.
Instead, using the number should be better, say, features=0xdeadbeef.

Listing the feature names means that all components involved here
(QEMU, libvirt, DPDK, VPP, and maybe more backends) have to agree on
the exact same feature names. Though it may not be a big deal, it
lacks some flexibility.

A feature bitmask will not have this issue.
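
A minimal sketch of how a management tool might consume such a number.
The "features=" key and both function names are assumptions made for
illustration, not an agreed format or an existing API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a hypothetical "features=0x..." string into a 64-bit mask.
 * Returns true on success and stores the mask in *out. */
bool parse_features(const char *s, uint64_t *out)
{
    static const char key[] = "features=";
    char *end;
    unsigned long long v;

    assert(s != NULL && out != NULL);
    if (strncmp(s, key, sizeof(key) - 1) != 0)
        return false;
    s += sizeof(key) - 1;
    if (*s == '\0')
        return false;
    errno = 0;
    v = strtoull(s, &end, 16);   /* base 16 accepts an optional 0x prefix */
    if (errno != 0 || end == s || *end != '\0')
        return false;
    *out = (uint64_t)v;
    return true;
}

/* Bits the guest relies on that the destination lacks; 0 means compatible. */
uint64_t missing_features(uint64_t required, uint64_t offered)
{
    return required & ~offered;
}
```

No component has to know a feature's name to do this comparison; only
the bit positions, which the virtio spec already fixes, matter.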

	--yliu

> 
> 
> > > > > >>
> > > > > >>Note that typically the list of supported versions can only be
> > > > > >>extended, not shrunk. Also, if the host/guest interface
> > > > > >>does not change, don't change the current version as
> > > > > >>this just creates work for everyone.
> > > > > >>
> > > > > >>Thoughts? Would this work well for management? dpdk? vpp?
> > > > > >>
> > > > > >>Thanks!
> > > > > >>
> > > > > >>--
> > > > > >>MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24  6:31               ` [Qemu-devel] " Yuanhan Liu
@ 2016-11-24  9:30                 ` Kevin Traynor
  -1 siblings, 0 replies; 44+ messages in thread
From: Kevin Traynor @ 2016-11-24  9:30 UTC (permalink / raw)
  To: Yuanhan Liu, Michael S. Tsirkin
  Cc: Maxime Coquelin, dev, Stephen Hemminger, qemu-devel, libvir-list,
	vpp-dev, Marc-André Lureau

On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
> On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>>>> You keep assuming that you have the VM started first and
>>>> figure out things afterwards, but this does not work.
>>>>
>>>> Think about a cluster of machines. You want to start a VM in
>>>> a way that will ensure compatibility with all hosts
>>>> in a cluster.
>>>
>>> I see. I was more considering about the case when the dst
>>> host (including the qemu and dpdk combo) is given, and
>>> then determine whether it will be a successfull migration
>>> or not.
>>>
>>> And you are asking that we need to know which host could
>>> be a good candidate before starting the migration. In such
>>> case, we indeed need some inputs from both the qemu and
>>> vhost-user backend.
>>>
>>> For DPDK, I think it could be simple, just as you said, it
>>> could be either a tiny script, or even a macro defined in
>>> the source code file (we extend it every time we add a
>>> new feature) to let the libvirt to read it. Or something
>>> else.
>>
>> There's the issue of APIs that tweak features as Maxime
>> suggested.
> 
> Yes, it's a good point.
> 
>> Maybe the only thing to do is to deprecate it,
> 
> Looks like so.
> 
>> but I feel some way for application to pass info into
>> guest might be benefitial.
> 
> The two APIs are just for tweaking feature bits DPDK supports before
> any device got connected. It's another way to disable some features
> (the another obvious way is to through QEMU command lines).
> 
> IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
> of disabling something though qemu one by one, we could disable it
> once in DPDK.
> 
> But I doubt the useful of it. It's only used in DPDK's vhost example
> after all. Nor is it used in vhost pmd, neither is it used in OVS.

rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c

netdev_dpdk_vhost_class_init(void)
{
    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;

    /* This function can be called for different classes.  The
     * initialization needs to be done only once */
    if (ovsthread_once_start(&once)) {
        rte_vhost_driver_callback_register(&virtio_net_device_ops);
        rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
                                  | 1ULL << VIRTIO_NET_F_HOST_TSO6
                                  | 1ULL << VIRTIO_NET_F_CSUM);
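
For reference, a small standalone sketch of what that mask works out to
and how it would shrink the feature set a guest sees. The bit numbers
are the ones from the virtio spec; the helper names are made up for
illustration and are not DPDK or OVS functions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit numbers as defined by the virtio specification. */
#define VIRTIO_NET_F_CSUM       0
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12

/* Turn a feature bit number into a 64-bit mask. */
uint64_t feature_bit(unsigned int n)
{
    assert(n < 64);   /* the feature word is 64 bits wide */
    return 1ULL << n;
}

/* The same mask as in the OVS excerpt above. */
uint64_t ovs_disabled_mask(void)
{
    return feature_bit(VIRTIO_NET_F_HOST_TSO4)
         | feature_bit(VIRTIO_NET_F_HOST_TSO6)
         | feature_bit(VIRTIO_NET_F_CSUM);
}

/* Features still offered: what the backend supports minus the mask. */
uint64_t offered_features(uint64_t supported, uint64_t disabled)
{
    return supported & ~disabled;
}
```

The mask is 0x1801, so two hosts running the same DPDK version can still
offer different features depending on what the application disabled.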

> 
>>>> If you don't, guest visible interface will change
>>>> and you won't be able to migrate.
>>>>
>>>> It does not make sense to discuss feature bits specifically
>>>> since that is not the only part of interface.
>>>> For example, max ring size supported might change.
>>>
>>> I don't quite understand why we have to consider the max ring
>>> size here? Isn't it a virtio device attribute, that QEMU could
>>> provide such compatibility information?
>>>
>>> I mean, DPDK is supposed to support vary vring size, it's QEMU
>>> to give a specifc value.
>>
>> If backend supports s/g of any size up to 2^16, there's no issue.
> 
> I don't know others, but I see no issues in DPDK.
> 
>> ATM some backends might be assuming up to 1K s/g since
>> QEMU never supported bigger ones. We might classify this
>> as a bug, or not and add a feature flag.
>>
>> But it's just an example. There might be more values at issue
>> in the future.
> 
> Yeah, maybe. But we could analysis it one by one.
> 
>>>> Let me describe how it works in qemu/libvirt.
>>>> When you install a VM, you can specify compatibility
>>>> level (aka "machine type"), and you can query the supported compatibility
>>>> levels. Management uses that to find the supported compatibility
>>>> and stores the compatibility in XML that is migrated with the VM.
>>>> There's also a way to find the latest level which is the
>>>> default unless overridden by user, again this level
>>>> is recorded and then
>>>> - management can make sure migration destination is compatible
>>>> - management can avoid migration to hosts without that support
>>>
>>> Thanks for the info, it helps.
>>>
>>> ...
>>>>>>>> As version here is an opaque string for libvirt and qemu,
>>>>>>>> anything can be used - but I suggest either a list
>>>>>>>> of values defining the interface, e.g.
>>>>>>>> any_layout=on,max_ring=256
>>>>>>>> or a version including the name and vendor of the backend,
>>>>>>>> e.g. "org.dpdk.v4.5.6".
>>>
>>> The version scheme may not be ideal here. Assume a QEMU is supposed
>>> to work with a specific DPDK version, however, user may disable some
>>> newer features through qemu command line, that it also could work with
>>> an elder DPDK version. Using the version scheme will not allow us doing
>>> such migration to an elder DPDK version. The MTU is a lively example
>>> here? (when MTU feature is provided by QEMU but is actually disabled
>>> by user, that it could also work with an elder DPDK without MTU support).
>>>
>>> 	--yliu
>>
>> OK, so does a list of values look better to you then?
> 
> Yes, if there are no better way.
> 
> And I think it may be better to not list all those features, literally.
> But instead, using the number should be better, say, features=0xdeadbeef.
> 
> Listing the feature names means we have to come to an agreement in all
> components involved here (QEMU, libvirt, DPDK, VPP, and maybe more
> backends), that we have to use the exact same feature names. Though it
> may not be a big deal, it lacks some flexibility.
> 
> A feature bits will not have this issue.
> 
> 	--yliu
> 
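
A numeric feature string makes the destination check a plain bit test. This
is a sketch under assumptions, not an agreed format: the "features=" key is
taken from the example above, while parse_features() and dst_compatible()
are hypothetical helpers that no component in this thread provides.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a string of the form "features=0x...".
 * Returns false if the key is missing or the value is not valid hex.
 * Overflow beyond 64 bits is not checked in this sketch. */
static bool parse_features(const char *s, uint64_t *out)
{
    static const char key[] = "features=";
    char *end;

    if (strncmp(s, key, sizeof key - 1) != 0)
        return false;
    s += sizeof key - 1;
    if (*s == '\0')
        return false;

    *out = strtoull(s, &end, 16);
    return *end == '\0';
}

/* The destination is usable if it offers every bit the VM was
 * created with; extra bits on the destination are harmless. */
static bool dst_compatible(uint64_t vm_features, uint64_t dst_supported)
{
    return (vm_features & ~dst_supported) == 0;
}
```

Nothing here depends on feature names, so QEMU, libvirt and the backends
would only have to agree on the bit numbering they already share.
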
>>
>>
>>>>>>>>
>>>>>>>> Note that typically the list of supported versions can only be
>>>>>>>> extended, not shrunk. Also, if the host/guest interface
>>>>>>>> does not change, don't change the current version as
>>>>>>>> this just creates work for everyone.
>>>>>>>>
>>>>>>>> Thoughts? Would this work well for management? dpdk? vpp?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> --
>>>>>>>> MST

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24  9:30                 ` [Qemu-devel] [dpdk-dev] " Kevin Traynor
@ 2016-11-24 12:33                   ` Yuanhan Liu
  -1 siblings, 0 replies; 44+ messages in thread
From: Yuanhan Liu @ 2016-11-24 12:33 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: Michael S. Tsirkin, Maxime Coquelin, dev, Stephen Hemminger,
	qemu-devel, libvir-list, vpp-dev, Marc-André Lureau

On Thu, Nov 24, 2016 at 09:30:49AM +0000, Kevin Traynor wrote:
> On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
> > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
> >>>> You keep assuming that you have the VM started first and
> >>>> figure out things afterwards, but this does not work.
> >>>>
> >>>> Think about a cluster of machines. You want to start a VM in
> >>>> a way that will ensure compatibility with all hosts
> >>>> in a cluster.
> >>>
> >>> I see. I was more considering about the case when the dst
> >>> host (including the qemu and dpdk combo) is given, and
> >>> then determine whether it will be a successfull migration
> >>> or not.
> >>>
> >>> And you are asking that we need to know which host could
> >>> be a good candidate before starting the migration. In such
> >>> case, we indeed need some inputs from both the qemu and
> >>> vhost-user backend.
> >>>
> >>> For DPDK, I think it could be simple, just as you said, it
> >>> could be either a tiny script, or even a macro defined in
> >>> the source code file (we extend it every time we add a
> >>> new feature) to let the libvirt to read it. Or something
> >>> else.
> >>
> >> There's the issue of APIs that tweak features as Maxime
> >> suggested.
> > 
> > Yes, it's a good point.
> > 
> >> Maybe the only thing to do is to deprecate it,
> > 
> > Looks like so.
> > 
> >> but I feel some way for application to pass info into
> >> guest might be benefitial.
> > 
> > The two APIs are just for tweaking feature bits DPDK supports before
> > any device got connected. It's another way to disable some features
> > (the another obvious way is to through QEMU command lines).
> > 
> > IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
> > of disabling something though qemu one by one, we could disable it
> > once in DPDK.
> > 
> > But I doubt the useful of it. It's only used in DPDK's vhost example
> > after all. Nor is it used in vhost pmd, neither is it used in OVS.
> 
> rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c

Hmmm. I must have checked very old code ...
> 
> netdev_dpdk_vhost_class_init(void)
> {
>     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> 
>     /* This function can be called for different classes.  The
>      * initialization needs to be done only once */
>     if (ovsthread_once_start(&once)) {
>         rte_vhost_driver_callback_register(&virtio_net_device_ops);
>         rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>                                   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>                                   | 1ULL << VIRTIO_NET_F_CSUM);

I saw the commit that introduced this change, but its message gives no
reason why it was added.

commit 362ca39639ae871806be5ae97d55e1cbb14afd92
Author: mweglicx <michalx.weglicki@intel.com>
Date:   Thu Apr 14 17:40:06 2016 +0100

    Update relevant artifacts to add support for DPDK 16.04.

    Following changes are applied:
     - INSTALL.DPDK.md: CONFIG_RTE_BUILD_COMBINE_LIBS step has been
       removed because it is no longer present in DPDK configuration
       (combined library is created by default),
     - INSTALL.DPDK.md: VHost Cuse configuration is updated,
     - netdev-dpdk.c: Link speed definition is changed in DPDK and
       netdev_dpdk_get_features is updated accordingly,
     - netdev-dpdk.c: TSO and checksum offload has been disabled for
       vhostuser device.
     - .travis/linux-build.sh: DPDK version is updated and legacy
       flags have been removed in configuration.

    Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
    Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
    Acked-by: Daniele Di Proietto <diproiettod@vmware.com>

	--yliu

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24 12:33                   ` [Qemu-devel] [dpdk-dev] " Yuanhan Liu
@ 2016-11-24 12:47                     ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-11-24 12:47 UTC (permalink / raw)
  To: Yuanhan Liu, Kevin Traynor
  Cc: Michael S. Tsirkin, dev, Stephen Hemminger, qemu-devel,
	libvir-list, vpp-dev, Marc-André Lureau



On 11/24/2016 01:33 PM, Yuanhan Liu wrote:
> On Thu, Nov 24, 2016 at 09:30:49AM +0000, Kevin Traynor wrote:
>> > On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
>>> > > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>>>>>> > >>>> You keep assuming that you have the VM started first and
>>>>>> > >>>> figure out things afterwards, but this does not work.
>>>>>> > >>>>
>>>>>> > >>>> Think about a cluster of machines. You want to start a VM in
>>>>>> > >>>> a way that will ensure compatibility with all hosts
>>>>>> > >>>> in a cluster.
>>>>> > >>>
>>>>> > >>> I see. I was more considering about the case when the dst
>>>>> > >>> host (including the qemu and dpdk combo) is given, and
>>>>> > >>> then determine whether it will be a successfull migration
>>>>> > >>> or not.
>>>>> > >>>
>>>>> > >>> And you are asking that we need to know which host could
>>>>> > >>> be a good candidate before starting the migration. In such
>>>>> > >>> case, we indeed need some inputs from both the qemu and
>>>>> > >>> vhost-user backend.
>>>>> > >>>
>>>>> > >>> For DPDK, I think it could be simple, just as you said, it
>>>>> > >>> could be either a tiny script, or even a macro defined in
>>>>> > >>> the source code file (we extend it every time we add a
>>>>> > >>> new feature) to let the libvirt to read it. Or something
>>>>> > >>> else.
>>>> > >>
>>>> > >> There's the issue of APIs that tweak features as Maxime
>>>> > >> suggested.
>>> > >
>>> > > Yes, it's a good point.
>>> > >
>>>> > >> Maybe the only thing to do is to deprecate it,
>>> > >
>>> > > Looks like so.
>>> > >
>>>> > >> but I feel some way for application to pass info into
>>>> > >> guest might be benefitial.
>>> > >
>>> > > The two APIs are just for tweaking feature bits DPDK supports before
>>> > > any device got connected. It's another way to disable some features
>>> > > (the another obvious way is to through QEMU command lines).
>>> > >
>>> > > IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
>>> > > of disabling something though qemu one by one, we could disable it
>>> > > once in DPDK.
>>> > >
>>> > > But I doubt the useful of it. It's only used in DPDK's vhost example
>>> > > after all. Nor is it used in vhost pmd, neither is it used in OVS.
>> >
>> > rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c
> Hmmm. I must have checked very old code ...
>> >
>> > netdev_dpdk_vhost_class_init(void)
>> > {
>> >     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>> >
>> >     /* This function can be called for different classes.  The
>> >      * initialization needs to be done only once */
>> >     if (ovsthread_once_start(&once)) {
>> >         rte_vhost_driver_callback_register(&virtio_net_device_ops);
>> >         rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>> >                                   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>> >                                   | 1ULL << VIRTIO_NET_F_CSUM);
> I saw the commit introduced such change, but it tells no reason why
> it was added.

I'm also interested to know the reason.
In any case, I think this is something that can/should be managed by
the management tool, which should disable it in the command-line parameters.

Kevin, do you agree?

Cheers,
Maxime

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24 12:47                     ` [Qemu-devel] [dpdk-dev] " Maxime Coquelin
@ 2016-11-24 15:01                       ` Kevin Traynor
  -1 siblings, 0 replies; 44+ messages in thread
From: Kevin Traynor @ 2016-11-24 15:01 UTC (permalink / raw)
  To: Maxime Coquelin, Yuanhan Liu, Kavanagh, Mark B, michalx.weglicki
  Cc: Michael S. Tsirkin, dev, Stephen Hemminger, qemu-devel,
	libvir-list, vpp-dev, Marc-André Lureau

On 11/24/2016 12:47 PM, Maxime Coquelin wrote:
> 
> 
> On 11/24/2016 01:33 PM, Yuanhan Liu wrote:
>> On Thu, Nov 24, 2016 at 09:30:49AM +0000, Kevin Traynor wrote:
>>> > On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
>>>> > > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>>>>>>> > >>>> You keep assuming that you have the VM started first and
>>>>>>> > >>>> figure out things afterwards, but this does not work.
>>>>>>> > >>>>
>>>>>>> > >>>> Think about a cluster of machines. You want to start a VM in
>>>>>>> > >>>> a way that will ensure compatibility with all hosts
>>>>>>> > >>>> in a cluster.
>>>>>> > >>>
>>>>>> > >>> I see. I was more considering about the case when the dst
>>>>>> > >>> host (including the qemu and dpdk combo) is given, and
>>>>>> > >>> then determine whether it will be a successfull migration
>>>>>> > >>> or not.
>>>>>> > >>>
>>>>>> > >>> And you are asking that we need to know which host could
>>>>>> > >>> be a good candidate before starting the migration. In such
>>>>>> > >>> case, we indeed need some inputs from both the qemu and
>>>>>> > >>> vhost-user backend.
>>>>>> > >>>
>>>>>> > >>> For DPDK, I think it could be simple, just as you said, it
>>>>>> > >>> could be either a tiny script, or even a macro defined in
>>>>>> > >>> the source code file (we extend it every time we add a
>>>>>> > >>> new feature) to let the libvirt to read it. Or something
>>>>>> > >>> else.
>>>>> > >>
>>>>> > >> There's the issue of APIs that tweak features as Maxime
>>>>> > >> suggested.
>>>> > >
>>>> > > Yes, it's a good point.
>>>> > >
>>>>> > >> Maybe the only thing to do is to deprecate it,
>>>> > >
>>>> > > Looks like so.
>>>> > >
>>>>> > >> but I feel some way for application to pass info into
>>>>> > >> guest might be benefitial.
>>>> > >
>>>> > > The two APIs are just for tweaking feature bits DPDK supports before
>>>> > > any device got connected. It's another way to disable some features
>>>> > > (the another obvious way is to through QEMU command lines).
>>>> > >
>>>> > > IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
>>>> > > of disabling something though qemu one by one, we could disable it
>>>> > > once in DPDK.
>>>> > >
>>>> > > But I doubt the useful of it. It's only used in DPDK's vhost example
>>>> > > after all. Nor is it used in vhost pmd, neither is it used in OVS.
>>> >
>>> > rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c
>> Hmmm. I must have checked very old code ...
>>> >
>>> > netdev_dpdk_vhost_class_init(void)
>>> > {
>>> >     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>>> >
>>> >     /* This function can be called for different classes.  The
>>> >      * initialization needs to be done only once */
>>> >     if (ovsthread_once_start(&once)) {
>>> >         rte_vhost_driver_callback_register(&virtio_net_device_ops);
>>> >         rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>>> >                                   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>>> >                                   | 1ULL << VIRTIO_NET_F_CSUM);
>> I saw the commit introduced such change, but it tells no reason why
>> it was added.
> 
> I'm also interested to know the reason.

I can't remember offhand. I've added Mark K and Michal W, who should be
able to shed some light on it.

> In any case, I think this is something that can/should be managed by
> the management tool, which  should disable it in cmd parameters.
> 
> Kevin, do you agree?

I think it's best to find out the reason first, because if there is no
reason to disable it in the code, then there is no need to debate!

> 
> Cheers,
> Maxime

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24 15:01                       ` [Qemu-devel] [dpdk-dev] " Kevin Traynor
@ 2016-11-24 15:24                         ` Kavanagh, Mark B
  -1 siblings, 0 replies; 44+ messages in thread
From: Kavanagh, Mark B @ 2016-11-24 15:24 UTC (permalink / raw)
  To: Kevin Traynor, Maxime Coquelin, Yuanhan Liu, Weglicki, MichalX
  Cc: Michael S. Tsirkin, dev, Stephen Hemminger, qemu-devel,
	libvir-list, vpp-dev, Marc-André Lureau

>
>On 11/24/2016 12:47 PM, Maxime Coquelin wrote:
>>
>>
>> On 11/24/2016 01:33 PM, Yuanhan Liu wrote:
>>> On Thu, Nov 24, 2016 at 09:30:49AM +0000, Kevin Traynor wrote:
>>>> > On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
>>>>> > > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>>>>>>>> > >>>> You keep assuming that you have the VM started first and
>>>>>>>> > >>>> figure out things afterwards, but this does not work.
>>>>>>>> > >>>>
>>>>>>>> > >>>> Think about a cluster of machines. You want to start a VM in
>>>>>>>> > >>>> a way that will ensure compatibility with all hosts
>>>>>>>> > >>>> in a cluster.
>>>>>>> > >>>
>>>>>>> > >>> I see. I was thinking more about the case where the dst
>>>>>>> > >>> host (including the qemu and dpdk combo) is given, and we
>>>>>>> > >>> then determine whether the migration will be successful
>>>>>>> > >>> or not.
>>>>>>> > >>>
>>>>>>> > >>> And you are asking that we need to know which host could
>>>>>>> > >>> be a good candidate before starting the migration. In such
>>>>>>> > >>> case, we indeed need some inputs from both the qemu and
>>>>>>> > >>> vhost-user backend.
>>>>>>> > >>>
>>>>>>> > >>> For DPDK, I think it could be simple, just as you said, it
>>>>>>> > >>> could be either a tiny script, or even a macro defined in
>>>>>>> > >>> the source code file (we extend it every time we add a
>>>>>>> > >>> new feature) to let the libvirt to read it. Or something
>>>>>>> > >>> else.
>>>>>> > >>
>>>>>> > >> There's the issue of APIs that tweak features as Maxime
>>>>>> > >> suggested.
>>>>> > >
>>>>> > > Yes, it's a good point.
>>>>> > >
>>>>>> > >> Maybe the only thing to do is to deprecate it,
>>>>> > >
>>>>> > > Looks like so.
>>>>> > >
>>>>>> > >> but I feel some way for application to pass info into
>>>>>> > >> guest might be beneficial.
>>>>> > >
>>>>> > > The two APIs are just for tweaking the feature bits DPDK supports
>>>>> > > before any device gets connected. It's another way to disable some
>>>>> > > features (the other obvious way is through the QEMU command line).
>>>>> > >
>>>>> > > IMO, it's a bit handy only in a case like this: we have a bunch of
>>>>> > > VMs. Instead of disabling something through qemu one by one, we
>>>>> > > could disable it once in DPDK.
>>>>> > >
>>>>> > > But I doubt its usefulness. It's only used in DPDK's vhost example
>>>>> > > after all. It is used neither in the vhost pmd nor in OVS.
>>>> >
>>>> > rte_vhost_feature_disable() is currently used in OVS, in
>>>> > lib/netdev-dpdk.c
>>> Hmmm. I must have checked very old code ...
>>>> >
>>>> > netdev_dpdk_vhost_class_init(void)
>>>> > {
>>>> >     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>>>> >
>>>> >     /* This function can be called for different classes.  The
>>>> >      * initialization needs to be done only once */
>>>> >     if (ovsthread_once_start(&once)) {
>>>> >         rte_vhost_driver_callback_register(&virtio_net_device_ops);
>>>> >         rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>>>> >                                   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>>>> >                                   | 1ULL << VIRTIO_NET_F_CSUM);
>>> I saw the commit that introduced this change, but it gives no reason
>>> why it was added.
>>
>> I'm also interested to know the reason.
>
>I can't remember off hand, added Mark K or Michal W who should be able
>to shed some light on it.

DPDK v16.04 added support for vHost User TSO; as such, by default,
TSO is advertised to guest devices as an available feature during
feature negotiation with QEMU.
However, while the vHost user backend sets up the majority of the
mbuf fields that are required for TSO, there is still a reliance
on the associated DPDK application (i.e. in this case OvS-DPDK)
to set the remaining flags and/or offsets.
Since OvS-DPDK doesn't currently provide that functionality, it is
necessary to explicitly disable TSO; otherwise, undefined behaviour
will ensue.
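
To make the effect of that call concrete, here is a minimal, self-contained
sketch of the mask arithmetic. The bit numbers are the ones assigned by the
virtio specification; feature_bit(), ovs_disabled_features() and
advertised_features() are hypothetical helpers written for illustration only,
not DPDK or OVS APIs, and the model of "advertised = supported minus disabled"
is an assumption about what the disable call achieves.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio specification. */
#define VIRTIO_NET_F_CSUM       0
#define VIRTIO_NET_F_HOST_TSO4 11
#define VIRTIO_NET_F_HOST_TSO6 12

/* Hypothetical helper: turn a bit number into a 64-bit feature mask. */
static uint64_t feature_bit(unsigned int bit)
{
    assert(bit < 64);
    return 1ULL << bit;
}

/* The mask passed to rte_vhost_feature_disable() in the quoted OVS code. */
static uint64_t ovs_disabled_features(void)
{
    return feature_bit(VIRTIO_NET_F_HOST_TSO4)
         | feature_bit(VIRTIO_NET_F_HOST_TSO6)
         | feature_bit(VIRTIO_NET_F_CSUM);
}

/* Assumed model: the features still offered to the guest are the backend's
 * supported features with the disabled bits cleared. */
static uint64_t advertised_features(uint64_t backend_features)
{
    return backend_features & ~ovs_disabled_features();
}
```

With these bit numbers the disabled mask works out to 0x1801, so a backend
that would otherwise offer 0xffff ends up offering 0xe7fe to the guest.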

>
>> In any case, I think this is something that can/should be managed by
>> the management tool, which  should disable it in cmd parameters.
>>
>> Kevin, do you agree?
>
>I think best to find out the reason first. Because if no reason to
>disable in the code, then no need to debate!
>
>>
>> Cheers,
>> Maxime

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24 15:24                         ` [Qemu-devel] [dpdk-dev] " Kavanagh, Mark B
@ 2016-11-28 15:28                           ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-11-28 15:28 UTC (permalink / raw)
  To: Kavanagh, Mark B, Kevin Traynor, Yuanhan Liu, Weglicki, MichalX
  Cc: Michael S. Tsirkin, dev, Stephen Hemminger, qemu-devel,
	libvir-list, vpp-dev, Marc-André Lureau



On 11/24/2016 04:24 PM, Kavanagh, Mark B wrote:
>>
>> On 11/24/2016 12:47 PM, Maxime Coquelin wrote:
>>>
>>>
>>> On 11/24/2016 01:33 PM, Yuanhan Liu wrote:
>>>> On Thu, Nov 24, 2016 at 09:30:49AM +0000, Kevin Traynor wrote:
>>>>>> On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
>>>>>>>> On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>>>>>>>>>>>>>> You keep assuming that you have the VM started first and
>>>>>>>>>>>>>> figure out things afterwards, but this does not work.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Think about a cluster of machines. You want to start a VM in
>>>>>>>>>>>>>> a way that will ensure compatibility with all hosts
>>>>>>>>>>>>>> in a cluster.
>>>>>>>>>>>>
>>>>>>>>>>>> I see. I was thinking more about the case where the dst
>>>>>>>>>>>> host (including the qemu and dpdk combo) is given, and we
>>>>>>>>>>>> then determine whether the migration will be successful
>>>>>>>>>>>> or not.
>>>>>>>>>>>>
>>>>>>>>>>>> And you are asking that we need to know which host could
>>>>>>>>>>>> be a good candidate before starting the migration. In such
>>>>>>>>>>>> case, we indeed need some inputs from both the qemu and
>>>>>>>>>>>> vhost-user backend.
>>>>>>>>>>>>
>>>>>>>>>>>> For DPDK, I think it could be simple, just as you said, it
>>>>>>>>>>>> could be either a tiny script, or even a macro defined in
>>>>>>>>>>>> the source code file (we extend it every time we add a
>>>>>>>>>>>> new feature) to let the libvirt to read it. Or something
>>>>>>>>>>>> else.
>>>>>>>>>>
>>>>>>>>>> There's the issue of APIs that tweak features as Maxime
>>>>>>>>>> suggested.
>>>>>>>>
>>>>>>>> Yes, it's a good point.
>>>>>>>>
>>>>>>>>>> Maybe the only thing to do is to deprecate it,
>>>>>>>>
>>>>>>>> Looks like so.
>>>>>>>>
>>>>>>>>>> but I feel some way for application to pass info into
>>>>>>>>>> guest might be beneficial.
>>>>>>>>
>>>>>>>> The two APIs are just for tweaking the feature bits DPDK supports
>>>>>>>> before any device gets connected. It's another way to disable some
>>>>>>>> features (the other obvious way is through the QEMU command line).
>>>>>>>>
>>>>>>>> IMO, it's a bit handy only in a case like this: we have a bunch of
>>>>>>>> VMs. Instead of disabling something through qemu one by one, we
>>>>>>>> could disable it once in DPDK.
>>>>>>>>
>>>>>>>> But I doubt its usefulness. It's only used in DPDK's vhost example
>>>>>>>> after all. It is used neither in the vhost pmd nor in OVS.
>>>>>>
>>>>>> rte_vhost_feature_disable() is currently used in OVS, in
>>>>>> lib/netdev-dpdk.c
>>>> Hmmm. I must have checked very old code ...
>>>>>>
>>>>>> netdev_dpdk_vhost_class_init(void)
>>>>>> {
>>>>>>     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>>>>>>
>>>>>>     /* This function can be called for different classes.  The
>>>>>>      * initialization needs to be done only once */
>>>>>>     if (ovsthread_once_start(&once)) {
>>>>>>         rte_vhost_driver_callback_register(&virtio_net_device_ops);
>>>>>>         rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>>>>>>                                   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>>>>>>                                   | 1ULL << VIRTIO_NET_F_CSUM);
>>>> I saw the commit that introduced this change, but it gives no reason
>>>> why it was added.
>>>
>>> I'm also interested to know the reason.
>>
>> I can't remember off hand, added Mark K or Michal W who should be able
>> to shed some light on it.
>
> DPDK v16.04 added support for vHost User TSO; as such, by default,
> TSO is advertised to guest devices as an available feature during
> feature negotiation with QEMU.
> However, while the vHost user backend sets up the majority of the
> mbuf fields that are required for TSO, there is still a reliance
> on the associated DPDK application (i.e. in this case OvS-DPDK)
> to set the remaining flags and/or offsets.
> Since OvS-DPDK doesn't currently provide that functionality, it is
> necessary to explicitly disable TSO; otherwise, undefined behaviour
> will ensue.

Thanks Mark for the clarification.

In this case, maybe we could add a DPDK build option to disable Vhost's
TSO support, which would be selected for OVS packages?

Does that sound reasonable?

Cheers,
Maxime

>>
>>> In any case, I think this is something that can/should be managed by
>>> the management tool, which  should disable it in cmd parameters.
>>>
>>> Kevin, do you agree?
>>
>> I think best to find out the reason first. Because if no reason to
>> disable in the code, then no need to debate!
>>
>>>
>>> Cheers,
>>> Maxime
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-28 15:28                           ` [Qemu-devel] [dpdk-dev] " Maxime Coquelin
@ 2016-11-28 22:18                             ` Thomas Monjalon
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Monjalon @ 2016-11-28 22:18 UTC (permalink / raw)
  To: Maxime Coquelin, Kavanagh, Mark B, Kevin Traynor, Yuanhan Liu
  Cc: dev, Weglicki, MichalX, Michael S. Tsirkin, Stephen Hemminger,
	qemu-devel, libvir-list, vpp-dev, Marc-André Lureau,
	olivier.matz

2016-11-28 16:28, Maxime Coquelin:
> On 11/24/2016 04:24 PM, Kavanagh, Mark B wrote:
> > DPDK v16.04 added support for vHost User TSO; as such, by default,
> > TSO is advertised to guest devices as an available feature during
> > feature negotiation with QEMU.
> > However, while the vHost user backend sets up the majority of the
> > mbuf fields that are required for TSO, there is still a reliance
> > on the associated DPDK application (i.e. in this case OvS-DPDK)
> > to set the remaining flags and/or offsets.
> > Since OvS-DPDK doesn't currently provide that functionality, it is
> > necessary to explicitly disable TSO; otherwise, undefined behaviour
> > will ensue.
> 
> Thanks Mark for the clarification.
> 
> In this case, maybe we could add a DPDK build option to disable Vhost's
> TSO support, that would be selected for OVS packages?

Why do you prefer a build-time option rather than the run-time config
with rte_vhost_feature_disable()? Because we need to lock the features?

Reminder: build-time configuration options are forbidden in DPDK for
such usage. It would prevent other applications from using the feature
in a given distribution, just because it is not implemented in OVS.

> Does that sound reasonable?

Maybe I'm missing something, but I feel it is more reasonable to implement
the missing code in OVS.
If something is missing in DPDK, do not hesitate to request or add more
helper functions.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-28 22:18                             ` [Qemu-devel] [dpdk-dev] " Thomas Monjalon
@ 2016-11-29  8:09                               ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-11-29  8:09 UTC (permalink / raw)
  To: Thomas Monjalon, Kavanagh, Mark B, Kevin Traynor, Yuanhan Liu
  Cc: dev, Weglicki, MichalX, Michael S. Tsirkin, Stephen Hemminger,
	qemu-devel, libvir-list, vpp-dev, Marc-André Lureau,
	olivier.matz



On 11/28/2016 11:18 PM, Thomas Monjalon wrote:
> 2016-11-28 16:28, Maxime Coquelin:
>> On 11/24/2016 04:24 PM, Kavanagh, Mark B wrote:
>>> DPDK v16.04 added support for vHost User TSO; as such, by default,
>>> TSO is advertised to guest devices as an available feature during
>>> feature negotiation with QEMU.
>>> However, while the vHost user backend sets up the majority of the
>>> mbuf fields that are required for TSO, there is still a reliance
>>> on the associated DPDK application (i.e. in this case OvS-DPDK)
>>> to set the remaining flags and/or offsets.
>>> Since OvS-DPDK doesn't currently provide that functionality, it is
>>> necessary to explicitly disable TSO; otherwise, undefined behaviour
>>> will ensue.
>>
>> Thanks Mark for the clarification.
>>
>> In this case, maybe we could add a DPDK build option to disable Vhost's
>> TSO support, that would be selected for OVS packages?
>
> Why do you prefer a build-time option rather than the run-time config
> with rte_vhost_feature_disable()? Because we need to lock the features?

Right, we need to know what the backend supports before it is started,
so that the management tool can check where the VM could be migrated to.
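
As a rough illustration of that check, the sketch below filters a list of
hosts down to those whose backend offers every feature the guest has
negotiated. struct host_info, can_migrate() and migration_candidates() are
hypothetical names; how a real management tool would obtain each backend's
feature mask is exactly the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record a management tool might keep for each host. */
struct host_info {
    const char *name;
    uint64_t backend_features; /* known before the backend is started */
};

/* Migration is safe only if every feature the guest negotiated is also
 * offered by the destination backend. */
static int can_migrate(uint64_t negotiated, uint64_t dst_features)
{
    return (negotiated & ~dst_features) == 0;
}

/* Store the indices of suitable destination hosts in out[] (which must
 * have room for n entries) and return how many were found. */
static size_t migration_candidates(uint64_t negotiated,
                                   const struct host_info *hosts, size_t n,
                                   size_t *out)
{
    size_t count = 0;

    assert(hosts != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (can_migrate(negotiated, hosts[i].backend_features)) {
            out[count++] = i;
        }
    }
    return count;
}
```

A destination that offers extra features is still acceptable; only a
negotiated feature that is missing on the destination rules a host out.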

>
> Reminder: build-time configuration options are forbidden in DPDK for
> such usage. It would prevent other applications from using the feature
> in a given distribution, just because it is not implemented in OVS.

I understand, this is not the right solution.
I proposed this because I misunderstood how the distributions package
OVS+DPDK.

>
>> Does that sound reasonable?
>
> Maybe I'm missing something but I feel it is more reasonable to implement
> the missing code in OVS.

Yes, that would be the ideal solution.
OVS implements TSO, and we let the management tool decide whether or not
to enable the features.

While this is done, we could deprecate rte_vhost_feature_disable() and
print an error message notifying the user that it should be done by the
management tool.

> If something is missing in DPDK, do not hesitate to request or add more
> helper functions.

Sure.

Thanks,
Maxime

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-11-24  6:31               ` [Qemu-devel] " Yuanhan Liu
@ 2016-12-09 13:35                 ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-12-09 13:35 UTC (permalink / raw)
  To: Yuanhan Liu, Michael S. Tsirkin, Daniel P. Berrange
  Cc: dev, Stephen Hemminger, qemu-devel, libvir-list, vpp-dev,
	Marc-André Lureau

++Daniel for libvirt

On 11/24/2016 07:31 AM, Yuanhan Liu wrote:
>>>>>>>> As version here is an opaque string for libvirt and qemu,
>>>>>>>> > > > > > >>anything can be used - but I suggest either a list
>>>>>>>> > > > > > >>of values defining the interface, e.g.
>>>>>>>> > > > > > >>any_layout=on,max_ring=256
>>>>>>>> > > > > > >>or a version including the name and vendor of the backend,
>>>>>>>> > > > > > >>e.g. "org.dpdk.v4.5.6".
>>> > >
>>> > > The version scheme may not be ideal here. Assume a QEMU is supposed
>>> > > to work with a specific DPDK version; however, the user may disable
>>> > > some newer features through the qemu command line, so that it could
>>> > > also work with an older DPDK version. Using the version scheme will
>>> > > not allow us to do such a migration to an older DPDK version. MTU is
>>> > > a live example here (when the MTU feature is provided by QEMU but is
>>> > > actually disabled by the user, it could also work with an older DPDK
>>> > > without MTU support).
>>> > >
>>> > > 	--yliu
>> >
>> > OK, so does a list of values look better to you then?
> Yes, if there is no better way.
>
> And I think it may be better not to list all those features literally.
> Instead, using a number should be better, say, features=0xdeadbeef.
>
> Listing the feature names means that all components involved here (QEMU,
> libvirt, DPDK, VPP, and maybe more backends) have to agree to use the
> exact same feature names. Though it may not be a big deal, it lacks
> some flexibility.
>
> Feature bits will not have this issue.

I initially thought having key/value pairs would be more flexible, and
could allow migrating to another application if compatible (i.e. from
OVS to VPP, and vice versa...) without needing synchronization between
the applications.

But Daniel pointed out to me that it would add a lot of complexity on the
management tool side, as it would need to know how to interpret these
key/value pairs. I think his argument is very valid.

So maybe the best way would be the version string, letting the
application (OVS-DPDK/VPP/...) specify which version it is
compatible with.
The downside is that as soon as a new feature is supported in the
vhost-user application, the new version will not be advertised as
compatible with the previous one, even if the user disables the feature
in QEMU (as pointed out by Yuanhan).

The question is: are we ready to add complexity on the management tool
side to permit more migration cases, or do we prefer keeping it simple
but sometimes preventing migration even when it is technically possible?
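
To make the trade-off concrete, here is a minimal Python sketch. It is not
taken from QEMU, libvirt or DPDK: the function names, the version strings
and the bit positions are all made up for illustration. It contrasts the
opaque version-string check with a feature-bits check in which only the
features actually enabled for the guest have to be supported on the
destination, using Yuanhan's "MTU disabled by the user" case.

```python
# Hypothetical illustration only: names and bit values are assumptions,
# not the real virtio feature bit assignments.
F_ANY_LAYOUT = 1 << 0
F_MTU = 1 << 1


def version_compatible(vm_version, dest_versions):
    """Opaque-string scheme: the destination must advertise the exact string."""
    return vm_version in dest_versions


def features_compatible(enabled_features, dest_features):
    """Feature-bits scheme: every enabled bit must be supported by the destination."""
    return enabled_features & ~dest_features == 0


# The source backend supports MTU, but the user disabled it for this guest.
enabled = F_ANY_LAYOUT
old_backend_features = F_ANY_LAYOUT      # older backend without MTU support
old_backend_versions = ["backend-v1"]    # it only knows the old version string

print(version_compatible("backend-v2", old_backend_versions))  # False
print(features_compatible(enabled, old_backend_features))      # True
```

The second check permits the migration that the first one refuses, at the
cost of the management tool having to understand what the bits mean.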

  -- Maxime

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-12-09 13:35                 ` [Qemu-devel] " Maxime Coquelin
@ 2016-12-09 14:42                   ` Daniel P. Berrange
  -1 siblings, 0 replies; 44+ messages in thread
From: Daniel P. Berrange @ 2016-12-09 14:42 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Yuanhan Liu, Michael S. Tsirkin, dev, qemu-devel,
	Stephen Hemminger, libvir-list, vpp-dev, Marc-André Lureau

On Fri, Dec 09, 2016 at 02:35:58PM +0100, Maxime Coquelin wrote:
> ++Daniel for libvirt
> 
> On 11/24/2016 07:31 AM, Yuanhan Liu wrote:
> > > > > > > > > As version here is an opaque string for libvirt and qemu,
> > > > > > > > > anything can be used - but I suggest either a list
> > > > > > > > > of values defining the interface, e.g.
> > > > > > > > > any_layout=on,max_ring=256
> > > > > > > > > or a version including the name and vendor of the backend,
> > > > > > > > > e.g. "org.dpdk.v4.5.6".
> > > > > >
> > > > > > The version scheme may not be ideal here. Assume a QEMU is supposed
> > > > > > to work with a specific DPDK version, however, user may disable some
> > > > > > newer features through qemu command line, that it also could work with
> > > > > > an elder DPDK version. Using the version scheme will not allow us doing
> > > > > > such migration to an elder DPDK version. The MTU is a lively example
> > > > > > here? (when MTU feature is provided by QEMU but is actually disabled
> > > > > > by user, that it could also work with an elder DPDK without MTU support).
> > > > > >
> > > > > > 	--yliu
> > > >
> > > > OK, so does a list of values look better to you then?
> > Yes, if there are no better way.
> > 
> > And I think it may be better to not list all those features, literally.
> > But instead, using the number should be better, say, features=0xdeadbeef.
> > 
> > Listing the feature names means we have to come to an agreement in all
> > components involved here (QEMU, libvirt, DPDK, VPP, and maybe more
> > backends), that we have to use the exact same feature names. Though it
> > may not be a big deal, it lacks some flexibility.
> > 
> > A feature bits will not have this issue.
> 
> I initially thought having key/value pairs would be more flexible, and
> could allow migrating to another application if compatible (i.e. from
> OVS to VPP, and vice versa...) without needing synchronization between
> the applications.
> 
> But Daniel pointed me out that it would add a lot of complexity on
> management tool side, as it would need to know how to interpret these
> key/value pairs. I think his argument is very valid.
> 
> So maybe the best way would be the version string, letting the
> application (OVS-DPDK/VPP/...) specify which version it is
> compatible with.
> For the downsides, as soon as a new feature is supported in vhost-user
> application, the new version will not be advertised as compatible with
> the previous one, even if the user disables the feature in Qemu (as
> pointed out by Yuanhan).

We need two distinct capabilities in order to make this work properly.

First, libvirt needs to be able to query the list of (one or more)
supported version strings for a given host.

Second, when launching QEMU we need to be able to specify the desired
version against the NIC backend.

So, consider host A, initially supporting 'ovsdpdk-v1'. When libvirt
launches the VM it will specify 'ovsdpdk-v1' as the desired version
string to use.

Now some time later you add features X, Y & Z to a new release of
DPDK and install this on host B.  Host B is able to support two
versions 'ovsdpdk-v1' and 'ovsdpdk-v2'.  When libvirt launches
a VM on host B, it'll pick 'ovsdpdk-v2' by default, since that's
the newest.  When libvirt migrates a VM from host A, however,
it will request the old version 'ovsdpdk-v1' in order to ensure
compatibility.  Similarly when launching a new VM on host B,
libvirt could choose to use 'ovsdpdk-v1' as the version, in
order to enable migration to the older host A, if desired.

This is exactly the way QEMU machine types work, hiding the
existence of hundreds of low-level settings / default values that
a mgmt app would otherwise have to worry about.
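
The host A / host B scenario above can be sketched roughly as follows.
This is only an illustration in Python under stated assumptions: the
SUPPORTED table and both helper names are hypothetical, and no such
query interface exists in libvirt or OVS today.

```python
# Hypothetical sketch: the inventory and the helper names are assumptions.
# Each list is ordered from oldest to newest supported version string.
SUPPORTED = {
    "hostA": ["ovsdpdk-v1"],
    "hostB": ["ovsdpdk-v1", "ovsdpdk-v2"],
}


def pick_version_for_new_vm(host, migratable_to=None):
    """Pick the newest version on `host`; if `migratable_to` is given,
    pick the newest version that both hosts support."""
    candidates = SUPPORTED[host]
    if migratable_to is not None:
        candidates = [v for v in candidates if v in SUPPORTED[migratable_to]]
    if not candidates:
        raise ValueError("no common backend version")
    return candidates[-1]


def version_for_incoming_migration(vm_version, dest_host):
    """A migrated VM keeps the version string recorded when it was created."""
    if vm_version not in SUPPORTED[dest_host]:
        raise ValueError(f"{dest_host} does not support {vm_version}")
    return vm_version


print(pick_version_for_new_vm("hostB"))                         # ovsdpdk-v2
print(pick_version_for_new_vm("hostB", migratable_to="hostA"))  # ovsdpdk-v1
print(version_for_incoming_migration("ovsdpdk-v1", "hostB"))    # ovsdpdk-v1
```

The management layer only compares strings here; it never needs to know
which low-level settings a given version stands for.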

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-12-09 14:42                   ` [Qemu-devel] " Daniel P. Berrange
@ 2016-12-09 16:45                     ` Maxime Coquelin
  -1 siblings, 0 replies; 44+ messages in thread
From: Maxime Coquelin @ 2016-12-09 16:45 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Yuanhan Liu, Michael S. Tsirkin, dev, Stephen Hemminger,
	qemu-devel, libvir-list, vpp-dev, Marc-André Lureau,
	Flavio Leitner, Aaron Conole



On 12/09/2016 03:42 PM, Daniel P. Berrange wrote:
> On Fri, Dec 09, 2016 at 02:35:58PM +0100, Maxime Coquelin wrote:
>> ++Daniel for libvirt
>>
>> On 11/24/2016 07:31 AM, Yuanhan Liu wrote:
>>>>>>>>>> As version here is an opaque string for libvirt and qemu,
>>>>>>>>>> anything can be used - but I suggest either a list
>>>>>>>>>> of values defining the interface, e.g.
>>>>>>>>>> any_layout=on,max_ring=256
>>>>>>>>>> or a version including the name and vendor of the backend,
>>>>>>>>>> e.g. "org.dpdk.v4.5.6".
>>>>>>>
>>>>>>> The version scheme may not be ideal here. Assume a QEMU is supposed
>>>>>>> to work with a specific DPDK version, however, user may disable some
>>>>>>> newer features through qemu command line, that it also could work with
>>>>>>> an elder DPDK version. Using the version scheme will not allow us doing
>>>>>>> such migration to an elder DPDK version. The MTU is a lively example
>>>>>>> here? (when MTU feature is provided by QEMU but is actually disabled
>>>>>>> by user, that it could also work with an elder DPDK without MTU support).
>>>>>>>
>>>>>>> 	--yliu
>>>>>
>>>>> OK, so does a list of values look better to you then?
>>> Yes, if there are no better way.
>>>
>>> And I think it may be better to not list all those features, literally.
>>> But instead, using the number should be better, say, features=0xdeadbeef.
>>>
>>> Listing the feature names means we have to come to an agreement in all
>>> components involved here (QEMU, libvirt, DPDK, VPP, and maybe more
>>> backends), that we have to use the exact same feature names. Though it
>>> may not be a big deal, it lacks some flexibility.
>>>
>>> A feature bits will not have this issue.
>>
>> I initially thought having key/value pairs would be more flexible, and
>> could allow migrating to another application if compatible (i.e. from
>> OVS to VPP, and vice versa...) without needing synchronization between
>> the applications.
>>
>> But Daniel pointed me out that it would add a lot of complexity on
>> management tool side, as it would need to know how to interpret these
>> key/value pairs. I think his argument is very valid.
>>
>> So maybe the best way would be the version string, letting the
>> application (OVS-DPDK/VPP/...) specify which version it is
>> compatible with.
>> For the downsides, as soon as a new feature is supported in vhost-user
>> application, the new version will not be advertised as compatible with
>> the previous one, even if the user disables the feature in Qemu (as
>> pointed out by Yuanhan).
>
> We need two distinct capabilities in order to make this work properly.
>
> First, libvirt needs to be able to query the list of (one or more)
> supported version strings for a given host.

Shouldn't that be the role of OpenStack/Neutron? IIUC, libvirt knows nothing
about OVS.

> Second, when launching QEMU we need to be able to specify the desired
> version against the NIC backend.
>
> So, consider host A, initially supporting 'ovsdpdk-v1'. When libvirt
> launches the VM it will specify 'ovsdpdk-v1' as the desired version
> string to use.
>
> Now some time later you add features X, Y & Z to a new release of
> DPDK and install this on host B.  Host B is able to support two
> versions 'ovsdpdk-v1' and 'ovsdpdk-v2'.  When libvirt launches
> a VM on host B, it'll pick 'ovsdpdk-v2' by default, since that's
> the newest.  When libvirt migrates a VM from host A, however,
> it will request the old version 'ovsdpdk-v1' in order to ensure
> compatibility.  Similarly when launching a new VM on host B,
> libvirt could choose to use 'ovsdpdk-v1' as the version, in
> order to enable migration to the older host A, if desired.
>
> This is exactly the way QEMU machine types work, hiding the
> existence of hundreds of low-level settings / default values that
> a mgmt app would otherwise have to worry about.

I agree in principle. I need to check what is missing for OVS to
support different versions on different vhost-user ports.

Thanks,
Maxime
>
> Regards,
> Daniel
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: dpdk/vpp and cross-version migration for vhost
  2016-12-09 16:45                     ` [Qemu-devel] " Maxime Coquelin
@ 2016-12-09 16:48                       ` Daniel P. Berrange
  -1 siblings, 0 replies; 44+ messages in thread
From: Daniel P. Berrange @ 2016-12-09 16:48 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: Yuanhan Liu, Michael S. Tsirkin, dev, Stephen Hemminger,
	qemu-devel, libvir-list, vpp-dev, Marc-André Lureau,
	Flavio Leitner, Aaron Conole

On Fri, Dec 09, 2016 at 05:45:13PM +0100, Maxime Coquelin wrote:
> 
> 
> On 12/09/2016 03:42 PM, Daniel P. Berrange wrote:
> > On Fri, Dec 09, 2016 at 02:35:58PM +0100, Maxime Coquelin wrote:
> > > ++Daniel for libvirt
> > > 
> > > On 11/24/2016 07:31 AM, Yuanhan Liu wrote:
> > > > > > > > > > > As version here is an opaque string for libvirt and qemu,
> > > > > > > > > > > anything can be used - but I suggest either a list
> > > > > > > > > > > of values defining the interface, e.g.
> > > > > > > > > > > any_layout=on,max_ring=256
> > > > > > > > > > > or a version including the name and vendor of the backend,
> > > > > > > > > > > e.g. "org.dpdk.v4.5.6".
> > > > > > > > 
> > > > > > > > The version scheme may not be ideal here. Assume a QEMU is supposed
> > > > > > > > to work with a specific DPDK version, however, user may disable some
> > > > > > > > newer features through qemu command line, that it also could work with
> > > > > > > > an elder DPDK version. Using the version scheme will not allow us doing
> > > > > > > > such migration to an elder DPDK version. The MTU is a lively example
> > > > > > > > here? (when MTU feature is provided by QEMU but is actually disabled
> > > > > > > > by user, that it could also work with an elder DPDK without MTU support).
> > > > > > > > 
> > > > > > > > 	--yliu
> > > > > > 
> > > > > > OK, so does a list of values look better to you then?
> > > > Yes, if there are no better way.
> > > > 
> > > > And I think it may be better to not list all those features, literally.
> > > > But instead, using the number should be better, say, features=0xdeadbeef.
> > > > 
> > > > Listing the feature names means we have to come to an agreement in all
> > > > components involved here (QEMU, libvirt, DPDK, VPP, and maybe more
> > > > backends), that we have to use the exact same feature names. Though it
> > > > may not be a big deal, it lacks some flexibility.
> > > > 
> > > > A feature bits will not have this issue.
> > > 
> > > I initially thought having key/value pairs would be more flexible, and
> > > could allow migrating to another application if compatible (i.e. from
> > > OVS to VPP, and vice versa...) without needing synchronization between
> > > the applications.
> > > 
> > > But Daniel pointed me out that it would add a lot of complexity on
> > > management tool side, as it would need to know how to interpret these
> > > key/value pairs. I think his argument is very valid.
> > > 
> > > So maybe the best way would be the version string, letting the
> > > application (OVS-DPDK/VPP/...) specify which version it is
> > > compatible with.
> > > For the downsides, as soon as a new feature is supported in vhost-user
> > > application, the new version will not be advertised as compatible with
> > > the previous one, even if the user disables the feature in Qemu (as
> > > pointed out by Yuanhan).
> > 
> > We need two distinct capabilities in order to make this work properly.
> > 
> > First, libvirt needs to be able to query the list of (one or more)
> > supported version strings for a given host.
> 
> Shouldn't that be the role of OpenStack/Neutron? IIUC, libvirt knows nothing
> about OVS.

If libvirt doesn't know about it, then libvirt can't do any migration
checks upfront. Nova will have to do a check against supported version
strings before triggering migrate in libvirt.  That's probably fine
from libvirt's POV.
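
As a rough sketch of what such an upfront check in the management layer
could look like (Python, purely illustrative: the function name and the
shape of the host inventory are assumptions, not an existing Nova or
Neutron API):

```python
# Hypothetical sketch: `hosts` maps a host name to the version strings its
# vhost-user backend advertises; how that map is obtained is left open.
def eligible_destinations(vm_version, hosts):
    """Return the hosts whose backend advertises the VM's recorded version."""
    return sorted(name for name, versions in hosts.items()
                  if vm_version in versions)


hosts = {
    "hostA": ["ovsdpdk-v1"],
    "hostB": ["ovsdpdk-v1", "ovsdpdk-v2"],
}

print(eligible_destinations("ovsdpdk-v1", hosts))  # ['hostA', 'hostB']
print(eligible_destinations("ovsdpdk-v2", hosts))  # ['hostB']
```

Only hosts in the returned list would then be handed to libvirt as
migration targets.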


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2016-12-09 16:49 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-13 17:50 dpdk/vpp and cross-version migration for vhost Michael S. Tsirkin
2016-10-13 17:50 ` [Qemu-devel] " Michael S. Tsirkin
2016-11-16 20:43 ` Maxime Coquelin
2016-11-16 20:43   ` [Qemu-devel] " Maxime Coquelin
2016-11-17  8:29 ` Yuanhan Liu
2016-11-17  8:29   ` [Qemu-devel] " Yuanhan Liu
2016-11-17  8:47   ` Maxime Coquelin
2016-11-17  8:47     ` [Qemu-devel] " Maxime Coquelin
2016-11-17  9:49     ` Yuanhan Liu
2016-11-17  9:49       ` [Qemu-devel] " Yuanhan Liu
2016-11-17 15:25       ` [vpp-dev] " Thomas F Herbert
2016-11-17 15:25         ` [Qemu-devel] " Thomas F Herbert
2016-11-17 17:37       ` Michael S. Tsirkin
2016-11-17 17:37         ` [Qemu-devel] " Michael S. Tsirkin
2016-11-22 13:02         ` Yuanhan Liu
2016-11-22 13:02           ` [Qemu-devel] " Yuanhan Liu
2016-11-22 14:53           ` Michael S. Tsirkin
2016-11-22 14:53             ` [Qemu-devel] " Michael S. Tsirkin
2016-11-24  6:31             ` Yuanhan Liu
2016-11-24  6:31               ` [Qemu-devel] " Yuanhan Liu
2016-11-24  9:30               ` Kevin Traynor
2016-11-24  9:30                 ` [Qemu-devel] [dpdk-dev] " Kevin Traynor
2016-11-24 12:33                 ` Yuanhan Liu
2016-11-24 12:33                   ` [Qemu-devel] [dpdk-dev] " Yuanhan Liu
2016-11-24 12:47                   ` Maxime Coquelin
2016-11-24 12:47                     ` [Qemu-devel] [dpdk-dev] " Maxime Coquelin
2016-11-24 15:01                     ` Kevin Traynor
2016-11-24 15:01                       ` [Qemu-devel] [dpdk-dev] " Kevin Traynor
2016-11-24 15:24                       ` Kavanagh, Mark B
2016-11-24 15:24                         ` [Qemu-devel] [dpdk-dev] " Kavanagh, Mark B
2016-11-28 15:28                         ` Maxime Coquelin
2016-11-28 15:28                           ` [Qemu-devel] [dpdk-dev] " Maxime Coquelin
2016-11-28 22:18                           ` Thomas Monjalon
2016-11-28 22:18                             ` [Qemu-devel] [dpdk-dev] " Thomas Monjalon
2016-11-29  8:09                             ` Maxime Coquelin
2016-11-29  8:09                               ` [Qemu-devel] [dpdk-dev] " Maxime Coquelin
2016-12-09 13:35               ` Maxime Coquelin
2016-12-09 13:35                 ` [Qemu-devel] " Maxime Coquelin
2016-12-09 14:42                 ` Daniel P. Berrange
2016-12-09 14:42                   ` [Qemu-devel] " Daniel P. Berrange
2016-12-09 16:45                   ` Maxime Coquelin
2016-12-09 16:45                     ` [Qemu-devel] " Maxime Coquelin
2016-12-09 16:48                     ` Daniel P. Berrange
2016-12-09 16:48                       ` [Qemu-devel] " Daniel P. Berrange
