kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: devel@ovirt.org, openstack-discuss@lists.openstack.org,
	libvir-list@redhat.com, intel-gvt-dev@lists.freedesktop.org,
	kvm@vger.kernel.org, qemu-devel@nongnu.org, smooney@redhat.com,
	eskultet@redhat.com, alex.williamson@redhat.com,
	cohuck@redhat.com, dinechin@redhat.com, corbet@lwn.net,
	kwankhede@nvidia.com, dgilbert@redhat.com, eauger@redhat.com,
	jian-feng.ding@intel.com, hejie.xu@intel.com,
	kevin.tian@intel.com, zhenyuw@linux.intel.com,
	bao.yumeng@zte.com.cn, xin-ran.wang@intel.com,
	shaohe.feng@intel.com
Subject: Re: device compatibility interface for live migration with assigned devices
Date: Tue, 14 Jul 2020 11:21:29 +0100	[thread overview]
Message-ID: <20200714102129.GD25187@redhat.com> (raw)
In-Reply-To: <20200713232957.GD5955@joy-OptiPlex-7040>

On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> hi folks,
> we are defining a device migration compatibility interface that helps upper
> layer stack like openstack/ovirt/libvirt to check if two devices are
> live migration compatible.
> The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> e.g. we could use it to check whether
> - a src MDEV can migrate to a target MDEV,
> - a src VF in SRIOV can migrate to a target VF in SRIOV,
> - a src MDEV can migration to a target VF in SRIOV.
>   (e.g. SIOV/SRIOV backward compatibility case)
> 
> The upper layer stack could use this interface as the last step to check
> if one device is able to migrate to another device before triggering a real
> live migration procedure.
> we are not sure if this interface is of value or help to you. please don't
> hesitate to drop your valuable comments.
> 
> 
> (1) interface definition
> The interface is defined in below way:
> 
>              __    userspace
>               /\              \
>              /                 \write
>             / read              \
>    ________/__________       ___\|/_____________
>   | migration_version |     | migration_version |-->check migration
>   ---------------------     ---------------------   compatibility
>      device A                    device B
> 
> 
> a device attribute named migration_version is defined under each device's
> sysfs node. e.g. (/sys/bus/pci/devices/0000\:00\:02.0/$mdev_UUID/migration_version).
> userspace tools read the migration_version as a string from the source device,
> and write it to the migration_version sysfs attribute in the target device.
> 
> The userspace should treat ANY of below conditions as two devices not compatible:
> - any one of the two devices does not have a migration_version attribute
> - error when reading from migration_version attribute of one device
> - error when writing migration_version string of one device to
>   migration_version attribute of the other device
> 
> The string read from migration_version attribute is defined by device vendor
> driver and is completely opaque to the userspace.
> for a Intel vGPU, string format can be defined like
> "parent device PCI ID" + "version of gvt driver" + "mdev type" + "aggregator count".
> 
> for an NVMe VF connecting to a remote storage. it could be
> "PCI ID" + "driver version" + "configured remote storage URL"
> 
> for a QAT VF, it may be
> "PCI ID" + "driver version" + "supported encryption set".
> 
> (to avoid namespace confliction from each vendor, we may prefix a driver name to
> each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)
> 
> 
> (2) backgrounds
> 
> The reason we hope the migration_version string is opaque to the userspace
> is that it is hard to generalize standard comparing fields and comparing
> methods for different devices from different vendors.
> Though userspace now could still do a simple string compare to check if
> two devices are compatible, and result should also be right, it's still
> too limited as it excludes the possible candidate whose migration_version
> string fails to be equal.
> e.g. an MDEV with mdev_type_1, aggregator count 3 is probably compatible
> with another MDEV with mdev_type_3, aggregator count 1, even their
> migration_version strings are not equal.
> (assumed mdev_type_3 is of 3 times equal resources of mdev_type_1).
> 
> besides that, driver version + configured resources are all elements demanding
> to take into account.
> 
> So, we hope leaving the freedom to vendor driver and let it make the final decision
> in a simple reading from source side and writing for test in the target side way.
> 
> 
> we then think the device compatibility issues for live migration with assigned
> devices can be divided into two steps:
> a. management tools filter out possible migration target devices.
>    Tags could be created according to info from product specification.
>    we think openstack/ovirt may have vendor proprietary components to create
>    those customized tags for each product from each vendor.

>    for Intel vGPU, with a vGPU(a MDEV device) in source side, the tags to
>    search target vGPU are like:
>    a tag for compatible parent PCI IDs,
>    a tag for a range of gvt driver versions,
>    a tag for a range of mdev type + aggregator count
> 
>    for NVMe VF, the tags to search target VF may be like:
>    a tag for compatible PCI IDs,
>    a tag for a range of driver versions,
>    a tag for URL of configured remote storage.

Requiring management application developers to figure out this possible
compatibility based on prod specs is really unrealistic. Product specs
are typically as clear as mud, and with the suggestion we consider
different rules for different types of devices, add up to a huge amount
of complexity. This isn't something app developers should have to spend
their time figuring out.

The suggestion that we make use of vendor proprietary helper components
is totally unacceptable. We need to be able to build a solution that
works with exclusively an open source software stack.

IMHO there needs to be a mechanism for the kernel to report via sysfs
what versions are supported on a given device. This puts the job of
reporting compatible versions directly under the responsibility of the
vendor who writes the kernel driver for it. They are the ones with the
best knowledge of the hardware they've built and the rules around its
compatibility.

> b. with the output from step a, openstack/ovirt/libvirt could use our proposed
>    device migration compatibility interface to make sure the two devices are
>    indeed live migration compatible before launching the real live migration
>    process to start stream copying, src device stopping and target device
>    resuming.
>    It is supposed that this step would not bring any performance penalty as
>    -in kernel it's just a simple string decoding and comparing
>    -in openstack/ovirt, it could be done by extending current function
>     check_can_live_migrate_destination, along side claiming target resources.[1]




> 
> 
> [1] https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/libvirt-neutron-sriov-livemigration.html
> 
> Thanks
> Yan
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


  reply	other threads:[~2020-07-14 10:22 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-13 23:29 device compatibility interface for live migration with assigned devices Yan Zhao
2020-07-14 10:21 ` Daniel P. Berrangé [this message]
2020-07-14 12:33   ` Sean Mooney
     [not found]     ` <20200714110148.0471c03c@x1.home>
     [not found]       ` <eb705c72cdc8b6b8959b6ebaeeac6069a718d524.camel@redhat.com>
2020-07-14 21:15         ` Sean Mooney
2020-07-14 16:16   ` Alex Williamson
2020-07-14 16:47     ` Daniel P. Berrangé
2020-07-14 20:47       ` Alex Williamson
2020-07-15  9:16         ` Daniel P. Berrangé
2020-07-14 17:19     ` Dr. David Alan Gilbert
2020-07-14 20:59       ` Alex Williamson
2020-07-15  8:20         ` Yan Zhao
2020-07-15  8:49           ` Feng, Shaohe
2020-07-17 14:59           ` Alex Williamson
2020-07-17 18:03             ` Dr. David Alan Gilbert
2020-07-17 18:30               ` Alex Williamson
2020-07-15  8:23         ` Dr. David Alan Gilbert
     [not found]         ` <CAH7mGatPWsczh_rbVhx4a+psJXvkZgKou3r5HrEQTqE7SqZkKA@mail.gmail.com>
2020-07-17 15:18           ` Alex Williamson
2020-07-16  4:16 ` Jason Wang
2020-07-16  8:32   ` Yan Zhao
2020-07-16  9:30     ` Jason Wang
2020-07-17 16:12     ` Alex Williamson
2020-07-20  3:41       ` Jason Wang
2020-07-20 10:39         ` Sean Mooney
2020-07-21  2:11           ` Jason Wang
2020-07-21  0:51       ` Yan Zhao
2020-07-27  7:24         ` Yan Zhao
2020-07-27 22:23           ` Alex Williamson
2020-07-29  8:05             ` Yan Zhao
2020-07-29 11:28               ` Sean Mooney
2020-07-29 19:12                 ` Alex Williamson
2020-07-30  3:41                   ` Yan Zhao
2020-07-30 13:24                     ` Sean Mooney
2020-07-30 17:29                     ` Alex Williamson
2020-08-04  8:37                       ` Yan Zhao
2020-08-05  9:44                         ` Dr. David Alan Gilbert
2020-07-30  1:56                 ` Yan Zhao
2020-07-30 13:14                   ` Sean Mooney
2020-08-04 16:35               ` Cornelia Huck
2020-08-05  2:22                 ` Jason Wang
2020-08-05  2:16                   ` Yan Zhao
2020-08-05  2:41                     ` Jason Wang
2020-08-05  7:56                       ` Jiri Pirko
2020-08-05  8:02                         ` Jason Wang
2020-08-05  9:33                           ` Yan Zhao
2020-08-05 10:53                             ` Jiri Pirko
2020-08-05 11:35                               ` Sean Mooney
2020-08-07 11:59                                 ` Cornelia Huck
2020-08-13 15:33                                   ` Cornelia Huck
2020-08-13 19:02                                     ` Eric Farman
2020-08-17  6:38                                       ` Cornelia Huck
2020-08-10  7:46                               ` Yan Zhao
2020-08-13  4:24                                 ` Jason Wang
2020-08-14  5:16                                   ` Yan Zhao
2020-08-14 12:30                                     ` Sean Mooney
2020-08-17  1:52                                       ` Yan Zhao
2020-08-18  3:24                                     ` Jason Wang
2020-08-18  8:55                                       ` Daniel P. Berrangé
2020-08-18  9:06                                         ` Cornelia Huck
2020-08-18  9:24                                           ` Daniel P. Berrangé
2020-08-18  9:38                                             ` Cornelia Huck
     [not found]                                         ` <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com>
2020-08-18  9:16                                           ` Daniel P. Berrangé
2020-08-18  9:36                                             ` Cornelia Huck
2020-08-18  9:39                                               ` Parav Pandit
2020-08-19  3:30                                                 ` Yan Zhao
2020-08-19  5:58                                                   ` Parav Pandit
2020-08-19  9:41                                                     ` Jason Wang
2020-08-19  6:57                                                   ` [ovirt-devel] " Jason Wang
2020-08-19  6:59                                                     ` Yan Zhao
2020-08-19  7:39                                                       ` Jason Wang
2020-08-19  8:13                                                         ` Yan Zhao
2020-08-19  9:28                                                           ` Jason Wang
2020-08-20 12:27                                                             ` Cornelia Huck
2020-08-21  3:14                                                               ` Jason Wang
2020-08-21 14:52                                                                 ` Cornelia Huck
2020-08-31  3:07                                                                   ` Jason Wang
2020-08-19 17:50                                                   ` Alex Williamson
2020-08-20  0:18                                                     ` Yan Zhao
2020-08-20  3:13                                                       ` Alex Williamson
2020-08-20  3:09                                                         ` Yan Zhao
2020-08-19  2:54                                               ` Jason Wang
2020-08-20  0:39                                               ` Yan Zhao
2020-08-20  1:29                                                 ` Sean Mooney
2020-08-20  4:01                                                   ` Yan Zhao
2020-08-20  5:16                                                     ` Sean Mooney
2020-08-20  6:27                                                       ` Yan Zhao
2020-08-20 13:24                                                         ` Sean Mooney
2020-08-26  8:54                                                           ` Yan Zhao
2020-08-20  3:22                                                 ` Alex Williamson
2020-08-20  3:16                                                   ` Yan Zhao
2020-08-25 14:39                                                     ` Cornelia Huck
2020-08-26  6:41                                                       ` Yan Zhao
2020-08-28 13:47                                                         ` Cornelia Huck
2020-08-28 14:04                                                           ` Sean Mooney
2020-08-31  4:43                                                             ` Yan Zhao
2020-09-08 14:41                                                               ` Cornelia Huck
2020-09-09  2:13                                                                 ` Yan Zhao
2020-09-10 12:38                                                                   ` Cornelia Huck
2020-09-10 12:50                                                                     ` Sean Mooney
2020-09-10 18:02                                                                       ` Alex Williamson
2020-09-11  0:56                                                                         ` Yan Zhao
2020-09-11 10:08                                                                           ` Cornelia Huck
2020-09-11 10:18                                                                             ` Tian, Kevin
2020-09-11 16:51                                                                           ` Alex Williamson
2020-09-14 13:48                                                                             ` Zeng, Xin
2020-09-14 14:44                                                                               ` Alex Williamson
2020-09-09  5:37                                                               ` Yan Zhao
2020-08-31  2:23                                                           ` Yan Zhao
2020-08-19  2:38                                             ` Jason Wang
2020-08-18  9:32                                           ` Parav Pandit
2020-08-19  2:45                                             ` Jason Wang
2020-08-19  5:26                                               ` Parav Pandit
2020-08-19  6:48                                                 ` Jason Wang
2020-08-19  6:53                                                   ` Parav Pandit
2020-07-29 19:05             ` Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200714102129.GD25187@redhat.com \
    --to=berrange@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=bao.yumeng@zte.com.cn \
    --cc=cohuck@redhat.com \
    --cc=corbet@lwn.net \
    --cc=devel@ovirt.org \
    --cc=dgilbert@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=eauger@redhat.com \
    --cc=eskultet@redhat.com \
    --cc=hejie.xu@intel.com \
    --cc=intel-gvt-dev@lists.freedesktop.org \
    --cc=jian-feng.ding@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=libvir-list@redhat.com \
    --cc=openstack-discuss@lists.openstack.org \
    --cc=qemu-devel@nongnu.org \
    --cc=shaohe.feng@intel.com \
    --cc=smooney@redhat.com \
    --cc=xin-ran.wang@intel.com \
    --cc=yan.y.zhao@intel.com \
    --cc=zhenyuw@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).