From: Alex Williamson <alex.williamson@redhat.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"libvir-list@redhat.com" <libvir-list@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Kirti Wankhede" <kwankhede@nvidia.com>,
"eauger@redhat.com" <eauger@redhat.com>,
"xin-ran.wang@intel.com" <xin-ran.wang@intel.com>,
"corbet@lwn.net" <corbet@lwn.net>,
"openstack-discuss@lists.openstack.org"
<openstack-discuss@lists.openstack.org>,
"shaohe.feng@intel.com" <shaohe.feng@intel.com>,
"kevin.tian@intel.com" <kevin.tian@intel.com>,
"Parav Pandit" <parav@mellanox.com>,
"jian-feng.ding@intel.com" <jian-feng.ding@intel.com>,
"dgilbert@redhat.com" <dgilbert@redhat.com>,
"zhenyuw@linux.intel.com" <zhenyuw@linux.intel.com>,
"hejie.xu@intel.com" <hejie.xu@intel.com>,
"bao.yumeng@zte.com.cn" <bao.yumeng@zte.com.cn>,
"Jiri Pirko" <jiri@mellanox.com>,
"eskultet@redhat.com" <eskultet@redhat.com>,
"Parav Pandit" <parav@nvidia.com>,
"smooney@redhat.com" <smooney@redhat.com>,
"intel-gvt-dev@lists.freedesktop.org"
<intel-gvt-dev@lists.freedesktop.org>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Cornelia Huck" <cohuck@redhat.com>,
"dinechin@redhat.com" <dinechin@redhat.com>,
"devel@ovirt.org" <devel@ovirt.org>
Subject: Re: device compatibility interface for live migration with assigned devices
Date: Wed, 19 Aug 2020 11:50:21 -0600
Message-ID: <20200819115021.004427a3@x1.home>
In-Reply-To: <20200819033035.GA21172@joy-OptiPlex-7040>
On Wed, 19 Aug 2020 11:30:35 +0800
Yan Zhao <yan.y.zhao@intel.com> wrote:
> On Tue, Aug 18, 2020 at 09:39:24AM +0000, Parav Pandit wrote:
> > Hi Cornelia,
> >
> > > From: Cornelia Huck <cohuck@redhat.com>
> > > Sent: Tuesday, August 18, 2020 3:07 PM
> > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > Cc: Jason Wang <jasowang@redhat.com>; Yan Zhao
> > > <yan.y.zhao@intel.com>; kvm@vger.kernel.org; libvir-list@redhat.com;
> > > qemu-devel@nongnu.org; Kirti Wankhede <kwankhede@nvidia.com>;
> > > eauger@redhat.com; xin-ran.wang@intel.com; corbet@lwn.net; openstack-
> > > discuss@lists.openstack.org; shaohe.feng@intel.com; kevin.tian@intel.com;
> > > Parav Pandit <parav@mellanox.com>; jian-feng.ding@intel.com;
> > > dgilbert@redhat.com; zhenyuw@linux.intel.com; hejie.xu@intel.com;
> > > bao.yumeng@zte.com.cn; Alex Williamson <alex.williamson@redhat.com>;
> > > eskultet@redhat.com; smooney@redhat.com; intel-gvt-
> > > dev@lists.freedesktop.org; Jiri Pirko <jiri@mellanox.com>;
> > > dinechin@redhat.com; devel@ovirt.org
> > > Subject: Re: device compatibility interface for live migration with assigned
> > > devices
> > >
> > > On Tue, 18 Aug 2020 10:16:28 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote:
> > > > > > On 2020/8/18 4:55 PM, Daniel P. Berrangé wrote:
> > > > >
> > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote:
> > > > >
> > > > > > On 2020/8/14 1:16 PM, Yan Zhao wrote:
> > > > >
> > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > > > >
> > > > > > On 2020/8/10 3:46 PM, Yan Zhao wrote:
> > > >
> > > > > we actually can also retrieve the same information through sysfs,
> > > > > > e.g.
> > > > >
> > > > > |- [path to device]
> > > > > |--- migration
> > > > > | |--- self
> > > > > | | |---device_api
> > > > > | | |---mdev_type
> > > > > | | |---software_version
> > > > > | | |---device_id
> > > > > | | |---aggregator
> > > > > | |--- compatible
> > > > > | | |---device_api
> > > > > | | |---mdev_type
> > > > > | | |---software_version
> > > > > | | |---device_id
> > > > > | | |---aggregator
> > > > >
> > > > >
> > > > > Yes but:
> > > > >
> > > > > - You need one file per attribute (one syscall for one attribute)
> > > > > - Attribute is coupled with kobject
> > >
> > > Is that really that bad? You have the device with an embedded kobject
> > > anyway, and you can just put things into an attribute group?
> > >
> > > [Also, I think that self/compatible split in the example makes things
> > > needlessly complex. Shouldn't semantic versioning and matching already
> > > cover nearly everything? I would expect very few cases that are more
> > > complex than that. Maybe the aggregation stuff, but I don't think we need
> > > that self/compatible split for that, either.]
> > >
> > > > >
> > > > > All of above seems unnecessary.
> > > > >
> > > > > Another point, as we discussed in another thread, it's really hard
> > > > > > to make sure the above API works for all types of devices and
> > > > > frameworks. So having a vendor specific API looks much better.
> > > > >
> > > > > From the POV of userspace mgmt apps doing device compat checking /
> > > > > migration, we certainly do NOT want to use different vendor
> > > > > specific APIs. We want to have an API that can be used / controlled in a
> > > standard manner across vendors.
> > > > >
> > > > > > Yes, but it could be hard. E.g. vDPA will choose to use devlink (there's a
> > > > > > long debate on sysfs vs devlink). So if we go with sysfs, at least two
> > > > > > APIs need to be supported ...
> > > >
> > > > NB, I was not questioning devlink vs sysfs directly. If devlink is
> > > > related to netlink, I can't say I'm enthusiastic as IMKE sysfs is
> > > > easier to deal with. I don't know enough about devlink to have much of an
> > > opinion though.
> > > > The key point was that I don't want the userspace APIs we need to deal
> > > > with to be vendor specific.
> > >
> > > From what I've seen of devlink, it seems quite nice; but I understand why
> > > sysfs might be easier to deal with (especially as there's likely already a lot of
> > > code using it.)
> > >
> > > I understand that some users would like devlink because it is already widely
> > > used for network drivers (and some others), but I don't think the majority of
> > > devices used with vfio are network (although certainly a lot of them are.)
> > >
> > > >
> > > > What I care about is that we have a *standard* userspace API for
> > > > performing device compatibility checking / state migration, for use by
> > > > QEMU/libvirt/ OpenStack, such that we can write code without countless
> > > > vendor specific code paths.
> > > >
> > > > If there is vendor specific stuff on the side, that's fine as we can
> > > > ignore that, but the core functionality for device compat / migration
> > > > needs to be standardized.
> > >
> > > To summarize:
> > > - choose one of sysfs or devlink
> > > - have a common interface, with a standardized way to add
> > > vendor-specific attributes
> > > ?
> >
> > Please refer to my previous email which has more examples and details.
> hi Parav,
> the example is based on a new vdpa tool running over netlink, not based
> on devlink, right?
> For vfio migration compatibility, we have to deal with both mdev and physical
> pci devices. I don't think it's a good idea to write a new tool for it, given
> we are able to retrieve the same info from sysfs and there's already an
> mdevctl from Alex (https://github.com/mdevctl/mdevctl).
>
> hi All,
> could we decide that sysfs is the interface that every VFIO vendor driver
> needs to provide in order to support vfio live migration, otherwise the
> userspace management tool would not list the device in the compatible
> list?
>
> if that's true, let's move on to standardizing the sysfs interface.
> (1) content
> common part: (must)
> - software_version: (in major.minor.bugfix scheme)
> - device_api: vfio-pci or vfio-ccw ...
> - type: mdev type for mdev device or
> a signature for physical device which is a counterpart for
> mdev type.
>
> device api specific part: (must)
> - pci id: pci id of mdev parent device or pci id of physical pci
> device (device_api is vfio-pci)
As noted previously, the parent PCI ID should not matter for an mdev
device; if a vendor has a dependency on matching the parent device PCI
ID, that's a vendor-specific restriction. An mdev device can also
expose a vfio-pci device API without the parent device being PCI. For
a physical PCI device, shouldn't the PCI ID be encompassed in the
signature? Thanks,
Alex
> - subchannel_type (device_api is vfio-ccw)
>
> vendor driver specific part: (optional)
> - aggregator
> - chpid_type
> - remote_url
>
> NOTE: vendors are free to add attributes in this part with a
> restriction that this attribute is able to be configured with the same
> name in sysfs too. e.g.
> for aggregator, there must be a sysfs attribute in device node
> /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator,
> so that the userspace tool is able to configure the target device
> according to source device's aggregator attribute.
>
>
> (2) where and structure
> proposal 1:
> |- [path to device]
> |--- migration
> | |--- self
> | | |-software_version
> | | |-device_api
> | | |-type
> | | |-[pci_id or subchannel_type]
> | | |-<aggregator or chpid_type>
> | |--- compatible
> | | |-software_version
> | | |-device_api
> | | |-type
> | | |-[pci_id or subchannel_type]
> | | |-<aggregator or chpid_type>
> multiple compatible is allowed.
> attributes should be ASCII text files, preferably with only one value
> per file.
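A minimal sketch of how a userspace tool might consume the proposal 1 layout, assuming each attribute is a small ASCII file holding one value. The function name read_migration_attrs and the directory passed to it are hypothetical; the migration/self and migration/compatible directories are only a proposal in this thread, not an existing kernel interface.

```python
from pathlib import Path


def read_migration_attrs(group_dir):
    """Return {attribute name: stripped text} for each regular file in group_dir.

    group_dir is assumed to be something like "[path to device]/migration/self"
    from proposal 1 (hypothetical; the layout is still under discussion).
    """
    attrs = {}
    for entry in sorted(Path(group_dir).iterdir()):
        # One value per file: skip subdirectories, strip the trailing newline.
        if entry.is_file():
            attrs[entry.name] = entry.read_text().strip()
    return attrs
```

Each attribute costs one open/read/close, which is the per-attribute syscall overhead raised earlier in the thread.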
>
>
> proposal 2: use bin_attribute.
> |- [path to device]
> |--- migration
> | |--- self
> | |--- compatible
>
> so we can continue to use the multiline format, e.g.
> cat compatible
> software_version=0.1.0
> device_api=vfio_pci
> type=i915-GVTg_V5_{val1:int:1,2,4,8}
> pci_id=80865963
> aggregator={val1}/2
>
> Thanks
> Yan
>
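For comparison, a minimal sketch of parsing the multiline key=value text from proposal 2, using the sample values quoted above. The function name parse_compat_text is hypothetical, and the sketch deliberately keeps template expressions such as {val1:int:1,2,4,8} as opaque strings, since their grammar has not been settled in this thread.

```python
def parse_compat_text(text):
    """Parse "key=value" lines into a dict, leaving values uninterpreted."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {line!r}")
        attrs[key.strip()] = value.strip()
    return attrs


# Sample taken from the "cat compatible" example in proposal 2.
sample = """\
software_version=0.1.0
device_api=vfio_pci
type=i915-GVTg_V5_{val1:int:1,2,4,8}
pci_id=80865963
aggregator={val1}/2
"""

attrs = parse_compat_text(sample)
```

Here a single read returns every attribute, at the cost of userspace having to agree on a text format.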