From: Yan Zhao <yan.y.zhao@intel.com>
To: Jiri Pirko <jiri@mellanox.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	kvm@vger.kernel.org, libvir-list@redhat.com,
	qemu-devel@nongnu.org, kwankhede@nvidia.com, eauger@redhat.com,
	xin-ran.wang@intel.com, corbet@lwn.net,
	openstack-discuss@lists.openstack.org, shaohe.feng@intel.com,
	kevin.tian@intel.com, eskultet@redhat.com,
	jian-feng.ding@intel.com, dgilbert@redhat.com,
	zhenyuw@linux.intel.com, hejie.xu@intel.com,
	bao.yumeng@zte.com.cn, smooney@redhat.com,
	intel-gvt-dev@lists.freedesktop.org, berrange@redhat.com,
	dinechin@redhat.com, devel@ovirt.org,
	Parav Pandit <parav@mellanox.com>
Subject: Re: device compatibility interface for live migration with assigned devices
Date: Mon, 10 Aug 2020 15:46:31 +0800
Message-ID: <20200810074631.GA29059@joy-OptiPlex-7040>
In-Reply-To: <20200805105319.GF2177@nanopsycho>

On Wed, Aug 05, 2020 at 12:53:19PM +0200, Jiri Pirko wrote:
> Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao@intel.com wrote:
> >On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote:
> >> 
> >> On 2020/8/5 3:56 PM, Jiri Pirko wrote:
> >> > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang@redhat.com wrote:
> >> > > On 2020/8/5 10:16 AM, Yan Zhao wrote:
> >> > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote:
> >> > > > > On 2020/8/5 12:35 AM, Cornelia Huck wrote:
> >> > > > > > [sorry about not chiming in earlier]
> >> > > > > > 
> >> > > > > > On Wed, 29 Jul 2020 16:05:03 +0800
> >> > > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >> > > > > > 
> >> > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote:
> >> > > > > > (...)
> >> > > > > > 
> >> > > > > > > > Based on the feedback we've received, the previously proposed interface
> >> > > > > > > > is not viable.  I think there's agreement that the user needs to be
> >> > > > > > > > able to parse and interpret the version information.  Using json seems
> >> > > > > > > > viable, but I don't know if it's the best option.  Is there any
> >> > > > > > > > precedent of markup strings returned via sysfs we could follow?
> >> > > > > > I don't think encoding complex information in a sysfs file is a viable
> >> > > > > > approach. Quoting Documentation/filesystems/sysfs.rst:
> >> > > > > > 
> >> > > > > > "Attributes should be ASCII text files, preferably with only one value
> >> > > > > > per file. It is noted that it may not be efficient to contain only one
> >> > > > > > value per file, so it is socially acceptable to express an array of
> >> > > > > > values of the same type.
> >> > > > > > Mixing types, expressing multiple lines of data, and doing fancy
> >> > > > > > formatting of data is heavily frowned upon."
> >> > > > > > 
> >> > > > > > Even though this is an older file, I think these restrictions still
> >> > > > > > apply.
> >> > > > > +1, that's another reason why devlink(netlink) is better.
> >> > > > > 
> >> > > > Hi Jason,
> >> > > > do you have any materials or sample code for devlink, so that we can
> >> > > > study it properly?
> >> > > > I found some kernel docs about it, but my preliminary study didn't show
> >> > > > me the advantage of devlink.
> >> > > 
> >> > > CC Jiri and Parav for a better answer for this.
> >> > > 
> >> > > My understanding is that the following advantages are obvious (as I replied
> >> > > in another thread):
> >> > > 
> >> > > - existing users (NIC, crypto, SCSI, ib), mature and stable
> >> > > - much better error reporting (ext_ack rather than just a string or errno)
> >> > > - namespace aware
> >> > > - not coupled to kobject
> >> > Jason, what is your use case?
> >> 
> >> 
> >> I think the use case is to report device compatibility for live migration.
> >> Yan first proposed a simple sysfs-based migration version, but that does not
> >> look sufficient, and something based on JSON is now being discussed.
> >> 
> >> Yan, can you help to summarize the discussion so far for Jiri as a
> >> reference?
> >> 
> >Yes.
> >We are currently defining a device live migration compatibility
> >interface so that user space (e.g. openstack and libvirt) can know
> >which two devices are live migration compatible.
> >Currently the devices include mdev (a kernel-emulated virtual device)
> >and physical devices (e.g. a VF of a PCI SRIOV device).
> >
> >The attributes we want user space to compare include
> >common attributes:
> >    device_api: vfio-pci, vfio-ccw...
> >    mdev_type: mdev type of an mdev, or a similar signature for a physical
> >               device. It specifies a device's hardware capability, e.g.
> >               i915-GVTg_V5_4 means it is 1/4 of a gen9 Intel graphics
> >               device.
> >    software_version: the device driver's version,
> >               in <major>.<minor>[.bugfix] scheme, where there is no
> >               compatibility across major versions, minor versions have
> >               forward compatibility (e.g. 1 -> 2 is OK, 2 -> 1 is not), and
> >               the bugfix version number indicates some degree of internal
> >               improvement that is not visible to the user in terms of
> >               features or compatibility.
> >
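For illustration only, the software_version rule quoted above could be
checked roughly as follows. This is a minimal Python sketch, not part of the
proposal: the function names are hypothetical, and it assumes that "forward
compatibility" means a source may migrate to a target with the same major
version and an equal or higher minor version, with the bugfix number ignored.

```python
def parse_version(text):
    """Split '<major>.<minor>[.bugfix]' into three integers (bugfix defaults to 0)."""
    parts = text.split(".")
    if len(parts) not in (2, 3):
        raise ValueError("expected <major>.<minor>[.bugfix], got %r" % text)
    major, minor = int(parts[0]), int(parts[1])
    bugfix = int(parts[2]) if len(parts) == 3 else 0
    return major, minor, bugfix


def version_compatible(source, target):
    """Return True if a device at `source` version may migrate to `target`.

    Assumption: majors must match, the target minor must not be lower than
    the source minor, and the bugfix number never affects compatibility.
    """
    s_major, s_minor, _ = parse_version(source)
    t_major, t_minor, _ = parse_version(target)
    return s_major == t_major and s_minor <= t_minor
```

With this reading, version_compatible("1.1.0", "1.2.5") holds, while the
reverse direction and any change of major version do not.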
> >vendor specific attributes: each vendor may define different attributes
> >   device id : device id of a physical device or of an mdev's parent PCI
> >               device. It could be equal to the PCI id for PCI devices.
> >   aggregator: used together with mdev_type. e.g. aggregator=2 together
> >               with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel
> >               graphics device.
> >   remote_url: a local NVMe VF may be configured with the URL of remote
> >               storage, with all data stored on the remote side
> >               specified by that URL.
> >   ...
> >
> >Comparing those attributes in user space alone is not easy, as user space
> >can't simply assume an equality relationship between source attributes and
> >target attributes. e.g.
> >a source device of mdev_type=i915-GVTg_V5_4,aggregator=2 (1/2 of
> >gen9) could actually find a compatible device of
> >mdev_type=i915-GVTg_V5_8,aggregator=4 (also 1/2 of gen9)
> >if mdev_type i915-GVTg_V5_4 is not available on the target machine.
> >
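The arithmetic behind that example can be sketched in a few lines of Python.
This is only an illustration: gvtg_share is a hypothetical helper, and it
assumes that the trailing "_<n>" in the mdev type name is the divisor, which
is how the i915-GVTg_V5_4 example above reads.

```python
from fractions import Fraction


def gvtg_share(mdev_type, aggregator):
    """Fraction of the parent GPU covered by `aggregator` instances of `mdev_type`.

    Assumption: the type name ends in "_<n>", meaning 1/n of the device.
    """
    _, _, divisor = mdev_type.rpartition("_")
    return Fraction(aggregator, int(divisor))


# Both configurations from the example cover the same half of the device.
source = gvtg_share("i915-GVTg_V5_4", 2)
target = gvtg_share("i915-GVTg_V5_8", 4)
same_share = source == target
```

Comparing the strings mdev_type and aggregator for equality would reject
this pair even though both describe 1/2 of the device.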
> >So, in our current proposal, we want to create two sysfs attributes
> >under a device sysfs node.
> >/sys/<path to device>/migration/self
> >/sys/<path to device>/migration/compatible
> >
> >#cat /sys/<path to device>/migration/self
> >device_type=vfio_pci
> >mdev_type=i915-GVTg_V5_4
> >device_id=8086591d
> >aggregator=2
> >software_version=1.0.0
> >
> >#cat /sys/<path to device>/migration/compatible
> >device_type=vfio_pci
> >mdev_type=i915-GVTg_V5_{val1:int:2,4,8}
> >device_id=8086591d
> >aggregator={val1}/2
> >software_version=1.0.0
> >
> >/sys/<path to device>/migration/self specifies the device's own
> >attributes.
> >/sys/<path to device>/migration/compatible specifies the list of
> >devices compatible with the device. As in the example, compatible devices
> >could have
> >	device_type == vfio_pci &&
> >	device_id == 8086591d   &&
> >	software_version == 1.0.0 &&
> >	(
> >	(mdev_type of i915-GVTg_V5_2 && aggregator==1) ||
> >	(mdev_type of i915-GVTg_V5_4 && aggregator==2) ||
> >	(mdev_type of i915-GVTg_V5_8 && aggregator==4)
> >	)
> >
> >By checking whether a target device is in the compatible list of the source
> >device, user space can know whether the two devices are live migration
> >compatible.
> >
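To make the matching concrete, here is a rough Python sketch of how user
space might expand the "compatible" template and test a target's "self"
attributes against it. It is not a defined parser for this format: the
function names are hypothetical, only a single {name:int:a,b,c} variable is
handled, and "{name}/N" is assumed to mean integer division.

```python
import re

DECL = re.compile(r"\{(\w+):int:([0-9,]+)\}")  # e.g. {val1:int:2,4,8}
DIVISION = re.compile(r"^(\d+)/(\d+)$")        # e.g. 4/2 after substitution

COMPATIBLE = """device_type=vfio_pci
mdev_type=i915-GVTg_V5_{val1:int:2,4,8}
device_id=8086591d
aggregator={val1}/2
software_version=1.0.0"""

SELF = """device_type=vfio_pci
mdev_type=i915-GVTg_V5_4
device_id=8086591d
aggregator=2
software_version=1.0.0"""


def parse_attrs(text):
    """Turn 'key=value' lines into a dict."""
    return dict(line.strip().split("=", 1)
                for line in text.splitlines() if line.strip())


def expand(compatible):
    """Yield one concrete attribute dict per value of the declared variable."""
    decls = {}
    for value in compatible.values():
        for name, choices in DECL.findall(value):
            decls[name] = [int(c) for c in choices.split(",")]
    if not decls:
        yield dict(compatible)
        return
    (name, choices), = decls.items()  # sketch: exactly one variable supported
    for choice in choices:
        concrete = {}
        for key, value in compatible.items():
            value = DECL.sub(str(choice), value)
            value = value.replace("{%s}" % name, str(choice))
            match = DIVISION.match(value)
            if match:  # assumed semantics: integer division
                value = str(int(match.group(1)) // int(match.group(2)))
            concrete[key] = value
        yield concrete


def is_compatible(compatible_text, target_self_text):
    """True if the target's self attributes equal one expanded alternative."""
    target = parse_attrs(target_self_text)
    return any(alt == target for alt in expand(parse_attrs(compatible_text)))
```

Here expand() produces the three alternatives listed above (V5_2 with
aggregator 1, V5_4 with aggregator 2, V5_8 with aggregator 4), and
is_compatible(COMPATIBLE, SELF) is true. The software_version comparison is
plain string equality in this sketch.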
> >Additional notes:
> >1) software_version in the compatible list may not be necessary, as it
> >already has a major.minor.bugfix scheme.
> >2) a vendor attribute like remote_url may not be statically
> >assigned and could be changed through a device interface.
> >
> >So, as Cornelia pointed out that it's not good to use a complex format in
> >a sysfs attribute, we'd like to know whether there are other good ways to
> >handle our use case, e.g. splitting a single attribute into multiple simple
> >sysfs attributes as Cornelia suggested, or devlink, which Jason has strongly
> >recommended.
> 
> Hi Yan.
> 
Hi Jiri,
> Thanks for the explanation, I'm still fuzzy about the details.
> Anyway, I suggest you check the "devlink dev info" command we have
> implemented for multiple drivers. You can try netdevsim to test this.
> I think that the info you need to expose might be put there.
Do you mean drivers/net/netdevsim/?
> 
> Devlink creates instance per-device. Specific device driver calls into
> devlink core to create the instance.  What device do you have? What
Is the devlink core net/core/devlink.c?

> driver is it handled by?

It looks like devlink is specific to network devices: devlink.h
describes itself as
"include/uapi/linux/devlink.h - Network physical device Netlink
interface". So I feel it's not very appropriate for a GPU driver to use
this interface. Is that right?

Thanks
Yan
 
