From: Alex Williamson <alex.williamson@redhat.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: "cjia@nvidia.com" <cjia@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"aik@ozlabs.ru" <aik@ozlabs.ru>,
	"Zhengxiao.zx@alibaba-inc.com" <Zhengxiao.zx@alibaba-inc.com>,
	"shuangtai.tst@alibaba-inc.com" <shuangtai.tst@alibaba-inc.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"eauger@redhat.com" <eauger@redhat.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"eskultet@redhat.com" <eskultet@redhat.com>,
	"Yang, Ziye" <ziye.yang@intel.com>,
	"mlevitsk@redhat.com" <mlevitsk@redhat.com>,
	"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>,
	"arei.gonglei@huawei.com" <arei.gonglei@huawei.com>,
	"felipe@nutanix.com" <felipe@nutanix.com>,
	"Ken.Xue@amd.com" <Ken.Xue@amd.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"dgilbert@redhat.com" <dgilbert@redhat.com>,
	"zhenyuw@linux.intel.com" <zhenyuw@linux.intel.com>,
	"intel-gvt-dev@lists.freedesktop.org" 
	<intel-gvt-dev@lists.freedesktop.org>,
	"Liu, Changpeng" <changpeng.liu@intel.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Wang, Zhi A" <zhi.a.wang@intel.com>,
	"jonathan.davies@nutanix.com" <jonathan.davies@nutanix.com>,
	"He, Shaopeng" <shaopeng.he@intel.com>
Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device
Date: Tue, 23 Apr 2019 09:02:56 -0600
Message-ID: <20190423090256.662d5907@x1.home>
In-Reply-To: <20190423054157.GA26190@joy-OptiPlex-7040>

On Tue, 23 Apr 2019 01:41:57 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:  
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > >     
> > > > > > The device version attribute in mdev sysfs is used by user space software
> > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > mdev devices. This attribute is mandatory if an mdev device supports live
> > > > > > migration.
> > > > 
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >      
> > > > > > It consists of two parts: a common part and a vendor proprietary part.
> > > > > > common part: 32 bits. The lower 16 bits are the vendor id and the higher
> > > > > >              16 bits identify the device type. E.g., for a pci device, it is
> > > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > 
> > > > What purpose does this serve?  If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type?  Therefore each type would
> > > > already have its own namespace.  Also that would make the trailing bit
> > > > of the version string listed below in the example redundant.  A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >    
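For reference, here is how the proposed common part works out
numerically. This is only a sketch in Python, not kernel code:
common_part() and common_part_str() are made-up helpers, and the flag
value is the one defined in include/uapi/linux/vfio.h. Reading the
first eight hex digits of the example string quoted further down
(00028086-193b-i915-GVTg_V5_4) as the common part is an assumption.

```python
# Value of VFIO_DEVICE_FLAGS_PCI in include/uapi/linux/vfio.h.
VFIO_DEVICE_FLAGS_PCI = 1 << 1


def common_part(vendor_id: int, device_flag: int) -> int:
    """Pack the 16-bit vendor id below the 16-bit device type field."""
    return ((device_flag & 0xFFFF) << 16) | (vendor_id & 0xFFFF)


def common_part_str(vendor_id: int, device_flag: int) -> str:
    """Render the 32-bit common part as eight lowercase hex digits."""
    return f"{common_part(vendor_id, device_flag):08x}"


if __name__ == "__main__":
    # Intel's PCI vendor id is 0x8086.
    print(common_part_str(0x8086, VFIO_DEVICE_FLAGS_PCI))  # 00028086
```

The result matches the leading field of the quoted example, which is
what suggests that field is meant to be the common part.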
> > > Hi Alex,
> > > This common part is a kind of namespace.
> > > If the version string is entirely defined by vendors, I'm worried that
> > > one vendor's version string could accidentally pass, and so interfere
> > > with, another vendor's version check.
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086 and device id is 0x1217, while
> > > vendor B's vendor id is 0x1217 and device id is 0x8086.
> > > 
> > > In this corner case, the two vendors may regard the two devices as
> > > migratable when actually they are not.
> > > 
> > > That's the reason for this common part: it serves as a kind of namespace
> > > that all vendors comply with to avoid overlap.
> > 
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >  
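To make the namespace argument concrete, the sketch below (Python,
illustrative only) reproduces the corner case from the quoted text.
Both vendors end up with the same id portion, but a comparison scoped
by mdev type never treats the two strings as a match. The type names
vendorA-type1 and vendorB-type1 and all three helpers are hypothetical,
and plain string equality stands in for the vendor driver's own check.

```python
def vendor_a_ids(vendor_id: int, device_id: int) -> str:
    """Vendor A's layout: vendor id first, then device id."""
    return f"{vendor_id:04x}-{device_id:04x}"


def vendor_b_ids(vendor_id: int, device_id: int) -> str:
    """Vendor B's layout: device id first, then vendor id."""
    return f"{device_id:04x}-{vendor_id:04x}"


def compatible(src: tuple, dst: tuple) -> bool:
    """Compare (mdev type, version) pairs; the type is the namespace."""
    src_type, src_version = src
    dst_type, dst_version = dst
    # String equality is a stand-in for the vendor driver's real check.
    return src_type == dst_type and src_version == dst_version


a = vendor_a_ids(0x8086, 0x1217)  # "8086-1217"
b = vendor_b_ids(0x1217, 0x8086)  # "8086-1217"

if __name__ == "__main__":
    print(a == b)  # True: the unscoped strings collide
    print(compatible(("vendorA-type1", a), ("vendorB-type1", b)))  # False
```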
> Hi Alex,
> do you mean that user space software like libvirt needs to first check
> whether the mdev type matches and then check whether the version matches?

Yes.
 
> If user space software only checks the version for migration, the vendor
> driver has to include the mdev type in its vendor proprietary string,
> right?

Userspace attempting to migrate an nvidia-64 to an i915-GVTg_V5_4 would
be a failure on the part of the user.
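
Put differently, the ordering userspace would follow looks roughly like
the sketch below. It is plain Python with hypothetical names
(can_attempt_migration, dst_accepts), not libvirt code; dst_accepts
stands in for whatever mechanism asks the target side whether it
accepts the source version string.

```python
from typing import Callable


def can_attempt_migration(src_type: str, dst_type: str, src_version: str,
                          dst_accepts: Callable[[str], bool]) -> bool:
    """Check the mdev type first, and only then the vendor version."""
    if src_type != dst_type:
        # Different mdev types: a user error, no version check needed.
        return False
    return dst_accepts(src_version)


if __name__ == "__main__":
    calls = []

    def record_and_accept(version: str) -> bool:
        calls.append(version)
        return True

    print(can_attempt_migration("nvidia-64", "i915-GVTg_V5_4",
                                "some-version", record_and_accept))  # False
    print(calls)  # []: the version check never ran
```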

> Another thing: could there be a future mdev parent driver that
> applies to all mdev devices, just like vfio-pci? For example, Yi's
> vfio-pci-mdev driver (https://lkml.org/lkml/2019/3/13/114)?

For starters, this is just a sample driver from which vendor specific
mdev drivers could be forked to support these features, but
additionally, the type is defined by the vendor driver, so even a meta
driver like vfio-pci-mdev could create types like
"vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
device.  The "vfio-pci-mdev-type1" is just a sample implementation to
say "the native type of the thing bound to me" and it's going to have
limited usefulness for any sort of persistence to userspace.  Thus,
it's a sample driver.  Thanks,

Alex

> > > > > > vendor proprietary part: this part varies in length. The vendor driver can
> > > > > >              specify any string to identify a device.
> > > > > > 
> > > > > > When this attribute is read, it should show the version string of the
> > > > > > device of type <type-id>. If a device does not support live migration, the
> > > > > > read should return an errno.
> > > > > > When a string is written to this attribute, the write returns an errno if
> > > > > > the devices are incompatible, or the written string length if they are
> > > > > > compatible. If a device does not support live migration, the write always
> > > > > > returns an errno.
> > > > > 
> > > > > For user space software to use:
> > > > > 1.
> > > > > > Before starting live migration, user space software first reads the
> > > > > > source side mdev device's version, e.g.
> > > > > "#cat \
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > 
> > > > > 2.
> > > > > > Then, user space software writes the version string returned by the source
> > > > > > side to the device version attribute on the target side, and checks the
> > > > > > return value. If a negative errno is returned on the target side, the mdev
> > > > > > devices on the source and target sides are not compatible.
> > > > > > If a positive number equal to the length of the written string is returned,
> > > > > > the two mdev devices on the source and target sides are compatible.
> > > > > e.g.
> > > > > (a) compatibility case
> > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > 
> > > > > (b) incompatibility case
> > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > -bash: echo: write error: Invalid argument
> > > > > 
> > > > > > 3. If the two mdev devices are compatible, user space software can start
> > > > > > live migration; otherwise it should not.
> > > > > > 
> > > > > > Note: if an mdev device does not support live migration, it either does
> > > > > > not provide a version attribute, or always returns an errno when its
> > > > > > version attribute is read or written.
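The three steps above could be driven from user space roughly as
follows. This is a sketch under assumptions: check_migration_compat()
is a hypothetical helper, its two arguments are the paths of the source
and target version attributes, and a missing attribute is treated the
same as an errno on read or write.

```python
import os


def check_migration_compat(src_attr: str, dst_attr: str) -> bool:
    """Return True if the target accepts the source's version string."""
    try:
        # Step 1: read the source side version string.
        with open(src_attr) as f:
            version = f.read().strip()
    except OSError:
        # Attribute missing or read failed: no live migration support.
        return False

    try:
        fd = os.open(dst_attr, os.O_WRONLY)
    except OSError:
        return False
    try:
        # Step 2: write it to the target; an errno surfaces as OSError.
        data = version.encode()
        return os.write(fd, data) == len(data)
    except OSError:
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Placeholder paths; substitute the real sysfs attribute paths.
    print(check_migration_compat("/path/to/source/version",
                                 "/path/to/target/version"))
```

Unlike echo, this does not append a newline to the string it writes,
and step 3 (starting the migration) is left to the caller.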
> > > > 
> > > > I think it would be cleaner to do the former, not supply the
> > > > attribute.  This seems to do the latter in the sample drivers.  Thanks,    
> > > Ok, you are right!
> > > What about keeping just one sample driver to show how to do the latter,
> > > and letting the others do the former?
> > 
> > I'd rather that if a vendor driver doesn't support features requiring
> > the version attribute that they don't implement it.  It's confusing to
> > developers looking at the sample driver for guidance if we have
> > different implementations.  Of course if you'd like to add migration
> > support to one of the sample drivers, that'd be very welcome.  Thanks,
> >  
> Got it:)
> 
> Thanks!
> Yan
> 
> >   
> > > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > > > 
> > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > ---
> > > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > > >  4 files changed, 85 insertions(+)
> > > > > 
> > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    |     |   |--- available_instances
> > > > >    |     |   |--- device_api
> > > > >    |     |   |--- description
> > > > > +  |     |   |--- version
> > > > >    |     |   |--- [devices]
> > > > >    |     |--- [<type-id>]
> > > > >    |     |   |--- create
> > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    |     |   |--- available_instances
> > > > >    |     |   |--- device_api
> > > > >    |     |   |--- description
> > > > > +  |     |   |--- version
> > > > >    |     |   |--- [devices]
> > > > >    |     |--- [<type-id>]
> > > > >    |          |--- create
> > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    |          |--- available_instances
> > > > >    |          |--- device_api
> > > > >    |          |--- description
> > > > > +  |          |--- version
> > > > >    |          |--- [devices]
> > > > >  
> > > > >  * [mdev_supported_types]
> > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > >    that should be provided by vendor driver.
> > > > >  
> > > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > > +
> > > > >  * [<type-id>]
> > > > >  
> > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > >    created.
> > > > >  
> > > > > +* version
> > > > > +
> > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > +  device is regarded as not supporting live migration.
> > > > > +
> > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > +               device type. e.g., for pci device, it is
> > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > +               specify any string to identify a device.
> > > > > +
> > > > > +  When reading this attribute, it should show device version string of the device
> > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > +  return errno.
> > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > +  support live migration, it always returns errno.
> > > > > +
> > > > > +  for example.
> > > > > +  # cat \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > +
> > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > + -bash: echo: write error: Invalid argument
> > > > > +
> > > > >  * [device]
> > > > >  
> > > > >    This directory contains links to the devices of type <type-id> that have been
> > > > > @@ -327,12 +361,14 @@ card.
> > > > >          |   |   |-- available_instances
> > > > >          |   |   |-- create
> > > > >          |   |   |-- device_api
> > > > > +        |   |   |-- version
> > > > >          |   |   |-- devices
> > > > >          |   |   `-- name
> > > > >          |   `-- mtty-2
> > > > >          |       |-- available_instances
> > > > >          |       |-- create
> > > > >          |       |-- device_api
> > > > > +        |       |-- version
> > > > >          |       |-- devices
> > > > >          |       `-- name
> > > > >          |-- mtty_dev
> > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >  }
> > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > >  
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > +		char *buf)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > +		const char *buf, size_t count)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > +
> > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > >  	&mdev_type_attr_name.attr,
> > > > >  	&mdev_type_attr_description.attr,
> > > > >  	&mdev_type_attr_device_api.attr,
> > > > >  	&mdev_type_attr_available_instances.attr,
> > > > > +	&mdev_type_attr_version.attr,
> > > > >  	NULL,
> > > > >  };
> > > > >  
> > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >  }
> > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > >  
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > +		char *buf)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > +		const char *buf, size_t count)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > +
> > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > >  	&mdev_type_attr_name.attr,
> > > > >  	&mdev_type_attr_description.attr,
> > > > >  	&mdev_type_attr_device_api.attr,
> > > > >  	&mdev_type_attr_available_instances.attr,
> > > > > +	&mdev_type_attr_version.attr,
> > > > >  	NULL,
> > > > >  };
> > > > >  
> > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >  
> > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > >  
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > +		char *buf)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > +		const char *buf, size_t count)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > >  	&mdev_type_attr_name.attr,
> > > > >  	&mdev_type_attr_device_api.attr,
> > > > >  	&mdev_type_attr_available_instances.attr,
> > > > > +	&mdev_type_attr_version.attr,
> > > > >  	NULL,
> > > > >  };
> > > > >      
> > > > 
> > > > _______________________________________________
> > > > intel-gvt-dev mailing list
> > > > intel-gvt-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev    

