From: Alex Xu <soulxu@gmail.com>
Date: Wed, 15 Jul 2020 15:23:42 +0800
Subject: Re: device compatibility interface for live migration with assigned devices
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org, libvir-list@redhat.com, qemu-devel@nongnu.org,
 kwankhede@nvidia.com, eauger@redhat.com, "Wang, Xin-ran", corbet@lwn.net,
 openstack-discuss, shaohe.feng@intel.com, kevin.tian@intel.com, Yan Zhao,
 eskultet@redhat.com, "Ding, Jian-feng", dgilbert@redhat.com,
 zhenyuw@linux.intel.com, "Xu, Hejie", bao.yumeng@zte.com.cn, Sean Mooney,
 intel-gvt-dev@lists.freedesktop.org, Daniel P. Berrangé, cohuck@redhat.com,
 dinechin@redhat.com, devel@ovirt.org
In-Reply-To: <20200714101616.5d3a9e75@x1.home>


On Wed, 15 Jul 2020 at 12:16 AM, Alex Williamson <alex.williamson@redhat.com> wrote:
> On Tue, 14 Jul 2020 11:21:29 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > > hi folks,
> > > we are defining a device migration compatibility interface that helps upper
> > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > live migration compatible.
> > > The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> > > e.g. we could use it to check whether
> > > - a src MDEV can migrate to a target MDEV,
> > > - a src VF in SRIOV can migrate to a target VF in SRIOV,
> > > - a src MDEV can migration to a target VF in SRIOV.
> > >   (e.g. SIOV/SRIOV backward compatibility case)
> > >
> > > The upper layer stack could use this interface as the last step to check
> > > if one device is able to migrate to another device before triggering a real
> > > live migration procedure.
> > > we are not sure if this interface is of value or help to you. please don't
> > > hesitate to drop your valuable comments.
> > >
> > >
> > > (1) interface definition
> > > The interface is defined in below way:
> > >
> > >              __    userspace
> > >               /\              \
> > >              /                 \write
> > >             / read              \
> > >    ________/__________       ___\|/_____________
> > >   | migration_version |     | migration_version |-->check migration
> > >   ---------------------     ---------------------   compatibility
> > >      device A                    device B
> > >
> > >
> > > a device attribute named migration_version is defined under each device's
> > > sysfs node. e.g. (/sys/bus/pci/devices/0000\:00\:02.0/$mdev_UUID/migration_version).
> > > userspace tools read the migration_version as a string from the source device,
> > > and write it to the migration_version sysfs attribute in the target device.
> > >
> > > The userspace should treat ANY of below conditions as two devices not compatible:
> > > - any one of the two devices does not have a migration_version attribute
> > > - error when reading from migration_version attribute of one device
> > > - error when writing migration_version string of one device to
> > >   migration_version attribute of the other device
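
To make the three failure conditions above concrete, here is a minimal Python sketch of the userspace side. Only the attribute name migration_version comes from the proposal; the function name and the idea of passing each device's sysfs directory as an argument are assumptions made for illustration.

```python
import os


def is_migration_compatible(src_dev_dir, dst_dev_dir):
    """Return True only if the target accepts the source's version string.

    Hypothetical helper: both arguments are the sysfs directories of the
    two devices. The string is treated as opaque bytes and never parsed.
    """
    src_attr = os.path.join(src_dev_dir, "migration_version")
    dst_attr = os.path.join(dst_dev_dir, "migration_version")
    try:
        # A missing attribute or a failed read raises OSError.
        with open(src_attr, "rb") as f:
            version = f.read()
        # Open the existing attribute only; never create it.
        fd = os.open(dst_attr, os.O_WRONLY)
        try:
            # A driver that rejects the string would fail this write.
            os.write(fd, version)
        finally:
            os.close(fd)
    except OSError:
        return False
    return True
```

All three conditions collapse into a single OSError path, so the caller never has to look inside the string.
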
> > >
> > > The string read from migration_version attribute is defined by device vendor
> > > driver and is completely opaque to the userspace.
> > > for a Intel vGPU, string format can be defined like
> > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + "aggregator count".
> > >
> > > for an NVMe VF connecting to a remote storage. it could be
> > > "PCI ID" + "driver version" + "configured remote storage URL"

If the "configured remote storage URL" is a setting that is configured before the device is used, then we don't need it for the migration compatibility check. OpenStack only needs to know that the target device's driver and hardware are compatible for migration. The scheduler then chooses a host with such a device, and OpenStack pre-configures the target host and target device before the migration, which includes setting the correct remote storage URL on the device. If we want, we can do a sanity check with the OS after the live migration.

> > >
> > > for a QAT VF, it may be
> > > "PCI ID" + "driver version" + "supported encryption set".
> > >
> > > (to avoid namespace confliction from each vendor, we may prefix a driver name to
> > > each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)
>
> It's very strange to define it as opaque and then proceed to describe
> the contents of that opaque string.  The point is that its contents
> are defined by the vendor driver to describe the device, driver version,
> and possibly metadata about the configuration of the device.  One
> instance of a device might generate a different string from another.
> The string that a device produces is not necessarily the only string
> the vendor driver will accept, for example the driver might support
> backwards compatible migrations.
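
The asymmetry described here (a device reports one string but may accept several) can be shown with a small Python sketch. Everything in it is hypothetical: the class, the "vendor-v1"/"vendor-v2" strings and the set-membership test only stand in for whatever a real vendor driver would do in its write handler.

```python
import errno


class FakeVendorDevice:
    """Stand-in for one device's migration_version attribute (hypothetical)."""

    def __init__(self, own_version, also_accepts=()):
        self.own_version = own_version
        # Strings from older drivers that this driver can still take in.
        self.accepted = {own_version, *also_accepts}

    def read(self):
        # What the device reports about itself.
        return self.own_version

    def write(self, incoming):
        # What the device is willing to receive; rejected strings raise.
        if incoming not in self.accepted:
            raise OSError(errno.EINVAL, "incompatible migration_version")


def can_migrate(src, dst):
    try:
        dst.write(src.read())
    except OSError:
        return False
    return True


old = FakeVendorDevice("vendor-v1")
new = FakeVendorDevice("vendor-v2", also_accepts=["vendor-v1"])
```

With these two objects, can_migrate(old, new) succeeds while can_migrate(new, old) does not, which is why a plain string comparison in userspace is not enough.
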

> > > (2) backgrounds
> > >
> > > The reason we hope the migration_version string is opaque to the userspace
> > > is that it is hard to generalize standard comparing fields and comparing
> > > methods for different devices from different vendors.
> > > Though userspace now could still do a simple string compare to check if
> > > two devices are compatible, and result should also be right, it's still
> > > too limited as it excludes the possible candidate whose migration_version
> > > string fails to be equal.
> > > e.g. an MDEV with mdev_type_1, aggregator count 3 is probably compatible
> > > with another MDEV with mdev_type_3, aggregator count 1, even their
> > > migration_version strings are not equal.
> > > (assumed mdev_type_3 is of 3 times equal resources of mdev_type_1).
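
The arithmetic behind this example, as a short Python sketch. The RESOURCE_UNITS table and both function names are invented for illustration; the only fact taken from the thread is the assumption that mdev_type_3 holds three times the resources of mdev_type_1.

```python
# Hypothetical resource units per mdev type, following the assumption
# quoted above that mdev_type_3 is three times mdev_type_1.
RESOURCE_UNITS = {"mdev_type_1": 1, "mdev_type_3": 3}


def total_resources(mdev_type, aggregator_count):
    """Resources of one device: per-type units times aggregator count."""
    return RESOURCE_UNITS[mdev_type] * aggregator_count


def same_resources(src, dst):
    """Compare two (mdev_type, aggregator_count) pairs by total resources."""
    return total_resources(*src) == total_resources(*dst)


src = ("mdev_type_1", 3)
dst = ("mdev_type_3", 1)
# The pairs differ, so strings built from them would differ too,
# yet both describe the same amount of resources.
print(src != dst, same_resources(src, dst))
```

A comparison like this needs vendor knowledge of the type table, which is the argument for leaving the decision to the vendor driver.
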
> > >
> > > besides that, driver version + configured resources are all elements demanding
> > > to take into account.
> > >
> > > So, we hope leaving the freedom to vendor driver and let it make the final decision
> > > in a simple reading from source side and writing for test in the target side way.
> > >
> > >
> > > we then think the device compatibility issues for live migration with assigned
> > > devices can be divided into two steps:
> > > a. management tools filter out possible migration target devices.
> > >    Tags could be created according to info from product specification.
> > >    we think openstack/ovirt may have vendor proprietary components to create
> > >    those customized tags for each product from each vendor.
> > >
> > >    for Intel vGPU, with a vGPU(a MDEV device) in source side, the tags to
> > >    search target vGPU are like:
> > >    a tag for compatible parent PCI IDs,
> > >    a tag for a range of gvt driver versions,
> > >    a tag for a range of mdev type + aggregator count
> > >
> > >    for NVMe VF, the tags to search target VF may be like:
> > >    a tag for compatible PCI IDs,
> > >    a tag for a range of driver versions,
> > >    a tag for URL of configured remote storage.
>
> I interpret this as hand waving, ie. the first step is for management
> tools to make a good guess :-\  We don't seem to be willing to say that
> a given mdev type can only migrate to a device with that same type.
> There's this aggregation discussion happening separately where a base
> mdev type might be created or later configured to be equivalent to a
> different type.  The vfio migration API we've defined is also not
> limited to mdev devices, for example we could create vendor specific
> quirks or hooks to provide migration support for a physical PF/VF
> device.  Within the realm of possibility then is that we could migrate
> between a physical device and an mdev device, which are simply
> different degrees of creating a virtualization layer in front of the
> device.

> > Requiring management application developers to figure out this possible
> > compatibility based on prod specs is really unrealistic. Product specs
> > are typically as clear as mud, and with the suggestion we consider
> > different rules for different types of devices, add up to a huge amount
> > of complexity. This isn't something app developers should have to spend
> > their time figuring out.
>
> Agreed.
>
> > The suggestion that we make use of vendor proprietary helper components
> > is totally unacceptable. We need to be able to build a solution that
> > works with exclusively an open source software stack.
>
> I'm surprised to see this as well, but I'm not sure if Yan was really
> suggesting proprietary software so much as just vendor specific
> knowledge.
>
> > IMHO there needs to be a mechanism for the kernel to report via sysfs
> > what versions are supported on a given device. This puts the job of
> > reporting compatible versions directly under the responsibility of the
> > vendor who writes the kernel driver for it. They are the ones with the
> > best knowledge of the hardware they've built and the rules around its
> > compatibility.
>
> The version string discussed previously is the version string that
> represents a given device, possibly including driver information,
> configuration, etc.  I think what you're asking for here is an
> enumeration of every possible version string that a given device could
> accept as an incoming migration stream.  If we consider the string as
> opaque, that means the vendor driver needs to generate a separate
> string for every possible version it could accept, for every possible
> configuration option.  That potentially becomes an excessive amount of
> data to either generate or manage.

As for configuration options, there are two kinds that are not needed for the migration check:

* Options that make the device a different device. For example (possibly a bad example that doesn't match any real hardware), if a GPU supports both 1024x768 and 800x600 vGPUs, OpenStack separates these two kinds of vGPU into two separate resource pools, so the scheduler already ensures that we get a host that supports such a vGPU. Such an option doesn't need to be encoded into the 'version string' discussed here.
* Options that are set before the device is used, like the 'configured remote storage URL' above. They don't need to be encoded into the 'version string' either, since OpenStack will configure the correct value before the migration.
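
Roughly, the flow I have in mind looks like the Python sketch below. This is not OpenStack code: the Device class, its fields, pick_target and the accepts callback (which stands in for the sysfs write to the target's migration_version) are all hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    host: str
    resource_pool: str      # device flavour; already matched by the scheduler
    version: str            # driver + hardware identity only
    storage_url: str = ""   # set by the management layer before migration


def pick_target(src, candidates, accepts):
    """Return the first candidate that can take over from src, or None.

    accepts(dev, version) stands in for writing the version string to
    the target device and reporting whether the write succeeded.
    """
    for dev in candidates:
        if dev.resource_pool != src.resource_pool:
            continue        # first kind: handled by resource pools
        if not accepts(dev, src.version):
            continue        # the actual compatibility check
        dev.storage_url = src.storage_url   # second kind: pre-configured
        return dev
    return None
```

Neither the resource pool nor the storage URL has to appear in the version string here: the first is filtered before the check, and the second is written after it.
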


> Am I overestimating how vendors intend to use the version string?
>
> We'd also need to consider devices that we could create, for instance
> providing the same interface enumeration prior to creating an mdev
> device to have a confidence level that the new device would be a valid
> target.
>
> We defined the string as opaque to allow vendor flexibility and because
> defining a common format is hard.  Do we need to revisit this part of
> the discussion to define the version string as non-opaque with parsing
> rules, probably with separate incoming vs outgoing interfaces?  Thanks,
>
> Alex
