From: Alex Williamson <alex.williamson@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "John G Johnson" <john.g.johnson@oracle.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	mtsirkin@redhat.com, "Daniel P. Berrangé" <berrange@redhat.com>,
	quintela@redhat.com, "Jason Wang" <jasowang@redhat.com>,
	"Zeng,  Xin" <xin.zeng@intel.com>,
	qemu-devel@nongnu.org,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Yan Zhao" <yan.y.zhao@intel.com>,
	"Kirti Wankhede" <kwankhede@nvidia.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Felipe Franciosi" <felipe@nutanix.com>,
	"Christophe de Dinechin" <dinechin@redhat.com>,
	"Thanos Makatos" <thanos.makatos@nutanix.com>
Subject: Re: [RFC v3] VFIO Migration
Date: Tue, 10 Nov 2020 13:14:04 -0700
Message-ID: <20201110131404.2c0f0d9d@w520.home>
In-Reply-To: <20201110095349.GA1082456@stefanha-x1.localdomain>

On Tue, 10 Nov 2020 09:53:49 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:
> VFIO mdev Drivers
> -----------------
> The following mdev type sysfs attrs are available for managing device
> instances::
> 
>   /sys/.../<parent-device>/mdev_supported_types/<type-id>/
>     create - writing a UUID to this file instantiates a device
>     migration_info.json - read-only migration information JSON
> 
> TODO The JSON can be represented as a file system hierarchy but sysfs seems
> limited to <kobject>/<group>/<attr> and <kobject>/<attr> so it is not possible
> to express deeper attr groups like <kobject>/migration/params/<param>/<attr>?


Complex structured formats have been proposed in other threads related
to migration compatibility and have generally been dismissed as not
adhering to the standards of sysfs, per:

Documentation/filesystems/sysfs.rst:
---
Attributes
~~~~~~~~~~

Attributes can be exported for kobjects in the form of regular files in
the filesystem. Sysfs forwards file I/O operations to methods defined
for the attributes, providing a means to read and write kernel
attributes.

Attributes should be ASCII text files, preferably with only one value
per file. It is noted that it may not be efficient to contain only one
value per file, so it is socially acceptable to express an array of
values of the same type.

Mixing types, expressing multiple lines of data, and doing fancy
formatting of data is heavily frowned upon. Doing these things may get
you publicly humiliated and your code rewritten without notice.
---

We'd either need to address your TODO and create a hierarchical
representation or find another means to exchange this format.
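
As a rough illustration of what a hierarchical representation might look
like, the Python sketch below flattens a nested description into
one-value-per-file paths.  The dict layout, the "migration" prefix and the
space-separated encoding of arrays are assumptions made for this example;
only the key names init_value, off_value and allowed_values come from the
proposal.

```python
def flatten(tree, prefix=""):
    """Yield (relative_path, text) pairs, one value or same-type array per path."""
    for key, value in tree.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: becomes a deeper directory level.
            yield from flatten(value, path)
        elif isinstance(value, list):
            # Array of same-type values: one line, space separated (assumed encoding).
            yield path, " ".join(str(v) for v in value)
        else:
            yield path, str(value)


# Hypothetical content of migration_info.json for a single mdev type.
info = {
    "model": "com.gfx/GPU",
    "params": {
        "memory": {
            "init_value": 4096,
            "off_value": 4096,
            "allowed_values": [4096, 8192],
        },
    },
}

attrs = dict(flatten(info, "migration"))
for path, text in attrs.items():
    print(f"{path}: {text}")
```

With this input the deepest path is migration/params/memory/allowed_values,
which is deeper than the <kobject>/<group>/<attr> layout the TODO mentions.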


> Device models supported by an mdev driver and their details can be read from
> the migration_info.json attr. Each mdev type supports one device model. If a
> parent device supports multiple device models then each device model has an
> mdev type. There may be multiple mdev types for a single device model when they
> offer different migration parameters such as resource capacity or feature
> availability.
> 
> For example, a graphics card that supports 4 GB and 8 GB device instances would
> provide gfx-4GB and gfx-8GB mdev types with memory=4096 and memory=8192
> migration parameters, respectively.


I think this example could be expanded for clarity.  I think this is
suggesting we have mdev_types of gfx-4GB and gfx-8GB, which each
implement some common device model, i.e. com.gfx/GPU, where the
migration parameter 'memory' for each defaults to a value matching the
type name.  But it seems like this can also lead to some combinatorial
challenges for management tools if these parameters are writable.  For
example, should a management tool create a gfx-4GB device and change the
memory parameter to 8192, or create a gfx-8GB device with the default
parameter?
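
To make the ambiguity concrete, here is a small Python sketch.  It assumes
that both types accept both memory values, which the proposal does not
state; the table and the helper name are hypothetical.

```python
# Hypothetical view of the two mdev types, assuming 'memory' is writable
# and both types allow both values.
mdev_types = {
    "gfx-4GB": {"default": 4096, "allowed": [4096, 8192]},
    "gfx-8GB": {"default": 8192, "allowed": [4096, 8192]},
}


def ways_to_reach(target):
    """List (type name, needs_write) pairs that can end up with memory == target."""
    ways = []
    for name, mdev_type in mdev_types.items():
        if target in mdev_type["allowed"]:
            # A write is needed only when the default differs from the target.
            ways.append((name, mdev_type["default"] != target))
    return ways


options = ways_to_reach(8192)
print(options)
```

Under that assumption there are two ways to get an 8192 device, one with a
parameter write and one without, and the management tool has no obvious
basis for choosing between them.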


> The following mdev device sysfs attrs relate to a specific device instance::
> 
>   /sys/.../<parent-device>/<uuid>/
>     mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model


We need a mechanism that translates to non-mdev vfio devices as well.
The device "model" creates a clean separation from an mdev-type, so we
shouldn't reintroduce that dependency here.


>     migration/ - migration related files
>       <param> - read/write migration parameter "param"
>       ...
> 
> When the device is created all migration/<param> attrs take their
> migration_info.json "init_value".
> 
> When preparing for migration on the source, each migration parameter from
> migration/<param> is read and added to the migration parameter list if its
> value differs from "off_value" in migration_info.json. If a migration parameter
> in the list is not available on the destination, then migration is not
> possible. If a migration parameter value is not in the destination
> "allowed_values" migration_info.json then migration is not possible.
> 
> In order to prepare an mdev device instance for an incoming migration on the
> destination, the "off_value" from migration_info.json is written to each
> migration parameter in migration/<param>. Then the migration parameter list
> from the source is written to migration/<param> one migration parameter at a
> time. If an error occurs while writing a migration parameter on the destination
> then migration is not possible. Once the migration parameter list has been
> written the mdev can be opened and migration can proceed.


What's the logic behind setting the value twice?  If we have a
preconfigured pool of devices where the off_value might use fewer
resources, we risk that resources might be consumed elsewhere if we
release them and try to get them back.  It also seems rather
inefficient.
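
For reference, the quoted procedure can be sketched in Python as follows.
The parameter table, the "feature_x" parameter and the single-pass variant
are assumptions made for illustration only, not something the RFC proposes.

```python
def build_param_list(values, info):
    """Source side: keep parameters whose current value differs from off_value."""
    return {p: v for p, v in values.items() if v != info[p]["off_value"]}


def check_destination(param_list, dest_info):
    """True if every listed parameter exists on the destination with an allowed value."""
    return all(
        p in dest_info and v in dest_info[p]["allowed_values"]
        for p, v in param_list.items()
    )


def two_pass_writes(param_list, dest_info):
    """Writes as described in the RFC: reset everything, then apply the list."""
    writes = [(p, meta["off_value"]) for p, meta in dest_info.items()]
    writes += list(param_list.items())
    return writes


def single_pass_writes(param_list, dest_info):
    """Hypothetical alternative: write each parameter once with its final value."""
    return [(p, param_list.get(p, meta["off_value"])) for p, meta in dest_info.items()]


# Hypothetical migration info, identical on source and destination.
info = {
    "memory": {"off_value": 4096, "allowed_values": [4096, 8192]},
    "feature_x": {"off_value": "off", "allowed_values": ["off", "on"]},
}
source_values = {"memory": 8192, "feature_x": "off"}

param_list = build_param_list(source_values, info)
compatible = check_destination(param_list, info)
twice = two_pass_writes(param_list, info)
once = single_pass_writes(param_list, info)
print(param_list, compatible, twice, once)
```

In the two-pass sequence "memory" is first written back to 4096 and then
raised to 8192 again, which is the window where a preconfigured device could
lose its resources.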

 
> An open mdev device typically does not allow migration parameters to be changed
> at runtime. However, certain migration/params attrs may allow writes at
> runtime. Usually these migration parameters only affect the device state
> representation and not the hardware interface. This makes it possible to
> upgrade or downgrade the device state representation at runtime so that
> migration is possible to newer or older device implementations.


Which raises the question of how we'd determine which parameters can be
modified at runtime...  Thanks,

Alex


