From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: John G Johnson <john.g.johnson@oracle.com>,
	mtsirkin@redhat.com, quintela@redhat.com,
	Jason Wang <jasowang@redhat.com>,
	Felipe Franciosi <felipe@nutanix.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	qemu-devel@nongnu.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Thanos Makatos <thanos.makatos@nutanix.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: VFIO Migration
Date: Tue, 3 Nov 2020 15:05:08 +0000
Message-ID: <20201103150508.GB253848@stefanha-x1.localdomain>
In-Reply-To: <20201103113929.GH205187@redhat.com>

On Tue, Nov 03, 2020 at 11:39:29AM +0000, Daniel P. Berrangé wrote:
> On Mon, Nov 02, 2020 at 11:11:53AM +0000, Stefan Hajnoczi wrote:
> > Overview
> > --------
> > The purpose of device states is to save the device at a point in time and then
> > restore the device back to the saved state later. This is more challenging than
> > it first appears.
> > 
> > The process of saving a device state and loading it later is called
> > *migration*. The state may be loaded by the same device that saved it or by a
> > new instance of the device, possibly running on a different computer.
> > 
> > It must be possible to migrate to a newer implementation of the device
> > as well as to an older implementation of the device. This allows users
> > to upgrade and roll back their systems.
> > 
> > Migration can fail if loading the device state is not possible. It should fail
> > early with a clear error message. It must not appear to complete but leave the
> > device inoperable due to a migration problem.
> 
> I think there needs to be an additional requirement.
> 
>  It must be possible for a management application to query the supported
>  versions, independently of execution of a migration operation.
> 
> This is important to large scale data center / cloud management applications
> because before initiating a migration they need to *automatically* select
> a target host with a high level of confidence that it will be compatible with
> the source host.
> 
> Today QEMU migration compatibility is largely determined by the machine
> type version. Apps can query the supported machine types for a host to
> check whether it is compatible. Similarly they will query CPU model
> features to check compatibility.
> 
> Validation and error checking at time of migration is of course still
> required, but the goal should be that an mgmt application will *NEVER*
> hit these errors because they will have pre-selected a host that is
> known to be compatible based on reported versions that are supported.

Okay. What do you think of the following?

  [
    {
      "model": "https://qemu.org/devices/e1000e",
      "params": [
        "rss",
        ...more configuration parameters...
      ],
      "versions": [
        {
          "name": "1",
          "params": []
        },
        {
          "name": "2",
          "params": ["rss=on"]
        },
        ...more versions...
      ]
    },
    ...more device models...
  ]

The management tool can generate the configuration parameter list by
expanding a version into its params.
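
A minimal sketch of that expansion, assuming the JSON above has been parsed
into Python lists and dicts. The names DEVICE_MODELS and expand_version are
hypothetical and only for illustration, and the elided entries are left out:

```python
# Hypothetical in-memory copy of the JSON above (elided entries omitted).
DEVICE_MODELS = [
    {
        "model": "https://qemu.org/devices/e1000e",
        "params": ["rss"],
        "versions": [
            {"name": "1", "params": []},
            {"name": "2", "params": ["rss=on"]},
        ],
    },
]


def expand_version(models, model_url, version_name):
    """Return the configuration parameter list that a version name aliases."""
    for model in models:
        if model["model"] != model_url:
            continue
        for version in model["versions"]:
            if version["name"] == version_name:
                # Copy so callers cannot modify the reported data.
                return list(version["params"])
    raise KeyError(f"{model_url} has no version {version_name!r}")


print(expand_version(DEVICE_MODELS, "https://qemu.org/devices/e1000e", "2"))
```

This only handles the plain alias case. How user overrides combine with a
version's params is not covered, since those rules are not defined yet.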

Configuration parameter types and input ranges need more thought. For
example, version 1 of the device might not have rx-table-size (it's
effectively 0). Version 2 introduces rx-table-size and sets it to 32.
Version 3 raises the value to 64. In addition, the user can set a custom
value like rx-table-size=48. I haven't defined the rules for this yet,
but it's clear there needs to be a way to extend configuration
parameters.

To check migration compatibility:
1. Verify that the device model URL matches the JSON data[n].model
   field.
2. For every configuration parameter name from the source device,
   check that it is contained within the JSON data[n].params list.
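
The two steps could look roughly like this in a management tool. This is a
sketch under assumptions: DESTINATION_DATA stands in for the parsed JSON
reported by the destination, and is_compatible and param_name are
hypothetical helpers, not an existing API:

```python
# Hypothetical parsed JSON as reported by the destination host.
DESTINATION_DATA = [
    {
        "model": "https://qemu.org/devices/e1000e",
        "params": ["rss"],
        "versions": [
            {"name": "1", "params": []},
            {"name": "2", "params": ["rss=on"]},
        ],
    },
]


def param_name(item):
    """Strip an optional '=value' suffix, e.g. 'rss=on' -> 'rss'."""
    return item.partition("=")[0]


def is_compatible(data, source_model, source_params):
    """Apply steps 1 and 2 against every entry in the destination's data."""
    names = {param_name(item) for item in source_params}
    for entry in data:
        # Step 1: the device model URL must match data[n].model.
        if entry["model"] != source_model:
            continue
        # Step 2: every source parameter name must be in data[n].params.
        if names <= set(entry["params"]):
            return True
    return False


url = "https://qemu.org/devices/e1000e"
print(is_compatible(DESTINATION_DATA, url, ["rss=on"]))
print(is_compatible(DESTINATION_DATA, url, ["rss=on", "rx-table-size=32"]))
```

Only parameter names are compared here, as in the steps above. Value types
and ranges are not checked.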

> > VFIO Implementation
> > -------------------
> > The following applies both to kernel VFIO/mdev drivers and vfio-user device
> > backends.
> > 
> > Devices are instantiated based on a version and/or configuration parameters:
> > * ``version=1`` - use the device configuration aliased by version 1
> > * ``version=2,rx-filter-size=64`` - use version 2 and override ``rx-filter-size``
> > * ``rx-filter-size=0`` - directly set configuration parameters without using a version
> > 
> > Device creation fails if the version and/or configuration parameters are not
> > supported.
> > 
> > There must be a mechanism to query the "latest" configuration for a device
> > model. It may simply report the ``version=5`` where 5 is the latest version but
> > it could also report all configuration parameters instead of using a version
> > alias.
> 
> The mechanism needs to be able to report all supported version strings,
> not simply the latest version string. I think we need to specify the
> actual mechanism to do this query too, because we can't end up in a place
> where there's a different approach to queries for each device type.

Makes sense.

Stefan


