All of lore.kernel.org
 help / color / mirror / Atom feed
From: Cornelia Huck <cohuck@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Neo Jia" <cjia@nvidia.com>,
	"Juan Quintela" <quintela@redhat.com>,
	qemu-devel@nongnu.org,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Kirti Wankhede" <kwankhede@nvidia.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [RFC PATCH for-QEMU-5.2] vfio: Make migration support experimental
Date: Tue, 10 Nov 2020 13:03:14 +0100	[thread overview]
Message-ID: <20201110130314.5621be1c.cohuck@redhat.com> (raw)
In-Reply-To: <160494787833.1473.10514376876696596117.stgit@gimli.home>

On Mon, 09 Nov 2020 11:56:02 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> Per the proposed documentation for vfio device migration:
> 
>   Dirty pages are tracked when device is in stop-and-copy phase
>   because if pages are marked dirty during pre-copy phase and
>   content is transfered from source to destination, there is no
>   way to know newly dirtied pages from the point they were copied
>   earlier until device stops. To avoid repeated copy of same
>   content, pinned pages are marked dirty only during
>   stop-and-copy phase.
> 
> Essentially, since we don't have hardware dirty page tracking for
> assigned devices at this point, we consider any page that is pinned
> by an mdev vendor driver or pinned and mapped through the IOMMU to
> be perpetually dirty.  In the worst case, this may result in all of
> guest memory being considered dirty during every iteration of live
> migration.  The current vfio implementation of migration has chosen
> to mask device dirtied pages until the final stages of migration in
> order to avoid this worst case scenario.
> 
> Allowing the device to implement a policy decision to prioritize
> reduced migration data like this jeopardizes QEMU's overall ability
> to implement any degree of service level guarantees during migration.
> For example, any estimates towards achieving acceptable downtime
> margins cannot be trusted when such a device is present.  The vfio
> device should participate in dirty page tracking to the best of its
> ability throughout migration, even if that means the dirty footprint
> of the device impedes migration progress, allowing both QEMU and
> higher level management tools to decide whether to continue the
> migration or abort due to failure to achieve the desired behavior.
> 
> Link: https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg00807.html
> Cc: Kirti Wankhede <kwankhede@nvidia.com>
> Cc: Neo Jia <cjia@nvidia.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
> 
> Given that our discussion in the link above seems to be going in
> circles, I'm afraid it seems necessary to both have a contigency
> plan and to raise the visibility of the current behavior to
> determine whether others agree that this is a sufficiently
> troubling behavior to consider migration support experimental
> at this stage.  Please voice your opinion or contribute patches
> to resolve this before QEMU 5.2.  Thanks,
> 
> Alex
> 
>  hw/vfio/migration.c           |    2 +-
>  hw/vfio/pci.c                 |    2 ++
>  include/hw/vfio/vfio-common.h |    1 +
>  3 files changed, 4 insertions(+), 1 deletion(-)

Given the ongoing discussions, I'd be rather more comfortable making
this experimental for the upcoming release and spent some time getting
this into a state that everyone is happy to live with, so

Acked-by: Cornelia Huck <cohuck@redhat.com>



      parent reply	other threads:[~2020-11-10 12:05 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-09 18:56 [RFC PATCH for-QEMU-5.2] vfio: Make migration support experimental Alex Williamson
2020-11-09 19:44 ` Dr. David Alan Gilbert
2020-11-09 20:29   ` Alex Williamson
2020-11-10  9:10     ` Dr. David Alan Gilbert
2020-11-10 14:16       ` Kirti Wankhede
2020-11-10 15:20         ` Alex Williamson
2020-11-10 21:26           ` Neo Jia
2020-11-10 12:03 ` Cornelia Huck [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201110130314.5621be1c.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=cjia@nvidia.com \
    --cc=dgilbert@redhat.com \
    --cc=kwankhede@nvidia.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.