All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Juan Quintela <quintela@redhat.com>,
	Avihai Horon <avihaih@nvidia.com>,
	qemu-devel@nongnu.org, "Michael S . Tsirkin" <mst@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Yishai Hadas <yishaih@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	Maor Gottlieb <maorg@nvidia.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Tarun Gupta <targupta@nvidia.com>
Subject: Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported
Date: Tue, 17 May 2022 10:00:45 -0600	[thread overview]
Message-ID: <20220517100045.27a696c9.alex.williamson@redhat.com> (raw)
In-Reply-To: <20220516230832.GP1343366@nvidia.com>

On Mon, 16 May 2022 20:08:32 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, May 16, 2022 at 02:22:00PM -0600, Alex Williamson wrote:
> > On Mon, 16 May 2022 13:22:14 +0200
> > Juan Quintela <quintela@redhat.com> wrote:
> >   
> > > Avihai Horon <avihaih@nvidia.com> wrote:  
> > > > Currently, if IOMMU of a VFIO container doesn't support dirty page
> > > > tracking, migration is blocked completely. This is because a DMA-able
> > > > VFIO device can dirty RAM pages without updating QEMU about it, thus
> > > > breaking the migration.
> > > >
> > > > However, this doesn't mean that migration can't be done at all. If
> > > > migration pre-copy phase is skipped, the VFIO device doesn't have a
> > > > chance to dirty RAM pages that have been migrated already, thus
> > > > eliminating the problem previously mentioned.
> > > >
> > > > Hence, in such case allow migration but skip pre-copy phase.
> > > >
> > > > Signed-off-by: Avihai Horon <avihaih@nvidia.com>    
> > > 
> > > I don't know (TM).
> > > Several issues:
> > > - Patch is ugly as hell (ok, that depends on taste)
> > > - It changes migration_iteration_run() instead of directly
> > >   migration_thread.
> > > - There is already another case where we skip the sending of RAM
> > >   (localhost migration with shared memory)
> > > 
> > > In migration/ram.c:
> > > 
> > > static int ram_find_and_save_block(RAMState *rs, bool last_stage)
> > > {
> > >     PageSearchStatus pss;
> > >     int pages = 0;
> > >     bool again, found;
> > > 
> > >     /* No dirty page as there is zero RAM */
> > >     if (!ram_bytes_total()) {
> > >         return pages;
> > >     }
> > > 
> > > This is the other place where we _don't_ send any RAM at all.
> > > 
> > > I don't have a great idea about how to make things clear at a higher
> > > level, I have to think about this.  
> > 
> > It seems like if we have devices dictating what type of migrations can
> > be performed then there probably needs to be a switch to restrict use of
> > such devices just as we have the -only-migratable switch now to prevent
> > attaching devices that don't support migration.  I'd guess that we need
> > the switch to opt-in to allowing such devices to maintain
> > compatibility.  There's probably a whole pile of qapi things missing to
> > expose this to management tools as well.  Thanks,  
> 
> This is really intended to be a NOP from where things are now, as if
> you use mlx5 live migration without a patch like this then it causes a
> botched pre-copy since everything just ends up permanently dirty.
> 
> If it makes more sense we could abort the pre-copy too - in the end
> there will be dirty tracking so I don't know if I'd invest in a big
> adventure to fully define non-dirty tracking migration.

How is pre-copy currently "botched" without a patch like this?  If it's
simply that the pre-copy doesn't converge and the downtime constraints
don't allow the VM to enter stop-and-copy, that's the expected behavior
AIUI, and supports backwards compatibility with existing SLAs.

I'm assuming that by setting this new skip_precopy flag that we're
forcing the VM to move to stop-and-copy, regardless of any other SLA
constraints placed on the migration.  It seems like a better solution
would be to expose to management tools that the VM contains a device
that does not support the pre-copy phase so that downtime expectations
can be adjusted.  Thanks,

Alex



  reply	other threads:[~2022-05-17 16:05 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-12 15:43 [PATCH 0/9] vfio/migration: Implement VFIO migration protocol v2 Avihai Horon
2022-05-12 15:43 ` [PATCH 1/9] linux-headers: Update headers to v5.18-rc6 Avihai Horon
2022-05-12 15:43 ` [PATCH 2/9] vfio: Fix compilation errors caused by VFIO migration v1 deprecation Avihai Horon
2022-05-12 17:57   ` Alex Williamson
2022-05-12 18:25     ` Jason Gunthorpe
2022-05-12 21:11       ` Alex Williamson
2022-05-12 23:32         ` Jason Gunthorpe
2022-05-13  7:08     ` Cornelia Huck
2022-05-12 15:43 ` [PATCH 3/9] vfio/migration: Fix NULL pointer dereference bug Avihai Horon
2022-05-16 11:15   ` Juan Quintela
2022-05-17 12:28     ` Avihai Horon
2022-05-18 11:36       ` Juan Quintela
2022-05-12 15:43 ` [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported Avihai Horon
2022-05-16 11:22   ` Juan Quintela
2022-05-16 20:22     ` Alex Williamson
2022-05-16 23:08       ` Jason Gunthorpe
2022-05-17 16:00         ` Alex Williamson [this message]
2022-05-17 16:08           ` Jason Gunthorpe
2022-05-17 17:22             ` Alex Williamson
2022-05-17 17:39               ` Jason Gunthorpe
2022-05-18  3:46                 ` Alex Williamson
2022-05-18 17:01                   ` Jason Gunthorpe
2022-05-18 11:39             ` Juan Quintela
2022-05-18 15:50               ` Jason Gunthorpe
2022-05-24 15:11                 ` Avihai Horon
2022-05-20 10:58   ` Joao Martins
2022-05-23  6:11     ` Avihai Horon
2022-05-23  9:45       ` Joao Martins
2022-05-12 15:43 ` [PATCH 5/9] migration/qemu-file: Add qemu_file_get_to_fd() Avihai Horon
2022-05-16 11:31   ` Juan Quintela
2022-05-17 12:36     ` Avihai Horon
2022-05-18 11:54       ` Juan Quintela
2022-05-18 15:42         ` Jason Gunthorpe
2022-05-18 16:00           ` Daniel P. Berrangé
2022-05-18 16:16             ` Jason Gunthorpe
2022-05-12 15:43 ` [PATCH 6/9] vfio/migration: Implement VFIO migration protocol v2 Avihai Horon
2022-05-23 18:14   ` Joao Martins
2022-05-24 15:35     ` Avihai Horon
2022-05-12 15:43 ` [PATCH 7/9] vfio/migration: Reset device if setting recover state fails Avihai Horon
2022-05-12 15:43 ` [PATCH 8/9] vfio: Alphabetize migration section of VFIO trace-events file Avihai Horon
2022-05-12 15:43 ` [PATCH 9/9] docs/devel: Align vfio-migration docs to VFIO migration v2 Avihai Horon
2022-05-12 18:02 ` [PATCH 0/9] vfio/migration: Implement VFIO migration protocol v2 Alex Williamson
2022-05-13  7:04   ` Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220517100045.27a696c9.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=avihaih@nvidia.com \
    --cc=cohuck@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kwankhede@nvidia.com \
    --cc=maorg@nvidia.com \
    --cc=mbloch@nvidia.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=targupta@nvidia.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.