From: Peter Xu <peterx@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org,
	Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH v4 10/19] migration: Postcopy preemption enablement
Date: Wed, 20 Apr 2022 15:39:22 -0400
Message-ID: <YmBhauRdlKCV1kQE@xz-m1.local>
In-Reply-To: <Yl/o9HCW54hslTbs@redhat.com>

On Wed, Apr 20, 2022 at 12:05:24PM +0100, Daniel P. Berrangé wrote:
> On Thu, Mar 31, 2022 at 11:08:48AM -0400, Peter Xu wrote:
> > This patch enables postcopy-preempt feature.
> > 
> > It contains two major changes to the migration logic:
> > 
> > (1) Postcopy requests are now sent via a different socket from precopy
> >     background migration stream, so as to be isolated from very high page
> >     request delays.
> > 
> > (2) For huge page enabled hosts: when there's postcopy requests, they can now
> >     intercept a partial sending of huge host pages on src QEMU.
> > 
> > After this patch, we'll live migrate a VM with two channels for postcopy: (1)
> > PRECOPY channel, which is the default channel that transfers background pages;
> > and (2) POSTCOPY channel, which only transfers requested pages.
> > 
> > There's no strict rule of which channel to use, e.g., if a requested page is
> > already being transferred on precopy channel, then we will keep using the same
> > precopy channel to transfer the page even if it's explicitly requested.  In 99%
> > of the cases we'll prioritize the channels so we send requested page via the
> > postcopy channel as long as possible.
> > 
> > On the source QEMU, when we found a postcopy request, we'll interrupt the
> > PRECOPY channel sending process and quickly switch to the POSTCOPY channel.
> > After we serviced all the high priority postcopy pages, we'll switch back to
> > PRECOPY channel so that we'll continue to send the interrupted huge page again.
> > There's no new thread introduced on src QEMU.
> 
> Implicit in this approach is that the delay in sending postcopy
> OOB pages is from the pending socket buffers the kernel already
> has, and not any delay caused by the QEMU sending thread being
> busy doing other stuff.

Yes.

> 
> Is there any scenario in which the QEMU sending thread is stalled
> in sendmsg() with a 1GB huge page waiting for the kernel to
> get space in the socket outgoing buffer ?

Another yes..

It doesn't necessarily have to be while sending a 1GB huge page: the guest
can be using small pages, and IMHO we could still get stuck in sendmsg() for
a precopy small page while there are actually postcopy requests in the queue.

We can't solve this as long as we keep using a single thread for sending
pages.
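
To illustrate the point above, here is a toy model of a lone sender thread.
It is not QEMU code: the function name, its parameters, and the time units
are all hypothetical. It only assumes that the sender can look at the
postcopy request queue between two sends, never while it is blocked inside
one.

```python
def urgent_wait_single_sender(send_durations, urgent_arrival):
    """Return how long an urgent (postcopy) request waits before a lone
    sender thread can even look at it.

    send_durations: time each background (precopy) send spends blocked,
                    e.g. waiting for room in the socket send buffer.
    urgent_arrival: time at which the urgent request is queued.
    """
    now = 0.0
    for duration in send_durations:
        if now >= urgent_arrival:
            # The sender is between two sends, so it notices the request.
            return now - urgent_arrival
        # The sender is blocked inside this send until it completes.
        now += duration
    # All background sends are done; the sender is idle from `now` on.
    return max(0.0, now - urgent_arrival)


# A request queued at t=1 behind a send that blocks for 5 time units
# waits for the remaining 4 units, even though the page size is irrelevant.
print(urgent_wait_single_sender([5.0, 5.0], 1.0))
```

In this model the wait depends only on how long the in-flight send stays
blocked, which is why small guest pages do not avoid the problem.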

This patchset doesn't solve this issue yet.  It's actually discussed in the
cover letter too, as a TODO item in the section "Avoid precopy write()
blocks postcopy".

Logically, in the future we could try to have two or more sender threads so
that postcopy pages can use a separate sender thread.
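
A rough sketch of that idea, again as a toy and not QEMU code: every name
below is hypothetical, and a threading.Event stands in for a sendmsg() call
that is blocked on a full socket buffer. It only shows that a dedicated
postcopy sender can keep delivering requested pages while the precopy sender
is stuck; it ignores the synchronization between senders that a real
implementation would need.

```python
import queue
import threading


def run_two_senders(urgent_pages):
    """Deliver urgent pages on their own thread while precopy is blocked."""
    delivered = []                        # pages that reached the "wire"
    buffer_has_room = threading.Event()   # unset: precopy channel is full
    urgent_q = queue.Queue()

    def precopy_sender():
        # Stand-in for a send that blocks until the kernel has room.
        buffer_has_room.wait()
        delivered.append("precopy-page")

    def postcopy_sender():
        while True:
            page = urgent_q.get()
            if page is None:              # sentinel: no more requests
                return
            delivered.append(page)

    background = threading.Thread(target=precopy_sender)
    urgent = threading.Thread(target=postcopy_sender)
    background.start()
    urgent.start()

    for page in urgent_pages:
        urgent_q.put(page)
    urgent_q.put(None)
    urgent.join()                         # precopy is still blocked here
    while_blocked = list(delivered)

    buffer_has_room.set()                 # the "socket buffer" drains
    background.join()
    return while_blocked, delivered


while_blocked, final = run_two_senders(["req-1", "req-2"])
print(while_blocked, final)
```

The first returned list holds what was delivered while the precopy sender
was still blocked, the second holds everything delivered by the end.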

Note that this change will _not_ require any interface change, either on the
qemu cmdline or in the migration protocol, because this patchset should
already handle all of the migration protocol even for that case.  If it
works well, we'd get a pure speedup from further reduced latency with
preempt mode enabled, compared to before.

The other thing is that I have never measured such an effect, so I can't
tell how it would perform in the end.  We'd need more work on top if we'd
like to pursue it, mostly on doing proper synchronization between senders.

Thanks,

-- 
Peter Xu


