All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Laurent Vivier <lvivier@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Leonardo Bras <leobras@redhat.com>,
	Thomas Huth <thuth@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v3 9/9] tests/qtest: massively speed up migration-test
Date: Thu, 1 Jun 2023 17:05:23 +0100	[thread overview]
Message-ID: <ZHjBw2E+eJKNsniO@redhat.com> (raw)
In-Reply-To: <ZHi9OQz8PGuHMPpN@x1n>

On Thu, Jun 01, 2023 at 11:46:01AM -0400, Peter Xu wrote:
> On Wed, May 31, 2023 at 02:24:00PM +0100, Daniel P. Berrangé wrote:
> > The migration test cases that actually exercise live migration want to
> > ensure there is a minimum of two iterations of pre-copy, in order to
> > exercise the dirty tracking code.
> > 
> > Historically we've queried the migration status, looking for the
> > 'dirty-sync-count' value to increment to track iterations. This was
> > not entirely reliable because often all the data would get transferred
> > quickly enough that the migration would finish before we wanted it
> > to. So we massively dropped the bandwidth and max downtime to
> > guarantee non-convergance. This had the unfortunate side effect
> > that every migration took at least 30 seconds to run (100 MB of
> > dirty pages / 3 MB/sec).
> > 
> > This optimization takes a different approach to ensuring that a
> > mimimum of two iterations. Rather than waiting for dirty-sync-count
> > to increment, directly look for an indication that the source VM
> > has dirtied RAM that has already been transferred.
> > 
> > On the source VM a magic marker is written just after the 3 MB
> > offset. The destination VM is now montiored to detect when the
> > magic marker is transferred. This gives a guarantee that the
> > first 3 MB of memory have been transferred. Now the source VM
> > memory is monitored at exactly the 3MB offset until we observe
> > a flip in its value. This gives us a guaranteed that the guest
> > workload has dirtied a byte that has already been transferred.
> > 
> > Since we're looking at a place that is only 3 MB from the start
> > of memory, with the 3 MB/sec bandwidth, this test should complete
> > in 1 second, instead of 30 seconds.
> > 
> > Once we've proved there is some dirty memory, migration can be
> > set back to full speed for the remainder of the 1st iteration,
> > and the entire of the second iteration at which point migration
> > should be complete.
> > 
> > On a test machine this further reduces the migration test time
> > from 8 minutes to 1 minute 40.
> 
> The outcome is definitely nice, but it does looks slightly hacky to me and
> make the test slightly more complicated.
> 
> If it's all about making sure we finish the 1st iteration, can we simply
> add a src qemu parameter "switchover-hold"?  If it's set, src never
> switchover to dst but keeps the iterations.

For *most* of the tests, we want to ensure there are a minimum
of 2 iterations. For the XBZRLE test we want to ensure there are
a minimum of 3 iterations, so the cache gets  workout.

> Then migrate_ensure_non_converge() will be as simple as setting
> switchover-hold to true.
> 
> I am even thinking whether there can even be real-life use case for that,
> e.g., where a user might want to have a pre-heat of a migration of some VM,
> and trigger it immediately when the admin really wants (the pre-heats moved
> most of the pages and keep doing so).
> 
> It'll be also similar to what Avihai proposed here on switchover-ack, just
> an ack mechanism on the src side:
> 
> https://lore.kernel.org/r/20230530144821.1557-3-avihaih@nvidia.com

In general I strongly wanted to avoid adding special logic to the
migration code that makes it directly synchronize with the  test
suite, because once we do that I don't think the test suite is a
providing coverage of the real world usage scenario.

IOW, if we add a switchover-ack feature, we should certainly have
*a* test that uses it, but we should not modify other tests because
we want to exercise the logic as it would run with apps that don't
rely on switchover-ack.

Also, this slow migration test is incredibly painful for people right
now, so I'd like to see us get a speed up committed to git quickly.
I don't want to block it on a feature proposal that might yet take
months to merge.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



  reply	other threads:[~2023-06-01 16:06 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-31 13:23 [PATCH v3 0/9] tests/qtest: make migration-test massively faster Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 1/9] tests/qtest: add various qtest_qmp_assert_success() variants Daniel P. Berrangé
2023-06-01  9:23   ` Thomas Huth
2023-06-01 12:48     ` Daniel P. Berrangé
2023-06-01 12:04   ` Juan Quintela
2023-06-01 12:20   ` Juan Quintela
2023-06-01 12:51     ` Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 2/9] tests/qtest: add support for callback to receive QMP events Daniel P. Berrangé
2023-05-31 14:57   ` Thomas Huth
2023-06-01 12:14   ` Juan Quintela
2023-06-01 12:56     ` Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 3/9] tests/qtest: get rid of 'qmp_command' helper in migration test Daniel P. Berrangé
2023-06-01  9:26   ` Thomas Huth
2023-06-01  9:32     ` Daniel P. Berrangé
2023-06-01 12:17   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 4/9] tests/qtest: get rid of some 'qtest_qmp' usage " Daniel P. Berrangé
2023-06-01  9:28   ` Thomas Huth
2023-06-01 12:10   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 5/9] tests/qtest: switch to using event callbacks for STOP event Daniel P. Berrangé
2023-06-01  9:31   ` Thomas Huth
2023-06-01 12:23   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 6/9] tests/qtest: replace wait_command() with qtest_qmp_assert_success Daniel P. Berrangé
2023-06-01  9:37   ` Thomas Huth
2023-06-01 12:27   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 7/9] tests/qtest: capture RESUME events during migration Daniel P. Berrangé
2023-06-01  9:38   ` Thomas Huth
2023-06-01 12:31   ` Juan Quintela
2023-06-01 12:34     ` Daniel P. Berrangé
2023-06-01 12:37       ` Juan Quintela
2023-06-01 12:44         ` Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 8/9] tests/qtest: make more migration pre-copy scenarios run non-live Daniel P. Berrangé
2023-06-01  9:47   ` Thomas Huth
2023-06-01 12:33   ` Juan Quintela
2023-06-01 12:38     ` Daniel P. Berrangé
2023-06-01 16:09     ` Thomas Huth
2023-06-01 16:17       ` Daniel P. Berrangé
2023-06-01 16:26         ` Peter Xu
2023-06-01 15:30   ` Peter Xu
2023-06-01 15:39     ` Daniel P. Berrangé
2023-06-01 15:53       ` Peter Xu
2023-06-01 15:55         ` Daniel P. Berrangé
2023-06-01 16:17           ` Peter Xu
2023-06-01 16:35             ` Daniel P. Berrangé
2023-06-01 16:59               ` Peter Xu
2023-06-01 22:58             ` Juan Quintela
2023-06-01 22:55           ` Juan Quintela
2023-05-31 13:24 ` [PATCH v3 9/9] tests/qtest: massively speed up migration-test Daniel P. Berrangé
2023-06-01 10:04   ` Thomas Huth
2023-06-01 15:46   ` Peter Xu
2023-06-01 16:05     ` Daniel P. Berrangé [this message]
2023-06-01 16:22       ` Peter Xu
2023-06-01 16:36         ` Daniel P. Berrangé
2023-06-01 17:04           ` Peter Xu
2023-06-01 23:00     ` Juan Quintela
2023-06-01 23:43       ` Peter Xu
2023-07-10  9:35         ` Daniel P. Berrangé
2023-07-10  9:40           ` Thomas Huth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZHjBw2E+eJKNsniO@redhat.com \
    --to=berrange@redhat.com \
    --cc=leobras@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.