All of lore.kernel.org
 help / color / mirror / Atom feed
From: Juan Quintela <quintela@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel@nongnu.org, "Laurent Vivier" <lvivier@redhat.com>,
	"Leonardo Bras" <leobras@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PATCH v3 8/9] tests/qtest: make more migration pre-copy scenarios run non-live
Date: Fri, 02 Jun 2023 00:58:36 +0200	[thread overview]
Message-ID: <877csmrdr7.fsf@secure.mitica> (raw)
In-Reply-To: <ZHjEohiecAsu3ht6@x1n> (Peter Xu's message of "Thu, 1 Jun 2023 12:17:38 -0400")

Peter Xu <peterx@redhat.com> wrote:
> On Thu, Jun 01, 2023 at 04:55:25PM +0100, Daniel P. Berrangé wrote:
>> On Thu, Jun 01, 2023 at 11:53:17AM -0400, Peter Xu wrote:
>> > On Thu, Jun 01, 2023 at 04:39:48PM +0100, Daniel P. Berrangé wrote:
>> > > On Thu, Jun 01, 2023 at 11:30:10AM -0400, Peter Xu wrote:
>> > > > Thanks for looking into this.. definitely worthwhile.
>> > > > 
>> > > > On Wed, May 31, 2023 at 02:23:59PM +0100, Daniel P. Berrangé wrote:
>> > > > > There are 27 pre-copy live migration scenarios being tested. In all of
>> > > > > these we force non-convergance and run for one iteration, then let it
>> > > > > converge and wait for completion during the second (or following)
>> > > > > iterations. At 3 mbps bandwidth limit the first iteration takes a very
>> > > > > long time (~30 seconds).
>> > > > > 
>> > > > > While it is important to test the migration passes and convergance
>> > > > > logic, it is overkill to do this for all 27 pre-copy scenarios. The
>> > > > > TLS migration scenarios in particular are merely exercising different
>> > > > > code paths during connection establishment.
>> > > > > 
>> > > > > To optimize time taken, switch most of the test scenarios to run
>> > > > > non-live (ie guest CPUs paused) with no bandwidth limits. This gives
>> > > > > a massive speed up for most of the test scenarios.
>> > > > > 
>> > > > > For test coverage the following scenarios are unchanged
>> > > > 
>> > > > Curious how are below chosen?  I assume..
>> > > 
>> > > Chosen based on whether they exercise code paths that are unique
>> > > and interesting during the RAM transfer phase.
>> > > 
>> > > Essentially the goal is that if we have N% code coverage before this
>> > > patch, then we should still have the same N% code coverage after this
>> > > patch.
>> > > 
>> > > The TLS tests exercise code paths that are unique during the migration
>> > > establishment phase. Once establishd they don't exercise anything
>> > > "interesting" during RAM transfer phase. Thus we don't loose code coverage
>> > > by runing TLS tests non-live.
>> > > 
>> > > > 
>> > > > > 
>> > > > >  * Precopy with UNIX sockets
>> > > > 
>> > > > this one verifies dirty log.
>> > > > 
>> > > > >  * Precopy with UNIX sockets and dirty ring tracking
>> > > > 
>> > > > ... dirty ring...
>> > > > 
>> > > > >  * Precopy with XBZRLE
>> > > > 
>> > > > ... xbzrle I think needs a diff on old/new, makes sense.
>> > > > 
>> > > > >  * Precopy with UNIX compress
>> > > > >  * Precopy with UNIX compress (nowait)
>> > > > >  * Precopy with multifd
>> > > > 
>> > > > What about the rest three?  Especially for two compression tests.
>> > > 
>> > > The compress thread logic is unique/interesting during RAM transfer
>> > > so benefits from running live. The wait vs non-wait scenario tests
>> > > a distinct codepath/logic.
>> > 
>> > I assume you mean e.g. when compressing with guest page being modified and
>> > we should survive that rather than crashing the compressor?
>> 
>> No, i mean the compression code has a significant behaviour difference
>> between its two tests, because they toggle:
>> 
>>  @compress-wait-thread: Controls behavior when all compression
>>      threads are currently busy.  If true (default), wait for a free
>>      compression thread to become available; otherwise, send the page
>>      uncompressed.  (Since 3.1)
>> 
>> so we need to exercise the code path that falls back to sending
>> uncompressed, as well as the code path that waits for free threads.
>
> But then the question is why live is needed?
>
> IIUC whether the wait thing triggers have nothing directly related to VM is
> live or not, but whether all compress thread busy.  IOW, IIUC all compress
> paths will be tested even if non-live as long as we feed enough pages to
> the compressor threads.

It is even wrong.

We didn't fix this for compression:

commit 007e179ef0e97eafda4c9ff2a9d665a1947c7c6d
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date:   Tue Jul 5 22:35:59 2022 +0200

    multifd: Copy pages before compressing them with zlib
    
    zlib_send_prepare() compresses pages of a running VM. zlib does not
    make any thread-safety guarantees with respect to changing deflate()
    input concurrently with deflate() [1].


Not that anyone is going to use any accelerator to run zlib when we are
compression just 4k.

Intel AT engine had to also move to 64 pages at a time to make it a
difference.  As said, I can't think of a single scenary where
compression is a good option.

Later, Juan.



  parent reply	other threads:[~2023-06-01 22:59 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-31 13:23 [PATCH v3 0/9] tests/qtest: make migration-test massively faster Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 1/9] tests/qtest: add various qtest_qmp_assert_success() variants Daniel P. Berrangé
2023-06-01  9:23   ` Thomas Huth
2023-06-01 12:48     ` Daniel P. Berrangé
2023-06-01 12:04   ` Juan Quintela
2023-06-01 12:20   ` Juan Quintela
2023-06-01 12:51     ` Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 2/9] tests/qtest: add support for callback to receive QMP events Daniel P. Berrangé
2023-05-31 14:57   ` Thomas Huth
2023-06-01 12:14   ` Juan Quintela
2023-06-01 12:56     ` Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 3/9] tests/qtest: get rid of 'qmp_command' helper in migration test Daniel P. Berrangé
2023-06-01  9:26   ` Thomas Huth
2023-06-01  9:32     ` Daniel P. Berrangé
2023-06-01 12:17   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 4/9] tests/qtest: get rid of some 'qtest_qmp' usage " Daniel P. Berrangé
2023-06-01  9:28   ` Thomas Huth
2023-06-01 12:10   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 5/9] tests/qtest: switch to using event callbacks for STOP event Daniel P. Berrangé
2023-06-01  9:31   ` Thomas Huth
2023-06-01 12:23   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 6/9] tests/qtest: replace wait_command() with qtest_qmp_assert_success Daniel P. Berrangé
2023-06-01  9:37   ` Thomas Huth
2023-06-01 12:27   ` Juan Quintela
2023-05-31 13:23 ` [PATCH v3 7/9] tests/qtest: capture RESUME events during migration Daniel P. Berrangé
2023-06-01  9:38   ` Thomas Huth
2023-06-01 12:31   ` Juan Quintela
2023-06-01 12:34     ` Daniel P. Berrangé
2023-06-01 12:37       ` Juan Quintela
2023-06-01 12:44         ` Daniel P. Berrangé
2023-05-31 13:23 ` [PATCH v3 8/9] tests/qtest: make more migration pre-copy scenarios run non-live Daniel P. Berrangé
2023-06-01  9:47   ` Thomas Huth
2023-06-01 12:33   ` Juan Quintela
2023-06-01 12:38     ` Daniel P. Berrangé
2023-06-01 16:09     ` Thomas Huth
2023-06-01 16:17       ` Daniel P. Berrangé
2023-06-01 16:26         ` Peter Xu
2023-06-01 15:30   ` Peter Xu
2023-06-01 15:39     ` Daniel P. Berrangé
2023-06-01 15:53       ` Peter Xu
2023-06-01 15:55         ` Daniel P. Berrangé
2023-06-01 16:17           ` Peter Xu
2023-06-01 16:35             ` Daniel P. Berrangé
2023-06-01 16:59               ` Peter Xu
2023-06-01 22:58             ` Juan Quintela [this message]
2023-06-01 22:55           ` Juan Quintela
2023-05-31 13:24 ` [PATCH v3 9/9] tests/qtest: massively speed up migration-test Daniel P. Berrangé
2023-06-01 10:04   ` Thomas Huth
2023-06-01 15:46   ` Peter Xu
2023-06-01 16:05     ` Daniel P. Berrangé
2023-06-01 16:22       ` Peter Xu
2023-06-01 16:36         ` Daniel P. Berrangé
2023-06-01 17:04           ` Peter Xu
2023-06-01 23:00     ` Juan Quintela
2023-06-01 23:43       ` Peter Xu
2023-07-10  9:35         ` Daniel P. Berrangé
2023-07-10  9:40           ` Thomas Huth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877csmrdr7.fsf@secure.mitica \
    --to=quintela@redhat.com \
    --cc=berrange@redhat.com \
    --cc=leobras@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.