From: Peter Xu <peterx@redhat.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: qemu-devel@nongnu.org, dgilbert@redhat.com, quintela@redhat.com,
	pbonzini@redhat.com, david@redhat.com, philmd@linaro.org,
	zhouyibo@bytedance.com
Subject: Re: [RFC v4 0/3] migration: reduce time of loading non-iterable vmstate
Date: Fri, 23 Dec 2022 10:50:47 -0500
Message-ID: <Y6XOV+ib6a8p2Zjl@x1n>
In-Reply-To: <20221223142307.1614945-1-xuchuangxclwt@bytedance.com>

Chuang,

On Fri, Dec 23, 2022 at 10:23:04PM +0800, Chuang Xu wrote:
> In this version:
> 
> - attach more information in the cover letter.
> - remove the changes to virtio_load().
> - add rcu_read_locked() to detect whether the RCU read lock is held.
> 
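On rcu_read_locked() above: a plausible shape for such a helper, assuming
it reuses the per-thread nesting depth that QEMU's RCU implementation
already tracks via rcu_read_lock()/rcu_read_unlock() (a sketch of the
idea, not necessarily the exact patch):

    #include "qemu/rcu.h"   /* for get_ptr_rcu_reader() */

    /*
     * Sketch: report whether the calling thread currently holds the
     * RCU read lock, by checking the per-thread nesting depth that
     * rcu_read_lock()/rcu_read_unlock() maintain.
     */
    static inline bool rcu_read_locked(void)
    {
        return get_ptr_rcu_reader()->depth > 0;
    }
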
> Loading the non-iterable vmstate accounts for a significant portion of
> downtime (measured from the timestamp at which the source QEMU stops to
> the timestamp at which the target QEMU starts). Most of that time is
> spent repeatedly committing memory region changes.
> 
> This series packs all memory region changes made while loading the
> non-iterable vmstate into a single memory transaction. The more devices
> a VM has, the greater the improvement.
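
To make that concrete, a minimal sketch of the approach, assuming QEMU's
existing memory_region_transaction_begin()/commit() API (the load call
below is only illustrative of where the real load path sits):

    /*
     * Sketch: wrap the whole non-iterable vmstate load in a single
     * memory transaction.  Each device's memory region updates are
     * then only queued, and the flatview is rebuilt once at the final
     * commit instead of once per change.
     */
    memory_region_transaction_begin();
    ret = qemu_loadvm_state_main(f, mis);   /* MR changes only queued */
    memory_region_transaction_commit();     /* one flatview rebuild */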
> 
> Here are the test1 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8260 CPU
>   - NVIDIA Mellanox ConnectX-5
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 8 16-queue vhost-net devices
>   - 16 4-queue vhost-user-blk devices
> 
>           time of loading non-iterable vmstate    downtime
> before    about 150 ms                            740+ ms
> after     about 30 ms                             630+ ms

Have you investigated why multi-queue adds so much downtime overhead in
the same environment, compared with the single-queue results below [1]?

> 
> (These results differ from those of v1; perhaps something changed on my
> host, but this does not affect the demonstration of the optimization
> effect.)
> 
> 
> In test2, we keep the number of devices the same as in test1 and reduce
> the number of queues per device:
> 
> Here are the test2 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8260 CPU
>   - NVIDIA Mellanox ConnectX-5
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 8 1-queue vhost-net devices
>   - 16 1-queue vhost-user-blk devices
> 
>           time of loading non-iterable vmstate    downtime
> before    about 90 ms                             about 250 ms
> after     about 25 ms                             about 160 ms

[1]

> 
> In test3, we keep the number of queues per device the same as in test1
> and reduce the number of devices:
> 
> Here are the test3 results:
> test info:
> - Host
>   - Intel(R) Xeon(R) Platinum 8260 CPU
>   - NVIDIA Mellanox ConnectX-5
> - VM
>   - 32 CPUs 128GB RAM VM
>   - 1 16-queue vhost-net device
>   - 1 4-queue vhost-user-blk device
> 
>           time of loading non-iterable vmstate    downtime
> before    about 20 ms                             about 70 ms
> after     about 11 ms                             about 60 ms
> 
> 
> As the test results above show, both the number of queues and the number
> of devices have a large impact on the time spent loading the non-iterable
> vmstate. More devices and queues lead to more memory region commits, and
> the time consumed by flatview reconstruction grows accordingly.

The downtime measured in precopy can be more complicated than in postcopy,
because the moment of switchover is calculated by QEMU based on the
configured downtime limit, and the measured window also contains part of
the RAM migration.  Postcopy should be more accurate in that respect:
no such calculation is done, and no RAM is transferred during the
downtime window.
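
Roughly, the source-side accounting in precopy has this shape (a
simplified sketch of that code path, not a verbatim quote):

    /*
     * Simplified sketch: the stamp is taken before stopping the source
     * vCPUs, and the window only closes after the remaining dirty RAM
     * and the device state have been flushed, so part of the RAM
     * migration is folded into the reported precopy downtime.
     */
    s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
    vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); /* stop source vCPUs */
    /* ... send remaining dirty RAM and non-iterable device state ... */
    s->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - s->downtime_start;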

However, the postcopy downtime is not fully accurate either, as
implemented in postcopy_start(): there the downtime is measured right
after we flush the packaged data, and right below that there is already
an idea for optimizing it:

    if (migrate_postcopy_ram()) {
        /*
         * Although this ping is just for debug, it could potentially be
         * used for getting a better measurement of downtime at the source.
         */
        qemu_savevm_send_ping(ms->to_dst_file, 4);
    }

So maybe I'll have a look there.
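
For what it's worth, a sketch of what that better measurement could look
like, reusing the existing ping/pong return path (the handler and
variable names here are hypothetical, not QEMU's actual ones):

    /*
     * Hypothetical sketch: stamp the time when ping 4 is sent, and
     * close the window when the destination's pong for the same value
     * arrives on the return path -- by then the packaged device state
     * has been drained, which is closer to the real end of downtime.
     */
    static int64_t ping4_sent_ms;

    static void send_downtime_ping(MigrationState *ms)
    {
        ping4_sent_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
        qemu_savevm_send_ping(ms->to_dst_file, 4);
    }

    static void handle_rp_pong(MigrationState *ms, uint32_t value)
    {
        if (value == 4) {
            ms->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME)
                           - ping4_sent_ms;
        }
    }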

Besides the above, I'm personally happy with the series; I have one
trivial comment on patch 2, but it's not a huge deal.  I don't expect
you'll get any more comments before the end of the year, so let's wait
until after the Christmas holiday.

Thanks!

-- 
Peter Xu



Thread overview: 22+ messages

2022-12-23 14:23 [RFC v4 0/3] migration: reduce time of loading non-iterable vmstate Chuang Xu
2022-12-23 14:23 ` [RFC v4 1/3] rcu: introduce rcu_read_locked() Chuang Xu
2023-01-04 14:20   ` Alex Bennée
2023-01-05  8:17     ` Chuang Xu
2022-12-23 14:23 ` [RFC v4 2/3] memory: add depth assert in address_space_to_flatview Chuang Xu
2022-12-23 15:37   ` Peter Xu
2022-12-23 15:47   ` Paolo Bonzini
2022-12-23 15:54     ` Peter Xu
2022-12-28  8:27       ` Paolo Bonzini
2023-01-03 17:43         ` Peter Xu
2023-01-10  8:09           ` Chuang Xu
2023-01-10 14:45             ` Peter Xu
2023-01-12  7:59               ` Chuang Xu
2023-01-12 15:13                 ` Peter Xu
2023-01-13 19:29                   ` Chuang Xu
2022-12-28 10:50   ` Philippe Mathieu-Daudé
2023-01-04  7:39     ` [External] " Chuang Xu
2022-12-23 14:23 ` [RFC v4 3/3] migration: reduce time of loading non-iterable vmstate Chuang Xu
2022-12-23 16:06   ` David Hildenbrand
2023-01-04  7:31     ` Chuang Xu
2022-12-23 15:50 ` Peter Xu [this message]
2022-12-23 19:11   ` [RFC v4 0/3] " Chuang Xu
