From: Peter Xu <peterx@redhat.com>
To: Keqian Zhu <zhukeqian1@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Hyman <huangy81@chinatelecom.cn>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH v5 00/10] KVM: Dirty ring support (QEMU part)
Date: Mon, 22 Mar 2021 15:45:33 -0400
Message-ID: <20210322194533.GE16645@xz-x1>
In-Reply-To: <2e057323-8102-7bfc-051b-cd3950c93875@huawei.com>

On Mon, Mar 22, 2021 at 10:02:38PM +0800, Keqian Zhu wrote:
> Hi Peter,

Hi, Keqian,

[...]

> You emphasize that dirty ring is a "Thread-local buffers", but dirty bitmap is global,
> but I don't see it has optimization about "locking" compared to dirty bitmap.
> 
> The thread-local means that vCPU can flush hardware buffer into dirty ring without
> locking, but for bitmap, vCPU can also use atomic set to mark dirty without locking.
> Maybe I miss something?

Yes, atomic ops let us mark pages dirty without explicit locking as you said,
but afaiu atomics are already expensive, since at least on x86 I think they
need to lock the memory bus.  IIUC that'll become even slower as the number of
cores grows, as long as the cores share the memory bus.

The KVM dirty ring is per-vcpu, which means its metadata can be modified
locally without any atomic ops at all (but still, we'll need
READ_ONCE/WRITE_ONCE to guarantee ordering of memory accesses).  It should
scale better, especially on hosts that have lots of cores.
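
To illustrate the per-vcpu point, here is a toy Python model of a
single-producer/single-consumer ring.  It is only a sketch: the class and
field names are hypothetical and do not match the KVM ABI, and Python cannot
express the READ_ONCE/WRITE_ONCE ordering that real code would need.  What it
shows is that each index has exactly one writer, so nothing is shared between
vcpus.

```python
# Toy model only: names are hypothetical, not the KVM dirty ring ABI.
class VcpuDirtyRing:
    """Single-producer/single-consumer ring owned by one vCPU."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0   # advanced only by the consumer (the reaper)
        self.tail = 0   # advanced only by the producer (the owning vCPU)

    def full(self):
        return self.tail - self.head == self.size

    def push(self, gfn):
        """Called by the owning vCPU; no other writer touches `tail`."""
        if self.full():
            return False            # caller has to wait for the reaper
        self.slots[self.tail % self.size] = gfn
        self.tail += 1              # publish only after the slot is written
        return True

    def reap(self):
        """Called by the reaper; no other writer touches `head`."""
        out = []
        while self.head != self.tail:
            out.append(self.slots[self.head % self.size])
            self.head += 1
        return out


rings = [VcpuDirtyRing(size=4) for _ in range(2)]   # one ring per vCPU
rings[0].push(0x10)
rings[0].push(0x11)
rings[1].push(0x20)
collected = [ring.reap() for ring in rings]
```

With a shared bitmap, by contrast, every vcpu would write into the same words,
which is where the atomic set comes in.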

> 
> The second question is that you observed longer migration time (55s->73s) when guest
> has 24G ram and dirty rate is 800M/s. I am not clear about the reason. As with dirty
> ring enabled, Qemu can get dirty info faster which means it handles dirty page more
> quick, and guest can be throttled which means dirty page is generated slower. What's
> the rationale for the longer migration time?

Because the dirty ring is more sensitive to dirty rate, while the dirty bitmap
is more sensitive to memory footprint.  In the above case of 24G mem + 800MB/s
dirty rate, the dirty bitmap seems to be more efficient: collecting the dirty
bitmap of 24G mem (24G/4K/8=0.75MB) for each migration cycle is fast enough.
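
The arithmetic behind that comparison can be sketched as follows.  This is a
back-of-the-envelope calculation, assuming binary units (GiB/MiB), 4K pages,
and that every dirtied page produces one new ring entry, which is an upper
bound; the helper names are made up for illustration.

```python
GiB = 1 << 30
MiB = 1 << 20
PAGE_SIZE = 4096          # 4K pages, as in the figures above


def dirty_bitmap_bytes(mem_bytes, page_size=PAGE_SIZE):
    """One bit per guest page, rounded up to whole bytes."""
    pages = (mem_bytes + page_size - 1) // page_size
    return (pages + 7) // 8


def dirty_pages_per_second(dirty_rate_bytes, page_size=PAGE_SIZE):
    """Ring entries per second if every dirtied page is a new entry."""
    return dirty_rate_bytes // page_size


bitmap = dirty_bitmap_bytes(24 * GiB)          # depends on footprint only
entries = dirty_pages_per_second(800 * MiB)    # depends on dirty rate only
print(f"bitmap per sync: {bitmap / MiB:.2f} MiB")
print(f"ring entries per second: {entries}")
```

The bitmap cost is fixed by the guest size, while the number of ring entries
to process grows with how fast the guest dirties memory.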

Not to mention that the current implementation of dirty ring in QEMU is not
complete - we still have two more layers of dirty bitmap, so it's actually a
mixture of dirty bitmap and dirty ring.  This series is more like a POC of the
dirty ring interface, so that QEMU is able to run on the KVM dirty ring.
E.g., we won't have the hang issue when getting dirty pages since it's totally
async; however we'll still have some legacy dirty bitmap issues, e.g. memory
consumption of the userspace dirty bitmaps is still linear in memory footprint.

Moreover, IMHO another important feature that the dirty ring provides is
actually the full-exit, where we can pause a vcpu when it dirties too fast,
while other vcpus won't be affected.  That's something I really wanted to POC
too but I don't have enough time.  I think it's a worthwhile project for the
future to really make the full-exit throttle vcpus; then ideally we'll remove
all the dirty bitmaps in QEMU as long as dirty ring is on.
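
As a rough illustration of why a full ring throttles only the vcpu that
dirties too fast, here is a toy tick-based simulation.  It is not how KVM or
QEMU implement anything: the function, its parameters, and the numbers are all
invented for the example, and the reaper is modelled as draining every ring at
a fixed interval.

```python
def simulate(dirty_per_tick, ring_size, reap_every, ticks):
    """Return how many ticks each vCPU spent stalled on a full ring."""
    used = [0] * len(dirty_per_tick)      # entries pending in each ring
    stalled = [0] * len(dirty_per_tick)
    for tick in range(1, ticks + 1):
        for cpu, rate in enumerate(dirty_per_tick):
            if used[cpu] + rate > ring_size:
                stalled[cpu] += 1         # ring full: only this vCPU waits
            else:
                used[cpu] += rate
        if tick % reap_every == 0:
            used = [0] * len(used)        # reaper drains every ring
    return stalled


# vCPU 0 dirties slowly, vCPU 1 dirties fast; both have the same ring size.
stalled = simulate(dirty_per_tick=[1, 50], ring_size=100,
                   reap_every=10, ticks=100)
```

In this model the slow vcpu never stalls, while the fast one spends most of
its ticks waiting for the reaper.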

So I'd say the number I got at that time is not really helping a lot - as you
can see, for small VMs it won't make things faster.  Maybe a bit more
efficient?  I can't tell.  Design-wise it actually still looks better.  However
dirty logging still has good reason to be the default interface we use for
small vms, imho.

> 
> PS: As the dirty ring is still converted into dirty_bitmap of kvm_slot, so the
> "get dirty info faster" maybe not true. :-(

We can get dirty info faster even now, I think, because previously we only did
KVM_GET_DIRTY_LOG once per migration iteration, which could take tens of
seconds for the VM mentioned above with 24G and 800MB/s dirty rate.  Dirty ring
is fully async; we'll get that after the reaper thread timeout.  However I must
also confess that "get dirty info faster" doesn't help us a lot with anything
yet, afaict, compared to full-featured dirty logging with clear dirty log and
so on.

Hope the above helps.

Thanks,

-- 
Peter Xu


