From: Sean Christopherson <seanjc@google.com>
To: Zdenek Kaspar <zkaspar82@gmail.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: Bad performance since 5.9-rc1
Date: Wed, 13 Jan 2021 12:17:19 -0800
Message-ID: <X/9VT6ZgLPZW3dxc@google.com>
In-Reply-To: <20210112121811.408e32fe.zkaspar82@gmail.com>

On Tue, Jan 12, 2021, Zdenek Kaspar wrote:
> On Tue, 22 Dec 2020 22:26:45 +0100
> Zdenek Kaspar <zkaspar82@gmail.com> wrote:
> 
> > On Tue, 22 Dec 2020 09:07:39 -0800
> > Sean Christopherson <seanjc@google.com> wrote:
> > 
> > > On Mon, Dec 21, 2020, Zdenek Kaspar wrote:
> > > > [  179.364305] WARNING: CPU: 0 PID: 369 at kvm_mmu_zap_oldest_mmu_pages+0xd1/0xe0 [kvm]
> > > > [  179.365415] Call Trace:
> > > > [  179.365443]  paging64_page_fault+0x244/0x8e0 [kvm]
> > > 
> > > This means the shadow page zapping is occurring because KVM is
> > > hitting the max number of allowed MMU shadow pages.  Can you
> > > provide your QEMU command line?  I can reproduce the performance
> > > degradation, but only by deliberately overriding the max number of
> > > MMU pages via `-machine kvm-shadow-mem` to be an absurdly low value.
> > > 
> > > > [  179.365596]  kvm_mmu_page_fault+0x376/0x550 [kvm]
> > > > [  179.365725]  kvm_arch_vcpu_ioctl_run+0xbaf/0x18f0 [kvm]
> > > > [  179.365772]  kvm_vcpu_ioctl+0x203/0x520 [kvm]
> > > > [  179.365938]  __x64_sys_ioctl+0x338/0x720
> > > > [  179.365992]  do_syscall_64+0x33/0x40
> > > > [  179.366013]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > It's one long line; I added "\" for mail readability:
> > 
> > qemu-system-x86_64 -machine type=q35,accel=kvm            \
> > -cpu host,host-cache-info=on -smp cpus=2,cores=2          \
> > -m size=1024 -global virtio-pci.disable-legacy=on         \
> > -global virtio-pci.disable-modern=off                     \
> > -device virtio-balloon                                    \
> > -device virtio-net,netdev=tap-build,mac=DE:AD:BE:EF:00:80 \
> > -object rng-random,filename=/dev/urandom,id=rng0          \
> > -device virtio-rng,rng=rng0                               \
> > -name build,process=qemu-build                            \
> > -drive file=/mnt/data/export/unix/kvm/build/openbsd-amd64.img,if=virtio,cache=none,format=raw,aio=native \
> > -netdev type=tap,id=tap-build,vhost=on                    \
> > -serial none                                              \
> > -parallel none                                            \
> > -monitor unix:/dev/shm/kvm-build.sock,server,nowait       \
> > -enable-kvm -daemonize -runas qemu
> > 
> > Z.
> 
> BTW, v5.11-rc3 with kvm-shadow-mem=1073741824 seems OK.
>
> Just curious what v5.8 does

Aha!  Figured it out.  The commit you bisected to in v5.9 broke the zapping
order.  The list of MMU pages is a FIFO list, meaning KVM adds entries to
the head, not the tail.  I botched the zapping flow and used
for_each instead of for_each_reverse, which meant KVM would zap the _newest_
pages instead of the _oldest_ pages.  So once a VM hit its limit, KVM would
constantly zap the shadow pages it just allocated.

This should resolve the performance regression, or at least make it far less
painful.  You may still see some performance degradation due to other changes
in the zapping, e.g. more aggressive recursive zapping.  If that's the case,
I can explore other tweaks, e.g. skipping higher levels when possible.  I'll
get a proper patch posted later today.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c478904af518..2c6e6fdb26ad 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2417,7 +2417,7 @@ static unsigned long kvm_mmu_zap_oldest_mmu_pages(struct kvm *kvm,
                return 0;

 restart:
-       list_for_each_entry_safe(sp, tmp, &kvm->arch.active_mmu_pages, link) {
+       list_for_each_entry_safe_reverse(sp, tmp, &kvm->arch.active_mmu_pages, link) {
                /*
                 * Don't zap active root pages, the page itself can't be freed
                 * and zapping it will just force vCPUs to realloc and reload.
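
To illustrate the list semantics, here's a toy userspace sketch (not kernel
code, it just mimics head-insertion plus a forward walk): new entries land at
the head of the list, so the tail holds the oldest entries, and only a reverse
walk visits oldest-first.

#include <stdio.h>

/* Stand-in for a shadow page on an "active pages" list. */
struct sp {
	int id;
	struct sp *next;
};

int main(void)
{
	static struct sp pages[4];
	struct sp *head = NULL;

	/* "Allocate" pages 0..3; like list_add(), each new page goes at the head. */
	for (int i = 0; i < 4; i++) {
		pages[i].id = i;
		pages[i].next = head;
		head = &pages[i];
	}

	/* Forward walk (analogous to list_for_each_entry_safe): visits the
	 * NEWEST pages first, i.e. prints 3 2 1 0 -- the buggy zap order.
	 * Zapping the OLDEST pages first requires starting from the tail,
	 * which is what list_for_each_entry_safe_reverse() provides. */
	for (struct sp *p = head; p; p = p->next)
		printf("%d ", p->id);
	printf("\n");
	return 0;
}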

Side topic, I still can't figure out how on earth your guest kernel is hitting
the default max number of MMU pages.  Even with large pages completely disabled, PTI
enabled, multiple guest processes running, etc... I hit OOM in the guest before
the host's shadow page limit kicks in.  I had to force the limit down to 25% of
the default to reproduce the bad behavior.  All I can figure is that BSD has a
substantially different paging scheme than Linux.
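
For reference, my rough math for the default limit (a standalone sketch; the
constants are what I recall from arch/x86 KVM's default-limit calculation
around this time, so double-check them against your tree): roughly 2% of guest
memory pages with a floor of 64, which for your 1 GiB guest is only ~5242
shadow pages.

#include <stdio.h>

/* Approximations of the KVM constants (verify against your kernel). */
#define KVM_PERMILLE_MMU_PAGES   20UL   /* default limit ~= 2% of guest pages */
#define KVM_MIN_ALLOC_MMU_PAGES  64UL

int main(void)
{
	unsigned long guest_bytes  = 1024UL << 20;         /* -m 1024 */
	unsigned long nr_pages     = guest_bytes / 4096;   /* 262144 guest pages */
	unsigned long nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;

	if (nr_mmu_pages < KVM_MIN_ALLOC_MMU_PAGES)
		nr_mmu_pages = KVM_MIN_ALLOC_MMU_PAGES;

	printf("default shadow page limit: %lu\n", nr_mmu_pages);   /* 5242 */
	return 0;
}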

> so by any chance, is there a command to set the kvm-shadow-mem value via the qemu monitor?
> 
> Z.


Thread overview: 11+ messages
2020-11-19  3:05 Bad performance since 5.9-rc1 Zdenek Kaspar
2020-12-01  6:35 ` Zdenek Kaspar
2020-12-18 19:33   ` Zdenek Kaspar
2020-12-21 19:41     ` Sean Christopherson
2020-12-21 21:13       ` Zdenek Kaspar
2020-12-22 17:07         ` Sean Christopherson
2020-12-22 21:26           ` Zdenek Kaspar
2021-01-12 11:18             ` Zdenek Kaspar
2021-01-13 20:17               ` Sean Christopherson [this message]
2021-01-13 22:17                 ` Zdenek Kaspar
2020-12-02  0:31 ` Sean Christopherson
