linux-kernel.vger.kernel.org archive mirror
From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Peter Shier <pshier@google.com>,
	David Matlack <dmatlack@google.com>,
	Mingwei Zhang <mizhang@google.com>,
	Yulei Zhang <yulei.kernel@gmail.com>,
	Wanpeng Li <kernellwp@gmail.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Kai Huang <kai.huang@intel.com>,
	Keqian Zhu <zhukeqian1@huawei.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [RFC 00/19] KVM: x86/mmu: Optimize disabling dirty logging
Date: Mon, 15 Nov 2021 13:24:13 -0800
Message-ID: <CANgfPd8_LhPe5fngddL2b=0cSeDwO5pNUGAtboioCMDhKT8Vnw@mail.gmail.com>
In-Reply-To: <20211110223010.1392399-1-bgardon@google.com>

On Wed, Nov 10, 2021 at 2:30 PM Ben Gardon <bgardon@google.com> wrote:
>
> Currently disabling dirty logging with the TDP MMU is extremely slow.
> On a 96 vCPU / 96G VM it takes ~45 seconds to disable dirty logging
> with the TDP MMU, as opposed to ~3.5 seconds with the legacy MMU. This
> series optimizes TLB flushes and introduces in-place large page
> promotion, to bring the disable dirty log time down to ~2 seconds.
>
> Testing:
> Ran KVM selftests and kvm-unit-tests on an Intel Skylake. This
> series introduced no new failures.
>
> Performance:
> To collect these results I needed to apply Mingwei's patch
> "selftests: KVM: align guest physical memory base address to 1GB"
> https://lkml.org/lkml/2021/8/29/310
> David Matlack is going to send out an updated version of that patch soon.
>
> Without this series, TDP MMU:
> > ./dirty_log_perf_test -v 96 -s anonymous_hugetlb_1gb
> Test iterations: 2
> Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
> guest physical test memory offset: 0x3fe7c0000000
> Populate memory time: 10.966500447s
> Enabling dirty logging time: 0.002068737s
>
> Iteration 1 dirty memory time: 0.047556280s
> Iteration 1 get dirty log time: 0.001253914s
> Iteration 1 clear dirty log time: 0.049716661s
> Iteration 2 dirty memory time: 3.679662016s
> Iteration 2 get dirty log time: 0.000659546s
> Iteration 2 clear dirty log time: 1.834329322s
> Disabling dirty logging time: 45.738439510s
> Get dirty log over 2 iterations took 0.001913460s. (Avg 0.000956730s/iteration)
> Clear dirty log over 2 iterations took 1.884045983s. (Avg 0.942022991s/iteration)
>
> Without this series, Legacy MMU:
> > ./dirty_log_perf_test -v 96 -s anonymous_hugetlb_1gb
> Test iterations: 2
> Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
> guest physical test memory offset: 0x3fe7c0000000
> Populate memory time: 12.664750666s
> Enabling dirty logging time: 0.002025510s
>
> Iteration 1 dirty memory time: 0.046240875s
> Iteration 1 get dirty log time: 0.001864342s
> Iteration 1 clear dirty log time: 0.170243637s
> Iteration 2 dirty memory time: 31.571088701s
> Iteration 2 get dirty log time: 0.000626245s
> Iteration 2 clear dirty log time: 1.294817729s
> Disabling dirty logging time: 3.566831573s
> Get dirty log over 2 iterations took 0.002490587s. (Avg 0.001245293s/iteration)
> Clear dirty log over 2 iterations took 1.465061366s. (Avg 0.732530683s/iteration)
>
> With this series, TDP MMU:
> > ./dirty_log_perf_test -v 96 -s anonymous_hugetlb_1gb
> Test iterations: 2
> Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
> guest physical test memory offset: 0x3fe7c0000000
> Populate memory time: 12.016653537s
> Enabling dirty logging time: 0.001992860s
>
> Iteration 1 dirty memory time: 0.046701599s
> Iteration 1 get dirty log time: 0.001214806s
> Iteration 1 clear dirty log time: 0.049519923s
> Iteration 2 dirty memory time: 3.581931268s
> Iteration 2 get dirty log time: 0.000621383s
> Iteration 2 clear dirty log time: 1.894597059s
> Disabling dirty logging time: 1.950542092s
> Get dirty log over 2 iterations took 0.001836189s. (Avg 0.000918094s/iteration)
> Clear dirty log over 2 iterations took 1.944116982s. (Avg 0.972058491s/iteration)
>
> Patch breakdown:
> Patch 1 is a fix for a bug in the way the TDP MMU issues TLB flushes
> Patches 2-5 eliminate many unnecessary TLB flushes through better batching
> Patches 6-12 remove the need for a vCPU pointer to make_spte
> Patches 13-18 are small refactors in preparation for patch 19
> Patch 19 implements in-place large page promotion when disabling dirty logging
>
> Ben Gardon (19):
>   KVM: x86/mmu: Fix TLB flush range when handling disconnected pt
>   KVM: x86/mmu: Batch TLB flushes for a single zap
>   KVM: x86/mmu: Factor flush and free up when zapping under MMU write
>     lock
>   KVM: x86/mmu: Yield while processing disconnected_sps
>   KVM: x86/mmu: Remove redundant flushes when disabling dirty logging
>   KVM: x86/mmu: Introduce vcpu_make_spte
>   KVM: x86/mmu: Factor wrprot for nested PML out of make_spte
>   KVM: x86/mmu: Factor mt_mask out of make_spte
>   KVM: x86/mmu: Remove need for a vcpu from
>     kvm_slot_page_track_is_active
>   KVM: x86/mmu: Remove need for a vcpu from mmu_try_to_unsync_pages
>   KVM: x86/mmu: Factor shadow_zero_check out of make_spte
>   KVM: x86/mmu: Replace vcpu argument with kvm pointer in make_spte
>   KVM: x86/mmu: Factor out the meat of reset_tdp_shadow_zero_bits_mask
>   KVM: x86/mmu: Propagate memslot const qualifier
>   KVM: x86/MMU: Refactor vmx_get_mt_mask
>   KVM: x86/mmu: Factor out part of vmx_get_mt_mask which does not depend
>     on vcpu
>   KVM: x86/mmu: Add try_get_mt_mask to x86_ops
>   KVM: x86/mmu: Make kvm_is_mmio_pfn usable outside of spte.c
>   KVM: x86/mmu: Promote pages in-place when disabling dirty logging
>
>  arch/x86/include/asm/kvm-x86-ops.h    |   1 +
>  arch/x86/include/asm/kvm_host.h       |   2 +
>  arch/x86/include/asm/kvm_page_track.h |   6 +-
>  arch/x86/kvm/mmu/mmu.c                |  45 +++---
>  arch/x86/kvm/mmu/mmu_internal.h       |   6 +-
>  arch/x86/kvm/mmu/page_track.c         |   8 +-
>  arch/x86/kvm/mmu/paging_tmpl.h        |   6 +-
>  arch/x86/kvm/mmu/spte.c               |  43 +++--
>  arch/x86/kvm/mmu/spte.h               |  17 +-
>  arch/x86/kvm/mmu/tdp_mmu.c            | 217 +++++++++++++++++++++-----
>  arch/x86/kvm/mmu/tdp_mmu.h            |   5 +-
>  arch/x86/kvm/svm/svm.c                |   8 +
>  arch/x86/kvm/vmx/vmx.c                |  40 +++--
>  include/linux/kvm_host.h              |  10 +-
>  virt/kvm/kvm_main.c                   |  12 +-
>  15 files changed, 302 insertions(+), 124 deletions(-)
>
> --
> 2.34.0.rc0.344.g81b53c2807-goog
>

In a conversation with Sean today, he expressed interest in taking
over patches 2-4 from this series, as they conflict with another fix
he is working on.
I'll leave it to him to incorporate the feedback on these patches.
In the meantime, I've sent another iteration of patch 1 from this
series (a standalone bug fix) and will work on putting together
another version of patches 5-19.
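
For reference, the improvement implied by the numbers quoted above can be
worked out with a short script. This is only an illustrative sketch: the
script and its names (disable_times, speedup) are not part of the series,
and the inputs are just the three "Disabling dirty logging time" values
copied from the quoted test output. Each value comes from a single run, so
the ratios are rough.

```python
# Illustrative only: these names are hypothetical and not part of the series.
# Times (in seconds) are the "Disabling dirty logging time" values copied
# from the quoted dirty_log_perf_test output.
disable_times = {
    "tdp_mmu_without_series": 45.738439510,
    "legacy_mmu_without_series": 3.566831573,
    "tdp_mmu_with_series": 1.950542092,
}


def speedup(before: float, after: float) -> float:
    """Return how many times shorter `after` is than `before`."""
    return before / after


# TDP MMU with the series applied vs. TDP MMU without it.
tdp_speedup = speedup(
    disable_times["tdp_mmu_without_series"],
    disable_times["tdp_mmu_with_series"],
)

# TDP MMU with the series applied vs. the legacy MMU without it.
vs_legacy = speedup(
    disable_times["legacy_mmu_without_series"],
    disable_times["tdp_mmu_with_series"],
)

print(f"TDP MMU, with vs. without the series: {tdp_speedup:.1f}x")
print(f"TDP MMU with the series vs. legacy MMU: {vs_legacy:.1f}x")
```

On these numbers the TDP MMU disables dirty logging roughly 23x faster with
the series applied, and somewhat under 2x faster than the legacy MMU did
without it.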

