All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Mathias Krause <minipli@grsecurity.net>
Cc: Greg KH <greg@kroah.com>,
	kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v4 0/6] KVM: MMU: performance tweaks for heavy CR0.WP users
Date: Fri, 14 Apr 2023 09:49:28 -0700	[thread overview]
Message-ID: <ZDmEGM+CgYpvDLh6@google.com> (raw)
In-Reply-To: <026dcbfe-a306-85c3-600e-17cae3d3b7c5@grsecurity.net>

+Jeremi

On Fri, Apr 14, 2023, Mathias Krause wrote:
> On 06.04.23 15:22, Mathias Krause wrote:
> > On 06.04.23 04:25, Sean Christopherson wrote:
> >> These are quite risky to backport.  E.g. we botched patch 6[*], and my initial
> >> fix also had a subtle bug.  There have also been quite a few KVM MMU changes since
> >> 5.4, so it's possible that an edge case may exist in 5.4 that doesn't exist in
> >> mainline.
> > 
> > I totally agree. Getting the changes to work with older kernels needs
> > more work. The MMU role handling was refactored in 5.14 and down to 5.4
> > it differs even more, so backports to earlier kernels definitely needs
> > more care.
> > 
> > My plan would be to limit backporting of the whole series to kernels
> > down to 5.15 (maybe 5.10 if it turns out to be doable) and for kernels
> > before that only without patch 6. That would leave out the problematic
> > change but still give us the benefits of dropping the needless mmu
> > unloads for only toggling CR0.WP in the VM. This already helps us a lot!
> 
> To back up the "helps us a lot" with some numbers, here are the results
> I got from running the 'ssdd 10 50000' micro-benchmark on the backports
> I did, running on a grsecurity L1 VM (host is a vanilla kernel, as
> stated below; runtime in seconds, lower is better):
> 
>                           legacy     TDP    shadow
>     Linux v5.4.240          -        8.87s   56.8s
>     + patches               -        5.84s   55.4s

I believe "legacy" and "TDP" are flip-flopped, the TDP MMU does't exist in v5.4.

>     Linux v5.10.177       10.37s    88.7s    69.7s
>     + patches              4.88s     4.92s   70.1s
> 
>     Linux v5.15.106        9.94s    66.1s    64.9s
>     + patches              4.81s     4.79s   64.6s
> 
>     Linux v6.1.23          7.65s    8.23s    68.7s
>     + patches              3.36s    3.36s    69.1s
> 
>     Linux v6.2.10          7.61s    7.98s    68.6s
>     + patches              3.37s    3.41s    70.2s
> 
> I guess we can grossly ignore the shadow MMU numbers, beside noting them
> to regress from v5.4 to v5.10 (something to investigate?). The backports
> don't help (much) for shadow MMU setups and the flux in the measurements
> is likely related to the slab allocations involved.
> 
> Another unrelated data point is that TDP MMU is really broken for our
> use case on v5.10 and v5.15 -- it's even slower that shadow paging!
> 
> OTOH, the backports give nice speed-ups, ranging from ~2.2 times faster
> for pure EPT (legacy) MMU setups up to 18(!!!) times faster for TDP MMU
> on v5.10.

Anyone that's enabling the TDP MMU on v5.10 is on their own, we didn't enable the
TDP MMU by default until v5.14 for very good reasons.

> I backported the whole series down to v5.10 but left out the CR0.WP
> guest owning patch+fix for v5.4 as the code base is too different to get
> all the nuances right, as Sean already hinted. However, even this
> limited backport provides a big performance fix for our use case!

As a compromise of sorts, I propose that we disable the TDP MMU by default on v5.15,
and backport these fixes to v6.1.  v5.15 and earlier won't get "ludicrous speed", but
I think that's perfectly acceptable since KVM has had the suboptimal behavior
literally since EPT/NPT support was first added.

I'm comfortable backporting to v6.1 as that is recent enough, and there weren't
substantial MMU changes between v6.1 and v6.3 in this area.  I.e. I have a decent
level of confidence that we aren't overlooking some subtle dependency.

For v5.15, I am less confident in the safety of a backport, and more importantly,
I think we should disable the TDP MMU by default to mitigate the underlying flaw
that makes the 18x speedup possible.  That flaw is that KVM can end up freeing and
rebuilding TDP MMU roots every time CR0.WP is toggled or a vCPU transitions to/from
SMM.

We mitigated the CR0.WP case between v5.15 and v6.1[1], which is why v6.1 doesn't
exhibit the same pain as v5.10, but Jeremi discovered that the SMM case badly affects
KVM-on-HyperV[2], e.g. when lauching KVM guests using WSL.  I posted a fix[3] to
finally resolve the underlying bug, but as Jeremi discovered[4], backporting the fix
to v5.15 is going to be gnarly, to say the least.  It'll be far worse than backporting
these CR0.WP patches, and maybe even infeasible without a large scale rework (no thanks).

Anyone that will realize meaningful benefits from the TDP MMU is all but guaranteed
to be rolling their own kernels, i.e. can do the backports themselves if they want
to use a v5.15 based kernel.  The big selling point of the TDP MMU is that it scales
better to hundreds of vCPUs, particularly when live migrating such VMs.  I highly
doubt that anyone running a stock kernel is running 100+ vCPU VMs, let alone trying
to live migrate them.

[1] https://lkml.kernel.org/r/20220209170020.1775368-1-pbonzini%40redhat.com
[2] https://lore.kernel.org/all/959c5bce-beb5-b463-7158-33fc4a4f910c@linux.microsoft.com
[3] https://lore.kernel.org/all/20230413231251.1481410-1-seanjc@google.com
[4] https://lore.kernel.org/all/7332d846-fada-eb5c-6068-18ff267bd37f@linux.microsoft.com

  reply	other threads:[~2023-04-14 16:49 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-22  1:37 [PATCH v4 0/6] KVM: MMU: performance tweaks for heavy CR0.WP users Mathias Krause
2023-03-22  1:37 ` [PATCH v4 1/6] KVM: x86/mmu: Avoid indirect call for get_cr3 Mathias Krause
2023-03-22  1:37 ` [PATCH v4 2/6] KVM: x86: Do not unload MMU roots when only toggling CR0.WP with TDP enabled Mathias Krause
2023-05-07  7:32   ` Robert Hoo
2023-05-08  9:30     ` Mathias Krause
2023-05-09  1:04       ` Robert Hoo
2023-03-22  1:37 ` [PATCH v4 3/6] KVM: x86: Ignore CR0.WP toggles in non-paging mode Mathias Krause
2023-03-22  1:37 ` [PATCH v4 4/6] KVM: x86: Make use of kvm_read_cr*_bits() when testing bits Mathias Krause
2023-03-22  1:37 ` [PATCH v4 5/6] KVM: x86/mmu: Fix comment typo Mathias Krause
2023-03-22  1:37 ` [PATCH v4 6/6] KVM: VMX: Make CR0.WP a guest owned bit Mathias Krause
2023-03-27  8:33   ` Xiaoyao Li
2023-03-27  8:37     ` Mathias Krause
2023-03-27 13:48       ` Xiaoyao Li
2023-03-30  8:45   ` Mathias Krause
2023-03-30 17:12     ` Sean Christopherson
2023-03-30 20:15       ` Mathias Krause
2023-03-30 20:30         ` Mathias Krause
2023-03-30 20:36           ` Sean Christopherson
2023-03-30 20:33       ` Sean Christopherson
2023-03-30 20:55         ` Mathias Krause
2023-03-31 14:18           ` Mathias Krause
2023-03-22  7:41 ` [PATCH v4 0/6] KVM: MMU: performance tweaks for heavy CR0.WP users Mathias Krause
2023-03-23 22:50 ` Sean Christopherson
2023-03-25 11:39   ` Mathias Krause
2023-03-25 12:25     ` Greg KH
2023-04-06  2:25       ` Sean Christopherson
2023-04-06 13:22         ` Mathias Krause
2023-04-14  9:29           ` Mathias Krause
2023-04-14 16:49             ` Sean Christopherson [this message]
2023-04-14 20:09               ` Jeremi Piotrowski
2023-04-14 20:17                 ` Sean Christopherson
2023-05-02 17:38                   ` Jeremi Piotrowski
2023-05-08  9:19               ` Mathias Krause
2023-05-08 15:57                 ` Mathias Krause

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZDmEGM+CgYpvDLh6@google.com \
    --to=seanjc@google.com \
    --cc=greg@kroah.com \
    --cc=kvm@vger.kernel.org \
    --cc=minipli@grsecurity.net \
    --cc=pbonzini@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.