From: Jim Mattson <jmattson@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Sean Christopherson <seanjc@google.com>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	x86@kernel.org, Vitaly Kuznetsov <vkuznets@redhat.com>,
	Joerg Roedel <joro@8bytes.org>,
	linux-kernel@vger.kernel.org, Wanpeng Li <wanpengli@tencent.com>
Subject: Re: [PATCH v3 4/7] KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on
Date: Mon, 21 Mar 2022 14:59:53 -0700
Message-ID: <CALMp9eSUSexhPWMWXE1HpSD+movaYcdge_J95LiLCnJyMEp3WA@mail.gmail.com>
In-Reply-To: <abe8584fa3691de1d6ae6c6617b8ea750b30fd1c.camel@redhat.com>

On Mon, Mar 21, 2022 at 2:36 PM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> On Wed, 2022-03-09 at 11:07 -0800, Jim Mattson wrote:
> > On Wed, Mar 9, 2022 at 10:47 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > On 3/9/22 19:35, Jim Mattson wrote:
> > > > I didn't think pause filtering was virtualizable, since the value of
> > > > the internal counter isn't exposed on VM-exit.
> > > >
> > > > On bare metal, for instance, assuming the hypervisor doesn't intercept
> > > > CPUID, the following code would quickly trigger a PAUSE #VMEXIT with
> > > > the filter count set to 2.
> > > >
> > > > 1:
> > > > pause
> > > > cpuid
> > > > jmp 1b
> > > >
> > > > Since L0 intercepts CPUID, however, L2 will exit to L0 on each loop
> > > > iteration, and when L0 resumes L2, the internal counter will be set to
> > > > 2 again. L1 will never see a PAUSE #VMEXIT.
> > > >
> > > > How do you handle this?
> > > >
> > >
> > > I would expect that the same would happen on an SMI or a host interrupt.
> > >
> > >         1:
> > >         pause
> > >         outb %al, $0xb2
> > >         jmp 1b
> > >
> > > In general a PAUSE vmexit will mostly benefit the VM that is pausing, so
> > > having a partial implementation would be better than disabling it
> > > altogether.
> >
> > Indeed, the APM does say, "Certain events, including SMI, can cause
> > the internal count to be reloaded from the VMCB." However, expanding
> > that set of events so much that some pause loops will *never* trigger
> > a #VMEXIT seems problematic. If the hypervisor knew that the PAUSE
> > filter may not be triggered, it could always choose to exit on every
> > PAUSE.
> >
> > Having a partial implementation is only better than disabling it
> > altogether if the L2 pause loop doesn't contain a hidden #VMEXIT to
> > L0.
> >
>
> Hi!
>
> You bring up a very valid point, which I didn't think about.
>
> However, after thinking about this, I think that in practice
> it isn't a showstopper for exposing this feature to the guest.
>
>
> This is what I am thinking:
>
> First, let's assume that L2 is malicious. In this case it can no doubt
> craft a loop that will never VM-exit on PAUSE.
> But that isn't a problem: the guest could just as well have used NOP,
> which cannot be intercepted anyway, so no harm is done.
>
> Now let's assume a non-malicious L2:
>
>
> First of all, the problem can only happen when a VM exit is intercepted by L0
> and not by L1. Neither of the above cases usually meets this criterion, since L1
> is highly likely to intercept both CPUID and I/O port access. It is also highly
> unlikely to allow L2 direct access to L1's MMIO ranges.
>
> Overall, there are very few cases of a deterministic VM exit that is intercepted
> by L0 but not by L1. If that happens, L1 will not catch the PAUSE loop,
> which is not much different from failing to catch it because of unsuitable
> thresholds.
>
> Also note that this is only an optimization: because of the count and threshold,
> it is not guaranteed to catch all PAUSE loops. In fact, the hypervisor has
> to guess these values and update them in an attempt to catch as many such
> loops as it can.
>
> I think that overall it is OK to expose this feature to the guest,
> and it should even improve performance in some cases, since currently
> nested KVM, at least, intercepts every PAUSE otherwise.

Can I at least request that this behavior be documented as a KVM
virtual CPU erratum?
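
For reference, the behavior under discussion can be sketched as a toy
model. This is only an illustration and makes several assumptions: it
ignores the filter threshold entirely, it treats the internal counter as
"decrement on each PAUSE, exit when it drops below zero", and the names
run_loop, filter_count, and reload_on_cpuid are hypothetical, not KVM or
hardware identifiers.

```python
def run_loop(filter_count, iterations, reload_on_cpuid):
    """Simulate the 'pause; cpuid; jmp' loop from the thread.

    Returns the 1-based iteration at which a PAUSE #VMEXIT would fire,
    or None if it never fires within `iterations`.
    """
    counter = filter_count  # assumed: loaded from the VMCB on VMRUN
    for i in range(1, iterations + 1):
        # pause: assumed to decrement the internal counter and to
        # trigger the exit once the counter drops below zero
        counter -= 1
        if counter < 0:
            return i
        # cpuid: if L0 intercepts it, L2 exits to L0 and is resumed,
        # which reloads the internal counter
        if reload_on_cpuid:
            counter = filter_count
    return None


# CPUID not intercepted: the counter runs down and the exit fires.
bare_metal = run_loop(filter_count=2, iterations=1000, reload_on_cpuid=False)

# CPUID intercepted by L0: the counter is reloaded on every iteration.
nested = run_loop(filter_count=2, iterations=1000, reload_on_cpuid=True)
```

Under these assumptions the first call reports an exit after a few
iterations, while the second returns None no matter how many iterations
are allowed, which is the case where L1 never sees a PAUSE #VMEXIT.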

>
> Best regards,
>         Maxim Levitsky
>
>
>
>
