All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Tao Xu <tao3.xu@intel.com>,
	pbonzini@redhat.com, vkuznets@redhat.com, wanpengli@tencent.com,
	jmattson@google.com, joro@8bytes.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com, x86@kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] KVM: VMX: Enable Notify VM exit
Date: Tue, 3 Aug 2021 08:38:13 +0800	[thread overview]
Message-ID: <080602dc-f998-ec13-ddf9-42902aa477de@intel.com> (raw)
In-Reply-To: <YQgTPakbT+kCwMLP@google.com>

On 8/2/2021 11:46 PM, Sean Christopherson wrote:
> On Mon, Aug 02, 2021, Xiaoyao Li wrote:
>> On 7/31/2021 4:41 AM, Sean Christopherson wrote:
>>> On Tue, May 25, 2021, Tao Xu wrote:
>>>>    #endif /* __KVM_X86_VMX_CAPS_H */
>>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>>> index 4bceb5ca3a89..c0ad01c88dac 100644
>>>> --- a/arch/x86/kvm/vmx/vmx.c
>>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>>> @@ -205,6 +205,10 @@ module_param(ple_window_max, uint, 0444);
>>>>    int __read_mostly pt_mode = PT_MODE_SYSTEM;
>>>>    module_param(pt_mode, int, S_IRUGO);
>>>> +/* Default is 0, less than 0 (for example, -1) disables notify window. */
>>>> +static int __read_mostly notify_window;
>>>
>>> I'm not sure I like the idea of trusting ucode to select an appropriate internal
>>> threshold.  Unless the internal threshold is architecturally defined to be at
>>> least N nanoseconds or whatever, I think KVM should provide its own sane default.
>>> E.g. it's not hard to imagine a scenario where a ucode patch gets rolled out that
>>> adjusts the threshold and starts silently degrading guest performance.
>>
>> You mean when internal threshold gets smaller somehow, and cases
>> false-positive that leads unexpected VM exit on normal instruction? In this
>> case, we set increase the vmcs.notify_window in KVM.
> 
> Not while VMs are running though.
> 
>> I think there is no better to avoid this case if ucode changes internal
>> threshold. Unless KVM's default notify_window is bigger enough.
>>
>>> Even if the internal threshold isn't architecturally constrained, it would be very,
>>> very helpful if Intel could publish the per-uarch/stepping thresholds, e.g. to give
>>> us a ballpark idea of how agressive KVM can be before it risks false positives.
>>
>> Even Intel publishes the internal threshold, we still need to provide a
>> final best_value (internal + vmcs.notify_window). Then what's that value?
> 
> The ideal value would be high enough to guarantee there are zero false positives,
> yet low enough to prevent a malicious guest from causing instability in the host
> by blocking events for an extended duration.  The problem is that there's no
> magic answer for the threshold at which a blocked event would lead to system
> instability, and without at least a general idea of the internal value there's no
> answer at all.
> 
> IIRC, SGX instructions have a hard upper bound of 25k cycles before they have to
> check for pending interrupts, e.g. it's why EINIT is interruptible.  The 25k cycle
> limit is likely a good starting point for the combined minimum.  That's why I want
> to know the internal minimum; if the internal minimum is _guaranteed_ to be >25k,
> then KVM can be more aggressive with its default value.

OK. I will go internally to see if we can publish the internal threshold.

>> If we have an option for final best_value, then I think it's OK to just let
>> vmcs.notify_window = best_value. Then the true final value is best_value +
>> internal.
>>   - if it's a normal instruction, it should finish within best_value or
>> best_value + internal. So it makes no difference.
>>   - if it's an instruction in malicious case, it won't go to next instruction
>> whether wait for best_value or best_value + internal.
> 
> ...
> 
>>>> +
>>>>    	vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
>>>>    	vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
>>>>    	vmcs_write32(CR3_TARGET_COUNT, 0);           /* 22.2.1 */
>>>> @@ -5642,6 +5653,31 @@ static int handle_bus_lock_vmexit(struct kvm_vcpu *vcpu)
>>>>    	return 0;
>>>>    }
>>>> +static int handle_notify(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	unsigned long exit_qual = vmx_get_exit_qual(vcpu);
>>>> +
>>>> +	if (!(exit_qual & NOTIFY_VM_CONTEXT_INVALID)) {
>>>
>>> What does CONTEXT_INVALID mean?  The ISE doesn't provide any information whatsoever.
>>
>> It means whether the VM context is corrupted and not valid in the VMCS.
> 
> Well that's a bit terrifying.  Under what conditions can the VM context become
> corrupted?  E.g. if the context can be corrupted by an inopportune NOTIFY exit,
> then KVM needs to be ultra conservative as a false positive could be fatal to a
> guest.
> 

Short answer is no case will set the VM_CONTEXT_INVALID bit.

VM_CONTEXT_INVALID is so fatal and IMHO it won't be set for any 
inopportune NOTIFY exit.

  reply	other threads:[~2021-08-03  0:38 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-25  5:12 [PATCH v2] KVM: VMX: Enable Notify VM exit Tao Xu
2021-06-02 10:31 ` Vitaly Kuznetsov
2021-06-03  1:23   ` Tao Xu
2021-06-03 13:43     ` Vitaly Kuznetsov
2021-06-03  1:25   ` Xiaoyao Li
2021-06-03 13:35     ` Jim Mattson
2021-06-07  9:24       ` Xiaoyao Li
2021-06-03 13:52     ` Vitaly Kuznetsov
2021-06-07  9:23       ` Xiaoyao Li
2021-06-24  4:52 ` Tao Xu
2021-07-22  3:25 ` Xiaoyao Li
2021-07-30 20:41 ` Sean Christopherson
2021-08-02 12:53   ` Xiaoyao Li
2021-08-02 15:46     ` Sean Christopherson
2021-08-03  0:38       ` Xiaoyao Li [this message]
2021-09-02  9:28         ` Chenyi Qiang
2021-09-02 16:29           ` Sean Christopherson
2021-09-07 13:33             ` Xiaoyao Li
2021-09-09 18:47               ` Sean Christopherson
2021-09-10  7:39                 ` Xiaoyao Li
2021-09-10 17:55                   ` Sean Christopherson
2021-09-02 16:15         ` Sean Christopherson
2021-09-02 16:36           ` Sean Christopherson
2021-09-07 13:45             ` Xiaoyao Li
2021-09-09 18:59               ` Sean Christopherson
2021-09-13  2:58                 ` Xiaoyao Li
2021-10-15 18:29                   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=080602dc-f998-ec13-ddf9-42902aa477de@intel.com \
    --to=xiaoyao.li@intel.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tao3.xu@intel.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.