KVM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] KVM: x86: drop erroneous mmu_check_root() from fast_pgd_switch()
@ 2020-06-30 10:07 Vitaly Kuznetsov
  2020-07-01  3:24 ` Junaid Shahid
  0 siblings, 1 reply; 3+ messages in thread
From: Vitaly Kuznetsov @ 2020-06-30 10:07 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Junaid Shahid,
	linux-kernel

Undesired triple fault gets injected to L1 guest on SVM when L2 is
launched with certain CR3 values. It seems the mmu_check_root()
check in fast_pgd_switch() is wrong: first of all we don't know
if 'new_pgd' is a GPA or a nested GPA and, in case it is a nested
GPA, we can't check it with kvm_is_visible_gfn().

The problematic code path is:
nested_svm_vmrun()
  ...
  nested_prepare_vmcb_save()
    kvm_set_cr3(..., nested_vmcb->save.cr3)
      kvm_mmu_new_pgd()
        ...
        mmu_check_root() -> TRIPLE FAULT

The mmu_check_root() check in fast_pgd_switch() seems to be
superfluous even for non-nested case: when GPA is outside of the
visible range cached_root_available() will fail for non-direct
roots (as we can't have a matching one on the list) and we don't
seem to care for direct ones.

Also, raising #TF immediately when a non-existent GFN is written to CR3
doesn't seem to mach architecture behavior.

Fixes: 7c390d350f8b ("kvm: x86: Add fast CR3 switch code path")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
- The patch fixes the immediate issue and doesn't seem to break any
  tests even with shadow PT but I'm not sure I properly understood
  why the check was there in the first place. Please review!
---
 arch/x86/kvm/mmu/mmu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 76817d13c86e..286c74d2ae8d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4277,8 +4277,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	 */
 	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
 	    mmu->root_level >= PT64_ROOT_4LEVEL)
-		return !mmu_check_root(vcpu, new_pgd >> PAGE_SHIFT) &&
-		       cached_root_available(vcpu, new_pgd, new_role);
+		return cached_root_available(vcpu, new_pgd, new_role);
 
 	return false;
 }
-- 
2.25.4


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: x86: drop erroneous mmu_check_root() from fast_pgd_switch()
  2020-06-30 10:07 [PATCH] KVM: x86: drop erroneous mmu_check_root() from fast_pgd_switch() Vitaly Kuznetsov
@ 2020-07-01  3:24 ` Junaid Shahid
  2020-07-01  7:14   ` Vitaly Kuznetsov
  0 siblings, 1 reply; 3+ messages in thread
From: Junaid Shahid @ 2020-07-01  3:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, linux-kernel

On 6/30/20 3:07 AM, Vitaly Kuznetsov wrote:
> Undesired triple fault gets injected to L1 guest on SVM when L2 is
> launched with certain CR3 values. It seems the mmu_check_root()
> check in fast_pgd_switch() is wrong: first of all we don't know
> if 'new_pgd' is a GPA or a nested GPA and, in case it is a nested
> GPA, we can't check it with kvm_is_visible_gfn().
> 
> The problematic code path is:
> nested_svm_vmrun()
>    ...
>    nested_prepare_vmcb_save()
>      kvm_set_cr3(..., nested_vmcb->save.cr3)
>        kvm_mmu_new_pgd()
>          ...
>          mmu_check_root() -> TRIPLE FAULT
> 
> The mmu_check_root() check in fast_pgd_switch() seems to be
> superfluous even for non-nested case: when GPA is outside of the
> visible range cached_root_available() will fail for non-direct
> roots (as we can't have a matching one on the list) and we don't
> seem to care for direct ones.
> 
> Also, raising #TF immediately when a non-existent GFN is written to CR3
> doesn't seem to mach architecture behavior.
> 
> Fixes: 7c390d350f8b ("kvm: x86: Add fast CR3 switch code path")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> - The patch fixes the immediate issue and doesn't seem to break any
>    tests even with shadow PT but I'm not sure I properly understood
>    why the check was there in the first place. Please review!
> ---
>   arch/x86/kvm/mmu/mmu.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 76817d13c86e..286c74d2ae8d 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4277,8 +4277,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>   	 */
>   	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
>   	    mmu->root_level >= PT64_ROOT_4LEVEL)
> -		return !mmu_check_root(vcpu, new_pgd >> PAGE_SHIFT) &&
> -		       cached_root_available(vcpu, new_pgd, new_role);
> +		return cached_root_available(vcpu, new_pgd, new_role);
>   
>   	return false;
>   }
> 

The check does seem superfluous, so should be ok to remove. Though I think that fast_pgd_switch() really should be getting only L1 GPAs. Otherwise, there could be confusion between the same GPAs from two different L2s.

IIUC, at least on Intel, only L1 CR3s (including shadow L1 CR3s for L2) or L1 EPTPs should get to fast_pgd_switch(). But I am not familiar enough with SVM to see why an L2 GPA would end up there. From a cursory look, it seems that until "978ce5837c7e KVM: SVM: always update CR3 in VMCB", enter_svm_guest_mode() was calling kvm_set_cr3() only when using shadow paging, in which case I assume that nested_vmcb->save.cr3 would have been an L1 CR3 shadowing the L2 CR3, correct? But now kvm_set_cr3() is called even when not using shadow paging, which I suppose is how we are getting the L2 CR3. Should we skip calling fast_pgd_switch() in that particular case?

Thanks,
Junaid

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: x86: drop erroneous mmu_check_root() from fast_pgd_switch()
  2020-07-01  3:24 ` Junaid Shahid
@ 2020-07-01  7:14   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 3+ messages in thread
From: Vitaly Kuznetsov @ 2020-07-01  7:14 UTC (permalink / raw)
  To: Junaid Shahid, kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, linux-kernel

Junaid Shahid <junaids@google.com> writes:

> On 6/30/20 3:07 AM, Vitaly Kuznetsov wrote:
>> Undesired triple fault gets injected to L1 guest on SVM when L2 is
>> launched with certain CR3 values. It seems the mmu_check_root()
>> check in fast_pgd_switch() is wrong: first of all we don't know
>> if 'new_pgd' is a GPA or a nested GPA and, in case it is a nested
>> GPA, we can't check it with kvm_is_visible_gfn().
>> 
>> The problematic code path is:
>> nested_svm_vmrun()
>>    ...
>>    nested_prepare_vmcb_save()
>>      kvm_set_cr3(..., nested_vmcb->save.cr3)
>>        kvm_mmu_new_pgd()
>>          ...
>>          mmu_check_root() -> TRIPLE FAULT
>> 
>> The mmu_check_root() check in fast_pgd_switch() seems to be
>> superfluous even for non-nested case: when GPA is outside of the
>> visible range cached_root_available() will fail for non-direct
>> roots (as we can't have a matching one on the list) and we don't
>> seem to care for direct ones.
>> 
>> Also, raising #TF immediately when a non-existent GFN is written to CR3
>> doesn't seem to mach architecture behavior.
>> 
>> Fixes: 7c390d350f8b ("kvm: x86: Add fast CR3 switch code path")
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> - The patch fixes the immediate issue and doesn't seem to break any
>>    tests even with shadow PT but I'm not sure I properly understood
>>    why the check was there in the first place. Please review!
>> ---
>>   arch/x86/kvm/mmu/mmu.c | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>> index 76817d13c86e..286c74d2ae8d 100644
>> --- a/arch/x86/kvm/mmu/mmu.c
>> +++ b/arch/x86/kvm/mmu/mmu.c
>> @@ -4277,8 +4277,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>>   	 */
>>   	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
>>   	    mmu->root_level >= PT64_ROOT_4LEVEL)
>> -		return !mmu_check_root(vcpu, new_pgd >> PAGE_SHIFT) &&
>> -		       cached_root_available(vcpu, new_pgd, new_role);
>> +		return cached_root_available(vcpu, new_pgd, new_role);
>>   
>>   	return false;
>>   }
>> 
>
> The check does seem superfluous, so should be ok to remove. Though I
> think that fast_pgd_switch() really should be getting only L1
> GPAs. Otherwise, there could be confusion between the same GPAs from
> two different L2s.
>
> IIUC, at least on Intel, only L1 CR3s (including shadow L1 CR3s for
> L2) or L1 EPTPs should get to fast_pgd_switch(). But I am not familiar
> enough with SVM to see why an L2 GPA would end up there. From a
> cursory look, it seems that until "978ce5837c7e KVM: SVM: always
> update CR3 in VMCB", enter_svm_guest_mode() was calling kvm_set_cr3()
> only when using shadow paging, in which case I assume that
> nested_vmcb->save.cr3 would have been an L1 CR3 shadowing the L2 CR3,
> correct? But now kvm_set_cr3() is called even when not using shadow
> paging, which I suppose is how we are getting the L2 CR3. Should we
> skip calling fast_pgd_switch() in that particular case?

Thank you for your thoughts, this is helpful indeed.

As far as I can see, nVMX calls kvm_mmu_new_pgd() directly in two cases:

1) nested_ept_init_mmu_context() -> kvm_init_shadow_ept_mmu() when
switching to L2 (and 'new_pgd' is EPTP)

2) nested_vmx_load_cr3() when !nested_ept (and 'new_pgd' is 'cr3')

I think we need to do something similar for nSVM to make the root cache
work and work correctly:

1) supplement nested_svm_init_mmu_context() -> kvm_init_shadow_mmu()
with kvm_mmu_new_pgd()

2) stop doing kvm_mmu_new_pgd() from nested_prepare_vmcb_save() ->
kvm_set_cr3() when npt_enabled

Let me try to experiment here a bit.

Thanks again!

-- 
Vitaly


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, back to index

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-30 10:07 [PATCH] KVM: x86: drop erroneous mmu_check_root() from fast_pgd_switch() Vitaly Kuznetsov
2020-07-01  3:24 ` Junaid Shahid
2020-07-01  7:14   ` Vitaly Kuznetsov

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git