[PATCH v2 0/2] VMX: nested migration fixes for 32 bit nested guests

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/2] VMX: nested migration fixes for 32 bit nested guests
@ 2021-11-15 13:18 Maxim Levitsky
  2021-11-15 13:18 ` [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load Maxim Levitsky
  2021-11-15 13:18 ` [PATCH v2 2/2] KVM: x86/mmu: include efer.lma in extended mmu role Maxim Levitsky
  0 siblings, 2 replies; 6+ messages in thread
From: Maxim Levitsky @ 2021-11-15 13:18 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Thomas Gleixner, Sean Christopherson, Paolo Bonzini, Jim Mattson,
	Ingo Molnar, H. Peter Anvin, Wanpeng Li, Borislav Petkov,
	Maxim Levitsky

This is hopefully the last issue I was tracking in regard to nested migration,
as far as I know.

The issue is that migration of L1 which is normal 64 bit guest,
but is running a 32 bit nested guest is broken on VMX and I finally found out why.

There are two bugs, both related to the fact that qemu first restores SREGS
of L2, and only then sets the nested state. That haunts us till this day.

First issue is that vmx_set_nested_state does some checks on the host
state stored in vmcs12, but it uses the current IA32_EFER which is from L2.
Thus, consistency checks fail.

Second issue (happens on both VMX and SVM with npt/ept enabled in both L0 and L1)
is that after migration L1 mmu (aka root_mmu) is initialized by L2's IA32_EFER
due to the way qemu loads SREGS before the nested state, and later is not
initialized again because in this particular case the 32 bitness of L2's IA32_EFER
is not captured in mmu role.

V2:
Thanks to Sean and Paolo for helping me make more correct fixes for both of the issues.

I still haven't researched the 'fixes' tag, since I suspect that 32 bit nested
migration wasn't tested much ever, so this bug might be present since long time ago.

Best regards,
	Maxim Levitsky

Maxim Levitsky (2):
  KVM: nVMX: don't use vcpu->arch.efer when checking host state on
    nested state load
  KVM: x86/mmu: include efer.lma in extended mmu role

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/mmu/mmu.c          |  1 +
 arch/x86/kvm/vmx/nested.c       | 22 +++++++++++++++++-----
 3 files changed, 19 insertions(+), 5 deletions(-)

-- 
2.26.3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load
  2021-11-15 13:18 [PATCH v2 0/2] VMX: nested migration fixes for 32 bit nested guests Maxim Levitsky
@ 2021-11-15 13:18 ` Maxim Levitsky
  2021-11-15 15:50   ` Sean Christopherson
  2021-11-15 13:18 ` [PATCH v2 2/2] KVM: x86/mmu: include efer.lma in extended mmu role Maxim Levitsky
  1 sibling, 1 reply; 6+ messages in thread
From: Maxim Levitsky @ 2021-11-15 13:18 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Thomas Gleixner, Sean Christopherson, Paolo Bonzini, Jim Mattson,
	Ingo Molnar, H. Peter Anvin, Wanpeng Li, Borislav Petkov,
	Maxim Levitsky

When loading the nested state, due to the way qemu loads
the nested state, vcpu->arch.efer contains L2' IA32_EFER which
can be completely different from L1's IA32_EFER, thus it is
wrong to do consistency check of it vs the vmcs12 exit fields.

To fix this

1. Split the host state consistency check
between current IA32_EFER.LMA and 'host address space' bit in VMCS12 into
nested_vmx_check_address_state_size.

2. Call this check only on a normal VM entry, while skipping this call
on loading the nested state.

3. Trust the 'host address space' bit to contain correct ia32e
value on loading the nested state as it is the best value of
it at that point.
Still do a consistency check of it vs host_ia32_efer in vmcs12.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b4ee5e9f9e201..7b1d5510a7cdc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2866,6 +2866,17 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static int nested_vmx_check_address_state_size(struct kvm_vcpu *vcpu,
+				       struct vmcs12 *vmcs12)
+{
+#ifdef CONFIG_X86_64
+	if (CC(!!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) !=
+		!!(vcpu->arch.efer & EFER_LMA)))
+		return -EINVAL;
+#endif
+	return 0;
+}
+
 static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
 				       struct vmcs12 *vmcs12)
 {
@@ -2890,18 +2901,16 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 #ifdef CONFIG_X86_64
-	ia32e = !!(vcpu->arch.efer & EFER_LMA);
+	ia32e = !!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE);
 #else
 	ia32e = false;
 #endif
 
 	if (ia32e) {
-		if (CC(!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE)) ||
-		    CC(!(vmcs12->host_cr4 & X86_CR4_PAE)))
+		if (CC(!(vmcs12->host_cr4 & X86_CR4_PAE)))
 			return -EINVAL;
 	} else {
-		if (CC(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) ||
-		    CC(vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) ||
+		if (CC(vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) ||
 		    CC(vmcs12->host_cr4 & X86_CR4_PCIDE) ||
 		    CC((vmcs12->host_rip) >> 32))
 			return -EINVAL;
@@ -3571,6 +3580,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	if (nested_vmx_check_controls(vcpu, vmcs12))
 		return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
 
+	if (nested_vmx_check_address_state_size(vcpu, vmcs12))
+		return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
+
 	if (nested_vmx_check_host_state(vcpu, vmcs12))
 		return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/2] KVM: x86/mmu: include efer.lma in extended mmu role
  2021-11-15 13:18 [PATCH v2 0/2] VMX: nested migration fixes for 32 bit nested guests Maxim Levitsky
  2021-11-15 13:18 ` [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load Maxim Levitsky
@ 2021-11-15 13:18 ` Maxim Levitsky
  2021-11-15 20:44   ` Sean Christopherson
  1 sibling, 1 reply; 6+ messages in thread
From: Maxim Levitsky @ 2021-11-15 13:18 UTC (permalink / raw)
  To: kvm
  Cc: Vitaly Kuznetsov, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Thomas Gleixner, Sean Christopherson, Paolo Bonzini, Jim Mattson,
	Ingo Molnar, H. Peter Anvin, Wanpeng Li, Borislav Petkov,
	Maxim Levitsky

When the host is running with normal TDP mmu (EPT/NPT),
and it is running a nested 32 bit guest, then after a migration,
the host mmu (aka root_mmu) is first initialized with
nested guest's IA32_EFER, due to the way userspace restores
the nested state.

When later, this is corrected on first nested VM exit to the host,
when host EFER is loaded from vmcs12,
the root_mmu is not reset, because the role.base.level
in this case, reflects the level of the TDP mmu which is
always 4 (or 5) on EPT, and usually 4 or even 5 on AMD
(when we have 64 bit host).

Since most of the paging state is already captured in
the extended mmu role, just add the EFER.LMA there to
force that reset.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/mmu/mmu.c          | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88fce6ab4bbd7..a44b9eb7d4d6d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -364,6 +364,7 @@ union kvm_mmu_extended_role {
 		unsigned int cr4_smap:1;
 		unsigned int cr4_smep:1;
 		unsigned int cr4_la57:1;
+		unsigned int efer_lma:1;
 	};
 };
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 354d2ca92df4d..5c4a41697a717 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4682,6 +4682,7 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
 		/* PKEY and LA57 are active iff long mode is active. */
 		ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
 		ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
+		ext.efer_lma = ____is_efer_lma(regs);
 	}
 
 	ext.valid = 1;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load
  2021-11-15 13:18 ` [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load Maxim Levitsky
@ 2021-11-15 15:50   ` Sean Christopherson
  2021-11-16 10:03     ` Paolo Bonzini
  0 siblings, 1 reply; 6+ messages in thread
From: Sean Christopherson @ 2021-11-15 15:50 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Vitaly Kuznetsov, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Thomas Gleixner, Paolo Bonzini, Jim Mattson, Ingo Molnar,
	H. Peter Anvin, Wanpeng Li, Borislav Petkov

On Mon, Nov 15, 2021, Maxim Levitsky wrote:
> When loading the nested state, due to the way qemu loads
> the nested state, vcpu->arch.efer contains L2' IA32_EFER which
> can be completely different from L1's IA32_EFER, thus it is
> wrong to do consistency check of it vs the vmcs12 exit fields.

This is not sufficient justification.  It makes it sound like KVM is hacking
around a bug in its ABI, which it is not, but that fact is _very_ subtle.  The
"trust" blurb in bullet (3) in particular is misleading.

Instead, I would like something like:

  When loading nested state, don't use check vcpu->arch.efer to get the
  L1 host's 64-bit vs. 32-bit state and don't check it for consistency
  with respect to VM_EXIT_HOST_ADDR_SPACE_SIZE, as register state in vCPU
  may be stale when KVM_SET_NESTED_STATE is called and conceptually does
  not exist.  When the CPU is in non-root mode, i.e. when restoring L2
  state in KVM, there is no snapshot of L1 host state, it is (conditionally)
  loaded on VM-Exit.  E.g. EFER is either preserved on exit, loaded from the
  VMCS (vmcs12 in this case), or loaded from the MSR load list.

  Use vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE to determine the target mode of
  the L1 host, as it is the source of truth in this case.  Perform the EFER
  vs. vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE consistency check only on VM-Enter,
  as conceptually there's no "current" L1 EFER to check.

  Note, KVM still checks vmcs12.HOST_EFER for consistency if
  if vmcs12.VM_EXIT_LOAD_IA32_EFER is set, i.e. this skips only the check
  against current vCPU state, which does not exist, when loading nested state.

> To fix this
> 
> 1. Split the host state consistency check
> between current IA32_EFER.LMA and 'host address space' bit in VMCS12 into
> nested_vmx_check_address_state_size.
> 
> 2. Call this check only on a normal VM entry, while skipping this call
> on loading the nested state.
> 
> 3. Trust the 'host address space' bit to contain correct ia32e
> value on loading the nested state as it is the best value of
> it at that point.
> Still do a consistency check of it vs host_ia32_efer in vmcs12.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index b4ee5e9f9e201..7b1d5510a7cdc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2866,6 +2866,17 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
>  	return 0;
>  }
>  
> +static int nested_vmx_check_address_state_size(struct kvm_vcpu *vcpu,
> +				       struct vmcs12 *vmcs12)

Bad indentation.

> +{
> +#ifdef CONFIG_X86_64
> +	if (CC(!!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) !=
> +		!!(vcpu->arch.efer & EFER_LMA)))

Bad indentation.  The number of !'s is also unnecessary.  This also needs a comment
explaining why it's not included in the KVM_SET_NESTED_STATE path.

> +		return -EINVAL;
> +#endif
> +	return 0;
> +}
> +
>  static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
>  				       struct vmcs12 *vmcs12)
>  {
> @@ -2890,18 +2901,16 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
>  		return -EINVAL;
>  
>  #ifdef CONFIG_X86_64
> -	ia32e = !!(vcpu->arch.efer & EFER_LMA);
> +	ia32e = !!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE);


>  #else
>  	ia32e = false;
>  #endif
>  
>  	if (ia32e) {
> -		if (CC(!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE)) ||
> -		    CC(!(vmcs12->host_cr4 & X86_CR4_PAE)))
> +		if (CC(!(vmcs12->host_cr4 & X86_CR4_PAE)))
>  			return -EINVAL;
>  	} else {
> -		if (CC(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) ||
> -		    CC(vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) ||
> +		if (CC(vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) ||
>  		    CC(vmcs12->host_cr4 & X86_CR4_PCIDE) ||
>  		    CC((vmcs12->host_rip) >> 32))
>  			return -EINVAL;
> @@ -3571,6 +3580,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>  	if (nested_vmx_check_controls(vcpu, vmcs12))
>  		return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
>  
> +	if (nested_vmx_check_address_state_size(vcpu, vmcs12))
> +		return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
> +
>  	if (nested_vmx_check_host_state(vcpu, vmcs12))
>  		return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
>  
> -- 
> 2.26.3
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 2/2] KVM: x86/mmu: include efer.lma in extended mmu role
  2021-11-15 13:18 ` [PATCH v2 2/2] KVM: x86/mmu: include efer.lma in extended mmu role Maxim Levitsky
@ 2021-11-15 20:44   ` Sean Christopherson
  0 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2021-11-15 20:44 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Vitaly Kuznetsov, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Thomas Gleixner, Paolo Bonzini, Jim Mattson, Ingo Molnar,
	H. Peter Anvin, Wanpeng Li, Borislav Petkov

On Mon, Nov 15, 2021, Maxim Levitsky wrote:
> When the host is running with normal TDP mmu (EPT/NPT),
> and it is running a nested 32 bit guest, then after a migration,
> the host mmu (aka root_mmu) is first initialized with
> nested guest's IA32_EFER, due to the way userspace restores
> the nested state.

Please try to avoid unnecessary newlines, I find it quite difficult to read as
my eyeballs need to jump around more.  E.g. wrapping at 75 chars yields

  When the host is running with normal TDP mmu (EPT/NPT), and it is running
  a nested 32 bit guest, then after a migration, the host mmu (aka root_mmu)
  is first initialized with nested guest's IA32_EFER, due to the way
  userspace restores the nested state.

  When later, this is corrected on first nested VM exit to the host, when
  host EFER is loaded from vmcs12, the root_mmu is not reset, because the
  role.base.level in this case, reflects the level of the TDP mmu which is
  always 4 (or 5) on EPT, and usually 4 or even 5 on AMD (when we have
  64-bit host).

  Since most of the paging state is already captured in the extended mmu
  role, just add the EFER.LMA there to force that reset.

> When later, this is corrected on first nested VM exit to the host,
> when host EFER is loaded from vmcs12,
> the root_mmu is not reset, because the role.base.level
> in this case, reflects the level of the TDP mmu which is
> always 4 (or 5) on EPT, and usually 4 or even 5 on AMD
> (when we have 64 bit host).
>
> Since most of the paging state is already captured in
> the extended mmu role, just add the EFER.LMA there to
> force that reset.

Similar to patch 1, I'd like to word the changelog to make it very clear that this
fix is _necessary_, not just a hack to fudge around QEMU behavior.  I've spent far
too much time deciphering historical KVM changelogs along the lines of "QEMU does
XYZ, change KVM to handle that", and in more than one case the "fix" has been wrong
and/or incomplete.

  Incorporate EFER.LMA into kvm_mmu_extended_role, as it used to compute the
  guest root level and is not reflected in kvm_mmu_page_role.level when TDP
  is in use.  When simply running the guest, it is impossible for EFER.LMA
  and kvm_mmu.root_level to get out of sync, as the guest cannot transition
  from PAE paging to 64-bit paging without toggling CR0.PG, i.e. without
  first bouncing through a different MMU context.  And stuffing guest state
  via KVM_SET_SREGS{2} also ensures a full MMU context reset.

  However, if KVM_SET_SREGS{2} is followed by KVM_SET_NESTED_STATE, e.g. to
  set guest state when migrating the VM while L2 is active, the vCPU state
  will reflect L2, not L1.  If L1 is using TDP for L2, then root_mmu will
  have been configured using L2's state, despite not being used for L2.  If
  L2.EFER.LMA != L1.EFER.LMA, and L2 is using PAE paging, then root_mmu will
  be configured for guest PAE paging, but will match the mmu_role for 64-bit
  paging and cause KVM to not   reconfigured root_mmu on the next nested
  VM-Exit.

And after typing that up, it's probably also worth adding a blurb to call out (and
argue against) the alternative.

  Alternatively, the root_mmu's role could be invalidated after a successful
  KVM_SET_NESTED_STATE that yields vcpu->arch.mmu != vcpu->arch.root_mmu,
  i.e. that switches the active mmu to guest_mmu, but doing so would force
  KVM to reconfigure the root_mmu in the common case where L1 and L2 have
  the same EFER, e.g. are both 64-bit guests.

> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 1 +
>  arch/x86/kvm/mmu/mmu.c          | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 88fce6ab4bbd7..a44b9eb7d4d6d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -364,6 +364,7 @@ union kvm_mmu_extended_role {
>  		unsigned int cr4_smap:1;
>  		unsigned int cr4_smep:1;
>  		unsigned int cr4_la57:1;
> +		unsigned int efer_lma:1;
>  	};
>  };
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 354d2ca92df4d..5c4a41697a717 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4682,6 +4682,7 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
>  		/* PKEY and LA57 are active iff long mode is active. */
>  		ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
>  		ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
> +		ext.efer_lma = ____is_efer_lma(regs);
>  	}
>  
>  	ext.valid = 1;
> -- 
> 2.26.3
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load
  2021-11-15 15:50   ` Sean Christopherson
@ 2021-11-16 10:03     ` Paolo Bonzini
  0 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2021-11-16 10:03 UTC (permalink / raw)
  To: Sean Christopherson, Maxim Levitsky
  Cc: kvm, Vitaly Kuznetsov, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Thomas Gleixner, Jim Mattson, Ingo Molnar, H. Peter Anvin,
	Wanpeng Li, Borislav Petkov

On 11/15/21 16:50, Sean Christopherson wrote:
>    When loading nested state, don't use check vcpu->arch.efer to get the
>    L1 host's 64-bit vs. 32-bit state and don't check it for consistency
>    with respect to VM_EXIT_HOST_ADDR_SPACE_SIZE, as register state in vCPU
>    may be stale when KVM_SET_NESTED_STATE is called and conceptually does
>    not exist.  When the CPU is in non-root mode, i.e. when restoring L2
>    state in KVM, there is no snapshot of L1 host state, it is (conditionally)
>    loaded on VM-Exit.  E.g. EFER is either preserved on exit, loaded from the
>    VMCS (vmcs12 in this case), or loaded from the MSR load list.
> 
>    Use vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE to determine the target mode of
>    the L1 host, as it is the source of truth in this case.  Perform the EFER
>    vs. vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE consistency check only on VM-Enter,
>    as conceptually there's no "current" L1 EFER to check.
> 
>    Note, KVM still checks vmcs12.HOST_EFER for consistency if
>    if vmcs12.VM_EXIT_LOAD_IA32_EFER is set, i.e. this skips only the check
>    against current vCPU state, which does not exist, when loading nested state.

Queued with some further edits and nested_vmx_check_address_state_size 
renamed to nested_vmx_check_address_*space*_size.

I think the "!!" are best left in place though, because "!!(a & b)" is 
idiomatic. Comparing "!(a & b)" would leave the reader wondering about 
the inversion, and "(bool)(a & b)" is just too ugly and magic.  The 
compiler anyway converts the "!!" to "!= 0" very early on, and never 
performs back-to-back logical NOTs.

Paolo


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-11-16 10:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-15 13:18 [PATCH v2 0/2] VMX: nested migration fixes for 32 bit nested guests Maxim Levitsky
2021-11-15 13:18 ` [PATCH v2 1/2] KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load Maxim Levitsky
2021-11-15 15:50   ` Sean Christopherson
2021-11-16 10:03     ` Paolo Bonzini
2021-11-15 13:18 ` [PATCH v2 2/2] KVM: x86/mmu: include efer.lma in extended mmu role Maxim Levitsky
2021-11-15 20:44   ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).