linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
@ 2020-10-07  1:44 Sean Christopherson
  2020-10-07  1:44 ` [PATCH 1/6] KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4() Sean Christopherson
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Two bug fixes to handle KVM_SET_SREGS without a preceding KVM_SET_CPUID2.

The overarching issue is that kvm_x86_ops.set_cr4() can fail, but its
invocation from __set_sregs(), a.k.a. KVM_SET_SREGS, ignores the result.
Fix the issue by moving all validity checks out of .set_cr4() in one way
or another.
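
As a rough illustration of the approach described above (validate everything
up front, then commit with a setter that cannot fail), here is a minimal
userspace sketch.  It is not KVM code: struct toy_vcpu and every toy_* name
are hypothetical stand-ins, and the only "reserved bit" modeled is VMXE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in KVM. */
#define TOY_CR4_VMXE (1ul << 13)

struct toy_vcpu {
	unsigned long cr0;
	unsigned long cr4;
	unsigned long cr4_rsvd_bits;	/* bits this vCPU may not set */
};

/* Pure check with no side effects, safe to call before touching state. */
bool toy_is_valid_cr4(const struct toy_vcpu *vcpu, unsigned long cr4)
{
	return !(cr4 & vcpu->cr4_rsvd_bits);
}

/* Setter that cannot fail; callers are required to validate first. */
void toy_set_cr4(struct toy_vcpu *vcpu, unsigned long cr4)
{
	assert(toy_is_valid_cr4(vcpu, cr4));
	vcpu->cr4 = cr4;
}

/* Returns 0 on success, -1 on rejection; a rejected call modifies nothing. */
int toy_set_sregs(struct toy_vcpu *vcpu, unsigned long cr0, unsigned long cr4)
{
	/* All validity checks happen before any state is written. */
	if (!toy_is_valid_cr4(vcpu, cr4))
		return -1;

	vcpu->cr0 = cr0;
	toy_set_cr4(vcpu, cr4);
	return 0;
}
```

With cr4_rsvd_bits set to TOY_CR4_VMXE, a toy_set_sregs() call that requests
VMXE returns -1 and leaves both cr0 and cr4 untouched, which is the property
the series aims for in __set_sregs().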

I intentionally omitted a Cc to stable.  The first bug fix in particular
may break stable trees as it simply removes a check, and I don't know that
stable trees have the generic CR4 reserved bit check that is needed to
prevent the guest from setting VMXE when nVMX is not allowed.

Sean Christopherson (6):
  KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4()
  KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4()
  KVM: SVM: Drop VMXE check from svm_set_cr4()
  KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook
  KVM: x86: Return bool instead of int for CR4 and SREGS validity checks
  KVM: selftests: Verify supported CR4 bits can be set before
    KVM_SET_CPUID2

 arch/x86/include/asm/kvm_host.h               |  3 +-
 arch/x86/kvm/svm/nested.c                     |  2 +-
 arch/x86/kvm/svm/svm.c                        | 12 ++-
 arch/x86/kvm/svm/svm.h                        |  2 +-
 arch/x86/kvm/vmx/nested.c                     |  2 +-
 arch/x86/kvm/vmx/vmx.c                        | 35 +++----
 arch/x86/kvm/vmx/vmx.h                        |  2 +-
 arch/x86/kvm/x86.c                            | 28 +++---
 arch/x86/kvm/x86.h                            |  2 +-
 .../selftests/kvm/include/x86_64/processor.h  | 17 ++++
 .../selftests/kvm/include/x86_64/vmx.h        |  4 -
 .../selftests/kvm/x86_64/set_sregs_test.c     | 92 ++++++++++++++++++-
 12 files changed, 153 insertions(+), 48 deletions(-)

-- 
2.28.0



* [PATCH 1/6] KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4()
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
@ 2020-10-07  1:44 ` Sean Christopherson
  2020-10-07  1:44 ` [PATCH 2/6] KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4() Sean Christopherson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Drop vmx_set_cr4()'s somewhat hidden guest_cpuid_has() check on VMXE now
that common x86 handles the check by incorporating VMXE into the CR4
reserved bits, i.e. in cr4_guest_rsvd_bits.  This fixes a bug where KVM
incorrectly rejects KVM_SET_SREGS with CR4.VMXE=1 if it's executed
before KVM_SET_CPUID{,2}.

Fixes: 5e1746d6205d ("KVM: nVMX: Allow setting the VMXE bit in CR4")
Reported-by: Stas Sergeev <stsp@users.sourceforge.net>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e23c41ccfac9..99ea57ba2a84 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3110,9 +3110,10 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 		 * must first be able to turn on cr4.VMXE (see handle_vmon()).
 		 * So basically the check on whether to allow nested VMX
 		 * is here.  We operate under the default treatment of SMM,
-		 * so VMX cannot be enabled under SMM.
+		 * so VMX cannot be enabled under SMM.  Note, guest CPUID is
+		 * intentionally ignored, it's handled by cr4_guest_rsvd_bits.
 		 */
-		if (!nested_vmx_allowed(vcpu) || is_smm(vcpu))
+		if (!nested || is_smm(vcpu))
 			return 1;
 	}
 
-- 
2.28.0



* [PATCH 2/6] KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4()
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
  2020-10-07  1:44 ` [PATCH 1/6] KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4() Sean Christopherson
@ 2020-10-07  1:44 ` Sean Christopherson
  2020-10-07  1:44 ` [PATCH 3/6] KVM: SVM: Drop VMXE check from svm_set_cr4() Sean Christopherson
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Drop vmx_set_cr4()'s explicit check on the 'nested' module param now
that common x86 handles the check by incorporating VMXE into the CR4
reserved bits, via kvm_cpu_caps.  X86_FEATURE_VMX is set in kvm_cpu_caps
(by vmx_set_cpu_caps()), if and only if 'nested' is true.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 99ea57ba2a84..dac93346aca9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3104,18 +3104,13 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 		}
 	}
 
-	if (cr4 & X86_CR4_VMXE) {
-		/*
-		 * To use VMXON (and later other VMX instructions), a guest
-		 * must first be able to turn on cr4.VMXE (see handle_vmon()).
-		 * So basically the check on whether to allow nested VMX
-		 * is here.  We operate under the default treatment of SMM,
-		 * so VMX cannot be enabled under SMM.  Note, guest CPUID is
-		 * intentionally ignored, it's handled by cr4_guest_rsvd_bits.
-		 */
-		if (!nested || is_smm(vcpu))
-			return 1;
-	}
+	/*
+	 * We operate under the default treatment of SMM, so VMX cannot be
+	 * enabled under SMM.  Note, whether or not VMXE is allowed at all is
+	 * handled by kvm_valid_cr4().
+	 */
+	if ((cr4 & X86_CR4_VMXE) && is_smm(vcpu))
+		return 1;
 
 	if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
 		return 1;
-- 
2.28.0



* [PATCH 3/6] KVM: SVM: Drop VMXE check from svm_set_cr4()
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
  2020-10-07  1:44 ` [PATCH 1/6] KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4() Sean Christopherson
  2020-10-07  1:44 ` [PATCH 2/6] KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4() Sean Christopherson
@ 2020-10-07  1:44 ` Sean Christopherson
  2020-10-07  1:44 ` [PATCH 4/6] KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook Sean Christopherson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Drop svm_set_cr4()'s explicit check on CR4.VMXE now that common x86
handles the check by incorporating VMXE into the CR4 reserved bits, via
kvm_cpu_caps.  SVM obviously does not set X86_FEATURE_VMX.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/svm/svm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4f401fc6a05d..f92a19b77da3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1684,9 +1684,6 @@ int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	unsigned long host_cr4_mce = cr4_read_shadow() & X86_CR4_MCE;
 	unsigned long old_cr4 = to_svm(vcpu)->vmcb->save.cr4;
 
-	if (cr4 & X86_CR4_VMXE)
-		return 1;
-
 	if (npt_enabled && ((old_cr4 ^ cr4) & X86_CR4_PGE))
 		svm_flush_tlb(vcpu);
 
-- 
2.28.0



* [PATCH 4/6] KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
                   ` (2 preceding siblings ...)
  2020-10-07  1:44 ` [PATCH 3/6] KVM: SVM: Drop VMXE check from svm_set_cr4() Sean Christopherson
@ 2020-10-07  1:44 ` Sean Christopherson
  2020-10-07  1:44 ` [PATCH 5/6] KVM: x86: Return bool instead of int for CR4 and SREGS validity checks Sean Christopherson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Split out VMX's checks on CR4.VMXE to a dedicated hook, .is_valid_cr4(),
and invoke the new hook from kvm_valid_cr4().  This fixes an issue where
KVM_SET_SREGS would return success while failing to actually set CR4.

Fixing the issue by explicitly checking kvm_x86_ops.set_cr4()'s return
in __set_sregs() is not a viable option as KVM has already stuffed a
variety of vCPU state.

Note, kvm_valid_cr4() and is_valid_cr4() have different return types and
inverted semantics.  This will be remedied in a future patch.

Fixes: 5e1746d6205d ("KVM: nVMX: Allow setting the VMXE bit in CR4")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/svm/svm.c          |  9 +++++++--
 arch/x86/kvm/svm/svm.h          |  2 +-
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/vmx/vmx.c          | 31 ++++++++++++++++++-------------
 arch/x86/kvm/vmx/vmx.h          |  2 +-
 arch/x86/kvm/x86.c              |  6 ++++--
 7 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d0f77235da92..e0fb61d8f6fb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1085,7 +1085,8 @@ struct kvm_x86_ops {
 			    struct kvm_segment *var, int seg);
 	void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
 	void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
-	int (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
+	bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
+	void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
 	void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
 	void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
 	void (*set_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f92a19b77da3..38680d453f80 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1679,7 +1679,12 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	update_cr0_intercept(svm);
 }
 
-int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+static bool svm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+{
+	return true;
+}
+
+void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	unsigned long host_cr4_mce = cr4_read_shadow() & X86_CR4_MCE;
 	unsigned long old_cr4 = to_svm(vcpu)->vmcb->save.cr4;
@@ -1693,7 +1698,6 @@ int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	cr4 |= host_cr4_mce;
 	to_svm(vcpu)->vmcb->save.cr4 = cr4;
 	vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR);
-	return 0;
 }
 
 static void svm_set_segment(struct kvm_vcpu *vcpu,
@@ -4192,6 +4196,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.get_cpl = svm_get_cpl,
 	.get_cs_db_l_bits = kvm_get_cs_db_l_bits,
 	.set_cr0 = svm_set_cr0,
+	.is_valid_cr4 = svm_is_valid_cr4,
 	.set_cr4 = svm_set_cr4,
 	.set_efer = svm_set_efer,
 	.get_idt = svm_get_idt,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a7f997459b87..8fe632d7fca4 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -352,7 +352,7 @@ static inline bool gif_set(struct vcpu_svm *svm)
 u32 svm_msrpm_offset(u32 msr);
 void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
 void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
-int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 void svm_flush_tlb(struct kvm_vcpu *vcpu);
 void disable_nmi_singlestep(struct vcpu_svm *svm);
 bool svm_smi_blocked(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6eca8a7deed1..650ff3e4b5ca 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4814,7 +4814,7 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
 	/*
 	 * The Intel VMX Instruction Reference lists a bunch of bits that are
 	 * prerequisite to running VMXON, most notably cr4.VMXE must be set to
-	 * 1 (see vmx_set_cr4() for when we allow the guest to set this).
+	 * 1 (see vmx_is_valid_cr4() for when we allow the guest to set this).
 	 * Otherwise, we should fail with #UD.  But most faulting conditions
 	 * have already been checked by hardware, prior to the VM-exit for
 	 * VMXON.  We do test guest cr4.VMXE because processor CR4 always has
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dac93346aca9..5aa0a3af7dbb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3076,7 +3076,23 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd,
 		vmcs_writel(GUEST_CR3, guest_cr3);
 }
 
-int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+static bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+{
+	/*
+	 * We operate under the default treatment of SMM, so VMX cannot be
+	 * enabled under SMM.  Note, whether or not VMXE is allowed at all is
+	 * handled by kvm_valid_cr4().
+	 */
+	if ((cr4 & X86_CR4_VMXE) && is_smm(vcpu))
+		return false;
+
+	if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
+		return false;
+
+	return true;
+}
+
+void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	/*
@@ -3104,17 +3120,6 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 		}
 	}
 
-	/*
-	 * We operate under the default treatment of SMM, so VMX cannot be
-	 * enabled under SMM.  Note, whether or not VMXE is allowed at all is
-	 * handled by kvm_valid_cr4().
-	 */
-	if ((cr4 & X86_CR4_VMXE) && is_smm(vcpu))
-		return 1;
-
-	if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
-		return 1;
-
 	vcpu->arch.cr4 = cr4;
 	kvm_register_mark_available(vcpu, VCPU_EXREG_CR4);
 
@@ -3145,7 +3150,6 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 
 	vmcs_writel(CR4_READ_SHADOW, cr4);
 	vmcs_writel(GUEST_CR4, hw_cr4);
-	return 0;
 }
 
 void vmx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg)
@@ -7597,6 +7601,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.get_cpl = vmx_get_cpl,
 	.get_cs_db_l_bits = vmx_get_cs_db_l_bits,
 	.set_cr0 = vmx_set_cr0,
+	.is_valid_cr4 = vmx_is_valid_cr4,
 	.set_cr4 = vmx_set_cr4,
 	.set_efer = vmx_set_efer,
 	.get_idt = vmx_get_idt,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 5961cb897125..96895ac16b27 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -321,7 +321,7 @@ u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu);
 void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask);
 void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer);
 void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
-int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 void set_cr4_guest_host_mask(struct vcpu_vmx *vmx);
 void ept_save_pdptrs(struct kvm_vcpu *vcpu);
 void vmx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c4015a43cc8a..64cc86f4f18f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -973,6 +973,9 @@ int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
 		return -EINVAL;
 
+	if (!kvm_x86_ops.is_valid_cr4(vcpu, cr4))
+		return -EINVAL;
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_valid_cr4);
@@ -1006,8 +1009,7 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			return 1;
 	}
 
-	if (kvm_x86_ops.set_cr4(vcpu, cr4))
-		return 1;
+	kvm_x86_ops.set_cr4(vcpu, cr4);
 
 	if (((cr4 ^ old_cr4) & pdptr_bits) ||
 	    (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
-- 
2.28.0



* [PATCH 5/6] KVM: x86: Return bool instead of int for CR4 and SREGS validity checks
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
                   ` (3 preceding siblings ...)
  2020-10-07  1:44 ` [PATCH 4/6] KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook Sean Christopherson
@ 2020-10-07  1:44 ` Sean Christopherson
  2020-10-07  1:44 ` [PATCH 6/6] KVM: selftests: Verify supported CR4 bits can be set before KVM_SET_CPUID2 Sean Christopherson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Rework the common CR4 and SREGS checks to return a bool instead of an
int, i.e. true/false instead of 0/-EINVAL, and add "is" to the name to
clarify the polarity of the return value (which is effectively inverted
by this change).

No functional change intended.
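
To make the polarity change concrete, here is a small userspace sketch
contrasting the two conventions.  The demo_* names and the single reserved
bit are assumptions made for illustration; they only mirror the shape of
kvm_valid_cr4() versus kvm_is_valid_cr4() and are not the kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reserved-bit mask; only bit 13 (VMXE) is modeled. */
#define DEMO_CR4_RSVD_BITS (1ul << 13)

/* Old convention: 0 means valid, -EINVAL means invalid. */
int demo_valid_cr4(unsigned long cr4)
{
	return (cr4 & DEMO_CR4_RSVD_BITS) ? -EINVAL : 0;
}

/* New convention: true means valid, and "is" in the name says so. */
bool demo_is_valid_cr4(unsigned long cr4)
{
	return !(cr4 & DEMO_CR4_RSVD_BITS);
}

/* Old call site: a non-zero return from the check means "reject". */
int demo_set_cr4_old(unsigned long cr4)
{
	if (demo_valid_cr4(cr4))
		return 1;
	return 0;
}

/* New call site: the check must be negated to mean "reject". */
int demo_set_cr4_new(unsigned long cr4)
{
	if (!demo_is_valid_cr4(cr4))
		return 1;
	return 0;
}

/* Both call sites must agree for every input. */
void demo_check_equivalent(unsigned long cr4)
{
	assert(demo_set_cr4_old(cr4) == demo_set_cr4_new(cr4));
}
```

The old form reads as "if valid, fail" at the call site, which is the
confusion the rename is meant to remove.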

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/svm/nested.c |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  2 +-
 arch/x86/kvm/x86.c        | 28 ++++++++++++----------------
 arch/x86/kvm/x86.h        |  2 +-
 4 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ba50ff6e35c7..114e0e8561bc 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -254,7 +254,7 @@ static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb12)
 		    (vmcb12->save.cr3 & MSR_CR3_LONG_MBZ_MASK))
 			return false;
 	}
-	if (kvm_valid_cr4(&svm->vcpu, vmcb12->save.cr4))
+	if (!kvm_is_valid_cr4(&svm->vcpu, vmcb12->save.cr4))
 		return false;
 
 	return nested_vmcb_check_controls(&vmcb12->control);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5aa0a3af7dbb..ac69aa3076d8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3081,7 +3081,7 @@ static bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	/*
 	 * We operate under the default treatment of SMM, so VMX cannot be
 	 * enabled under SMM.  Note, whether or not VMXE is allowed at all is
-	 * handled by kvm_valid_cr4().
+	 * handled by kvm_is_valid_cr4().
 	 */
 	if ((cr4 & X86_CR4_VMXE) && is_smm(vcpu))
 		return false;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 64cc86f4f18f..5870aa6cbad2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -965,20 +965,17 @@ int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 }
 EXPORT_SYMBOL_GPL(kvm_set_xcr);
 
-int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	if (cr4 & cr4_reserved_bits)
-		return -EINVAL;
+		return false;
 
 	if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
-		return -EINVAL;
+		return false;
 
-	if (!kvm_x86_ops.is_valid_cr4(vcpu, cr4))
-		return -EINVAL;
-
-	return 0;
+	return kvm_x86_ops.is_valid_cr4(vcpu, cr4);
 }
-EXPORT_SYMBOL_GPL(kvm_valid_cr4);
+EXPORT_SYMBOL_GPL(kvm_is_valid_cr4);
 
 int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
@@ -986,7 +983,7 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE |
 				   X86_CR4_SMEP;
 
-	if (kvm_valid_cr4(vcpu, cr4))
+	if (!kvm_is_valid_cr4(vcpu, cr4))
 		return 1;
 
 	if (is_long_mode(vcpu)) {
@@ -9422,7 +9419,7 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
 }
 EXPORT_SYMBOL_GPL(kvm_task_switch);
 
-static int kvm_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
+static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 {
 	if ((sregs->efer & EFER_LME) && (sregs->cr0 & X86_CR0_PG)) {
 		/*
@@ -9430,19 +9427,18 @@ static int kvm_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 		 * 64-bit mode (though maybe in a 32-bit code segment).
 		 * CR4.PAE and EFER.LMA must be set.
 		 */
-		if (!(sregs->cr4 & X86_CR4_PAE)
-		    || !(sregs->efer & EFER_LMA))
-			return -EINVAL;
+		if (!(sregs->cr4 & X86_CR4_PAE) || !(sregs->efer & EFER_LMA))
+			return false;
 	} else {
 		/*
 		 * Not in 64-bit mode: EFER.LMA is clear and the code
 		 * segment cannot be 64-bit.
 		 */
 		if (sregs->efer & EFER_LMA || sregs->cs.l)
-			return -EINVAL;
+			return false;
 	}
 
-	return kvm_valid_cr4(vcpu, sregs->cr4);
+	return kvm_is_valid_cr4(vcpu, sregs->cr4);
 }
 
 static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
@@ -9454,7 +9450,7 @@ static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	struct desc_ptr dt;
 	int ret = -EINVAL;
 
-	if (kvm_valid_sregs(vcpu, sregs))
+	if (!kvm_is_valid_sregs(vcpu, sregs))
 		goto out;
 
 	apic_base_msr.data = sregs->apic_base;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 3900ab0c6004..b3b1d237ffe5 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -369,7 +369,7 @@ static inline bool kvm_dr6_valid(u64 data)
 void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
 void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
 int kvm_spec_ctrl_test_value(u64 value);
-int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu);
 int kvm_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
 			      struct x86_exception *e);
-- 
2.28.0



* [PATCH 6/6] KVM: selftests: Verify supported CR4 bits can be set before KVM_SET_CPUID2
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
                   ` (4 preceding siblings ...)
  2020-10-07  1:44 ` [PATCH 5/6] KVM: x86: Return bool instead of int for CR4 and SREGS validity checks Sean Christopherson
@ 2020-10-07  1:44 ` Sean Christopherson
  2020-10-08 16:00 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup stsp
  2020-11-13 11:36 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Paolo Bonzini
  7 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-07  1:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Stas Sergeev

Extend the KVM_SET_SREGS test to verify that all supported CR4 bits, as
enumerated by KVM, can be set before KVM_SET_CPUID2, i.e. without first
defining the vCPU model.  KVM is supposed to skip guest CPUID checks
when host userspace is stuffing guest state.

Check the inverse as well, i.e. that KVM rejects KVM_SET_SREGS if CR4
has one or more unsupported bits set.
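
The CPUID-to-CR4 mapping the test relies on can also be written as a table.
The sketch below is a standalone, hypothetical variant (struct toy_cpuid and
the toy_* names are not selftest APIs); it covers only the feature-gated bits,
not the always-supported ones such as VME or PAE, and takes the CPUID register
values as plain inputs instead of querying KVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three CPUID registers of interest. */
struct toy_cpuid {
	uint32_t leaf1_ecx;
	uint32_t leaf7_ebx;
	uint32_t leaf7_ecx;
};

enum toy_reg { LEAF1_ECX, LEAF7_EBX, LEAF7_ECX };

struct toy_map {
	enum toy_reg reg;	/* which CPUID register holds the feature */
	uint32_t cpuid_bit;	/* feature flag within that register */
	uint64_t cr4_bit;	/* CR4 bit the feature gates */
};

static const struct toy_map toy_cr4_map[] = {
	{ LEAF7_ECX, 1u << 2,  1ull << 11 },	/* UMIP     -> CR4.UMIP */
	{ LEAF7_ECX, 1u << 16, 1ull << 12 },	/* LA57     -> CR4.LA57 */
	{ LEAF1_ECX, 1u << 5,  1ull << 13 },	/* VMX      -> CR4.VMXE */
	{ LEAF1_ECX, 1u << 6,  1ull << 14 },	/* SMX      -> CR4.SMXE */
	{ LEAF7_EBX, 1u << 0,  1ull << 16 },	/* FSGSBASE -> CR4.FSGSBASE */
	{ LEAF1_ECX, 1u << 17, 1ull << 17 },	/* PCID     -> CR4.PCIDE */
	{ LEAF1_ECX, 1u << 26, 1ull << 18 },	/* XSAVE    -> CR4.OSXSAVE */
	{ LEAF7_EBX, 1u << 7,  1ull << 20 },	/* SMEP     -> CR4.SMEP */
	{ LEAF7_EBX, 1u << 20, 1ull << 21 },	/* SMAP     -> CR4.SMAP */
	{ LEAF7_ECX, 1u << 3,  1ull << 22 },	/* PKU      -> CR4.PKE */
};

/* Return the set of feature-gated CR4 bits enabled by the given CPUID. */
uint64_t toy_calc_cr4_feature_bits(const struct toy_cpuid *c)
{
	uint64_t cr4 = 0;
	size_t i;

	assert(c != NULL);

	/* Indexed by enum toy_reg, so the order here must match the enum. */
	const uint32_t regs[] = { c->leaf1_ecx, c->leaf7_ebx, c->leaf7_ecx };

	for (i = 0; i < sizeof(toy_cr4_map) / sizeof(toy_cr4_map[0]); i++) {
		if (regs[toy_cr4_map[i].reg] & toy_cr4_map[i].cpuid_bit)
			cr4 |= toy_cr4_map[i].cr4_bit;
	}

	return cr4;
}
```

A table keeps each CPUID flag next to the CR4 bit it gates, at the cost of
being less greppable than the explicit if-chain used in the patch.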

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 .../selftests/kvm/include/x86_64/processor.h  | 17 ++++
 .../selftests/kvm/include/x86_64/vmx.h        |  4 -
 .../selftests/kvm/x86_64/set_sregs_test.c     | 92 ++++++++++++++++++-
 3 files changed, 108 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 82b7fe16a824..29f0bd7d8271 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -27,6 +27,7 @@
 #define X86_CR4_OSFXSR		(1ul << 9)
 #define X86_CR4_OSXMMEXCPT	(1ul << 10)
 #define X86_CR4_UMIP		(1ul << 11)
+#define X86_CR4_LA57		(1ul << 12)
 #define X86_CR4_VMXE		(1ul << 13)
 #define X86_CR4_SMXE		(1ul << 14)
 #define X86_CR4_FSGSBASE	(1ul << 16)
@@ -36,6 +37,22 @@
 #define X86_CR4_SMAP		(1ul << 21)
 #define X86_CR4_PKE		(1ul << 22)
 
+/* CPUID.1.ECX */
+#define CPUID_VMX		(1ul << 5)
+#define CPUID_SMX		(1ul << 6)
+#define CPUID_PCID		(1ul << 17)
+#define CPUID_XSAVE		(1ul << 26)
+
+/* CPUID.7.EBX */
+#define CPUID_FSGSBASE		(1ul << 0)
+#define CPUID_SMEP		(1ul << 7)
+#define CPUID_SMAP		(1ul << 20)
+
+/* CPUID.7.ECX */
+#define CPUID_UMIP		(1ul << 2)
+#define CPUID_PKU		(1ul << 3)
+#define CPUID_LA57		(1ul << 16)
+
 /* General Registers in 64-Bit Mode */
 struct gpr64_regs {
 	u64 rax;
diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 54d624dd6c10..e4da3e784f90 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -11,10 +11,6 @@
 #include <stdint.h>
 #include "processor.h"
 
-#define CPUID_VMX_BIT				5
-
-#define CPUID_VMX				(1 << 5)
-
 /*
  * Definitions of Primary Processor-Based VM-Execution Controls.
  */
diff --git a/tools/testing/selftests/kvm/x86_64/set_sregs_test.c b/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
index 9f7656184f31..318be0bf77ab 100644
--- a/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/set_sregs_test.c
@@ -24,16 +24,106 @@
 
 #define VCPU_ID                  5
 
+static void test_cr4_feature_bit(struct kvm_vm *vm, struct kvm_sregs *orig,
+				 uint64_t feature_bit)
+{
+	struct kvm_sregs sregs;
+	int rc;
+
+	/* Skip the sub-test, the feature is supported. */
+	if (orig->cr4 & feature_bit)
+		return;
+
+	memcpy(&sregs, orig, sizeof(sregs));
+	sregs.cr4 |= feature_bit;
+
+	rc = _vcpu_sregs_set(vm, VCPU_ID, &sregs);
+	TEST_ASSERT(rc, "KVM allowed unsupported CR4 bit (0x%lx)", feature_bit);
+
+	/* Sanity check that KVM didn't change anything. */
+	vcpu_sregs_get(vm, VCPU_ID, &sregs);
+	TEST_ASSERT(!memcmp(&sregs, orig, sizeof(sregs)), "KVM modified sregs");
+}
+
+static uint64_t calc_cr4_feature_bits(struct kvm_vm *vm)
+{
+	struct kvm_cpuid_entry2 *cpuid_1, *cpuid_7;
+	uint64_t cr4;
+
+	cpuid_1 = kvm_get_supported_cpuid_entry(1);
+	cpuid_7 = kvm_get_supported_cpuid_entry(7);
+
+	cr4 = X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE |
+	      X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE | X86_CR4_PGE |
+	      X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT;
+	if (cpuid_7->ecx & CPUID_UMIP)
+		cr4 |= X86_CR4_UMIP;
+	if (cpuid_7->ecx & CPUID_LA57)
+		cr4 |= X86_CR4_LA57;
+	if (cpuid_1->ecx & CPUID_VMX)
+		cr4 |= X86_CR4_VMXE;
+	if (cpuid_1->ecx & CPUID_SMX)
+		cr4 |= X86_CR4_SMXE;
+	if (cpuid_7->ebx & CPUID_FSGSBASE)
+		cr4 |= X86_CR4_FSGSBASE;
+	if (cpuid_1->ecx & CPUID_PCID)
+		cr4 |= X86_CR4_PCIDE;
+	if (cpuid_1->ecx & CPUID_XSAVE)
+		cr4 |= X86_CR4_OSXSAVE;
+	if (cpuid_7->ebx & CPUID_SMEP)
+		cr4 |= X86_CR4_SMEP;
+	if (cpuid_7->ebx & CPUID_SMAP)
+		cr4 |= X86_CR4_SMAP;
+	if (cpuid_7->ecx & CPUID_PKU)
+		cr4 |= X86_CR4_PKE;
+
+	return cr4;
+}
+
 int main(int argc, char *argv[])
 {
 	struct kvm_sregs sregs;
 	struct kvm_vm *vm;
+	uint64_t cr4;
 	int rc;
 
 	/* Tell stdout not to buffer its content */
 	setbuf(stdout, NULL);
 
-	/* Create VM */
+	/*
+	 * Create a dummy VM, specifically to avoid doing KVM_SET_CPUID2, and
+	 * use it to verify all supported CR4 bits can be set prior to defining
+	 * the vCPU model, i.e. without doing KVM_SET_CPUID2.
+	 */
+	vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+	vm_vcpu_add(vm, VCPU_ID);
+
+	vcpu_sregs_get(vm, VCPU_ID, &sregs);
+
+	sregs.cr4 |= calc_cr4_feature_bits(vm);
+	cr4 = sregs.cr4;
+
+	rc = _vcpu_sregs_set(vm, VCPU_ID, &sregs);
+	TEST_ASSERT(!rc, "Failed to set supported CR4 bits (0x%lx)", cr4);
+
+	vcpu_sregs_get(vm, VCPU_ID, &sregs);
+	TEST_ASSERT(sregs.cr4 == cr4, "sregs.CR4 (0x%llx) != CR4 (0x%lx)",
+		    sregs.cr4, cr4);
+
+	/* Verify all unsupported features are rejected by KVM. */
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_UMIP);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_LA57);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_VMXE);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_SMXE);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_FSGSBASE);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_PCIDE);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_OSXSAVE);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_SMEP);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_SMAP);
+	test_cr4_feature_bit(vm, &sregs, X86_CR4_PKE);
+	kvm_vm_free(vm);
+
+	/* Create a "real" VM and verify APIC_BASE can be set. */
 	vm = vm_create_default(VCPU_ID, 0, NULL);
 
 	vcpu_sregs_get(vm, VCPU_ID, &sregs);
-- 
2.28.0



* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
                   ` (5 preceding siblings ...)
  2020-10-07  1:44 ` [PATCH 6/6] KVM: selftests: Verify supported CR4 bits can be set before KVM_SET_CPUID2 Sean Christopherson
@ 2020-10-08 16:00 ` stsp
  2020-10-08 17:59   ` Sean Christopherson
  2020-11-13 11:36 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Paolo Bonzini
  7 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-10-08 16:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

07.10.2020 04:44, Sean Christopherson wrote:
> Two bug fixes to handle KVM_SET_SREGS without a preceding KVM_SET_CPUID2.
Hi Sean & KVM devs.

I tested the patches, and wherever I
set VMXE in CR4, I now get
KVM: KVM_SET_SREGS: Invalid argument
Before the patch I was able (with many
problems, but still) to set VMXE sometimes.

So it's a NAK so far, waiting for an update. :)


* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-08 16:00 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup stsp
@ 2020-10-08 17:59   ` Sean Christopherson
  2020-10-08 18:18     ` stsp
  0 siblings, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2020-10-08 17:59 UTC (permalink / raw)
  To: stsp
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On Thu, Oct 08, 2020 at 07:00:13PM +0300, stsp wrote:
> 07.10.2020 04:44, Sean Christopherson wrote:
> >Two bug fixes to handle KVM_SET_SREGS without a preceding KVM_SET_CPUID2.
> Hi Sean & KVM devs.
> 
> I tested the patches, and wherever I
> set VMXE in CR4, I now get
> KVM: KVM_SET_SREGS: Invalid argument
> Before the patch I was able (with many
> problems, but still) to set VMXE sometimes.
> 
> So it's a NAK so far, waiting for an update. :)

IIRC, you said you were going to test on AMD?  Assuming that's correct, -EINVAL
is the expected behavior.  KVM was essentially lying before; it never actually
set CR4.VMXE in hardware, it just didn't properly detect the error and so VMXE
was set in KVM's shadow of the guest's CR4.


* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-08 17:59   ` Sean Christopherson
@ 2020-10-08 18:18     ` stsp
  2020-10-09  4:04       ` Sean Christopherson
  0 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-10-08 18:18 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

08.10.2020 20:59, Sean Christopherson wrote:
> On Thu, Oct 08, 2020 at 07:00:13PM +0300, stsp wrote:
>> 07.10.2020 04:44, Sean Christopherson wrote:
>>> Two bug fixes to handle KVM_SET_SREGS without a preceding KVM_SET_CPUID2.
>> Hi Sean & KVM devs.
>>
>> I tested the patches, and wherever I
>> set VMXE in CR4, I now get
>> KVM: KVM_SET_SREGS: Invalid argument
>> Before the patch I was able (with many
>> problems, but still) to set VMXE sometimes.
>>
>> So it's a NAK so far, waiting for an update. :)
> IIRC, you said you were going to test on AMD?  Assuming that's correct,

Yes, that is true.


>   -EINVAL
> is the expected behavior.  KVM was essentially lying before; it never actually
> set CR4.VMXE in hardware, it just didn't properly detect the error and so VMXE
> was set in KVM's shadow of the guest's CR4.

Hmm. But at least it was lying
similarly on AMD and Intel CPUs. :)
So I was able to reproduce the problems
myself.
Do you mean, any AMD tests are now
useless, and we need to proceed with
Intel tests only?

One additional question.
On old Intel CPUs we needed to set
VMXE in the guest to make it work in
nested-guest mode.
Is that still needed with your patches?
Or will nested-guest mode now work
even on older Intel CPUs, with KVM
setting VMXE for us when needed?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-08 18:18     ` stsp
@ 2020-10-09  4:04       ` Sean Christopherson
  2020-10-09 14:11         ` stsp
  0 siblings, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2020-10-09  4:04 UTC (permalink / raw)
  To: stsp
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On Thu, Oct 08, 2020 at 09:18:18PM +0300, stsp wrote:
> 08.10.2020 20:59, Sean Christopherson wrote:
> >On Thu, Oct 08, 2020 at 07:00:13PM +0300, stsp wrote:
> >>07.10.2020 04:44, Sean Christopherson wrote:
> >>>Two bug fixes to handle KVM_SET_SREGS without a preceding KVM_SET_CPUID2.
> >>Hi Sean & KVM devs.
> >>
> >>I tested the patches, and wherever I
> >>set VMXE in CR4, I now get
> >>KVM: KVM_SET_SREGS: Invalid argument
> >>Before the patch I was able (with many
> >>problems, but still) to set VMXE sometimes.
> >>
> >>So it's a NAK so far, waiting for an update. :)
> >IIRC, you said you were going to test on AMD?  Assuming that's correct,
> 
> Yes, that is true.
> 
> 
> >  -EINVAL
> >is the expected behavior.  KVM was essentially lying before; it never actually
> >set CR4.VMXE in hardware, it just didn't properly detect the error and so VMXE
> >was set in KVM's shadow of the guest's CR4.
> 
> Hmm. But at least it was lying
> similarly on AMD and Intel CPUs. :)
> So I was able to reproduce the problems
> myself.
> Do you mean, any AMD tests are now useless, and we need to proceed with Intel
> tests only?

For anything VMXE related, yes.

> Then additional question.
> On old Intel CPUs we needed to set VMXE in guest to make it to work in
> nested-guest mode.
> Is it still needed even with your patches?
> Or the nested-guest mode will work now even on older Intel CPUs and KVM will
> set VMXE for us itself, when needed?

I'm struggling to even come up with a theory as to how setting VMXE from
userspace would have impacted KVM with unrestricted_guest=n, let alone fixed
anything.

CR4.VMXE must always be 1 in _hardware_ when VMX is on, including when running
the guest.  But KVM forces vmcs.GUEST_CR4.VMXE=1 at all times, regardless of
the guest's actual value (the guest sees a shadow value when it reads CR4).

And unless I grossly misunderstand dosemu2, it's not doing anything related to
nested virtualization, i.e. the stuffing VMXE=1 for the guest's shadow value
should have absolutely zero impact.

More than likely, VMXE was a red herring.  Given that the reporter is also
seeing the same bug on bare metal after moving to kernel 5.4, odds are good
the issue is related to unrestricted_guest=n and has nothing to do with nVMX.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-09  4:04       ` Sean Christopherson
@ 2020-10-09 14:11         ` stsp
  2020-10-09 15:30           ` Sean Christopherson
  0 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-10-09 14:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

09.10.2020 07:04, Sean Christopherson wrote:
>> Hmm. But at least it was lying
>> similarly on AMD and Intel CPUs. :)
>> So I was able to reproduce the problems
>> myself.
>> Do you mean, any AMD tests are now useless, and we need to proceed with Intel
>> tests only?
> For anything VMXE related, yes.

What would be the expected behaviour
on Intel, if it is set? Any difference with AMD?


>> Then additional question.
>> On old Intel CPUs we needed to set VMXE in guest to make it to work in
>> nested-guest mode.
>> Is it still needed even with your patches?
>> Or the nested-guest mode will work now even on older Intel CPUs and KVM will
>> set VMXE for us itself, when needed?
> I'm struggling to even come up with a theory as to how setting VMXE from
> userspace would have impacted KVM with unrestricted_guest=n, let alone fixed
> anything.
>
> CR4.VMXE must always be 1 in _hardware_ when VMX is on, including when running
> the guest.  But KVM forces vmcs.GUEST_CR4.VMXE=1 at all times, regardless of
> the guest's actual value (the guest sees a shadow value when it reads CR4).
>
> And unless I grossly misunderstand dosemu2, it's not doing anything related to
> nested virtualization, i.e. the stuffing VMXE=1 for the guest's shadow value
> should have absolutely zero impact.
>
> More than likely, VMXE was a red herring.

Yes, it was. :(
(as you can see from the end of the
github thread)


>    Given that the reporter is also
> seeing the same bug on bare metal after moving to kernel 5.4, odds are good
> the issue is related to unrestricted_guest=n and has nothing to do with nVMX.

But we do not use unrestricted guest.
We use v86 under KVM.
The only other effect of setting VMXE
was clearing VME. Which shouldn't affect
anything either, right?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-09 14:11         ` stsp
@ 2020-10-09 15:30           ` Sean Christopherson
  2020-10-09 15:48             ` stsp
                               ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-09 15:30 UTC (permalink / raw)
  To: stsp
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On Fri, Oct 09, 2020 at 05:11:51PM +0300, stsp wrote:
> 09.10.2020 07:04, Sean Christopherson wrote:
> >>Hmm. But at least it was lying
> >>similarly on AMD and Intel CPUs. :)
> >>So I was able to reproduce the problems
> >>myself.
> >>Do you mean, any AMD tests are now useless, and we need to proceed with Intel
> >>tests only?
> >For anything VMXE related, yes.
> 
> What would be the expected behaviour on Intel, if it is set? Any difference
> with AMD?

On Intel, userspace should be able to stuff CR4.VMXE=1 via KVM_SET_SREGS if
the 'nested' module param is 1, e.g. if 'modprobe kvm_intel nested=1'.  Note,
'nested' is enabled by default on kernel 5.0 and later.

With AMD, setting CR4.VMXE=1 is never allowed, as AMD doesn't support VMX;
AMD's virtualization solution is called SVM (Secure Virtual Machine).  KVM
doesn't support nesting VMX within SVM or vice versa.
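
As a rough illustration of the expectations above, the outcome of
KVM_SET_SREGS with respect to CR4.VMXE can be modeled as below.  This is a
minimal sketch, not KVM's actual code: the enum, the macro name and the
function are hypothetical, and every CR4 bit other than VMXE is ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_VMXE (UINT64_C(1) << 13)	/* CR4.VMXE is bit 13 */

/* Hypothetical host description, for illustration only. */
enum host_vendor { HOST_INTEL, HOST_AMD };

/*
 * Model of the errno userspace should expect from KVM_SET_SREGS when
 * only CR4.VMXE is considered.  Returns 0 for success or EINVAL.
 * 'nested' stands for the kvm_intel 'nested' module param; it has no
 * effect on the VMXE outcome for an AMD host.
 */
static int expected_vmxe_result(enum host_vendor vendor, bool nested,
				uint64_t cr4)
{
	if (!(cr4 & CR4_VMXE))
		return 0;		/* VMXE not requested, nothing to reject */
	if (vendor == HOST_INTEL && nested)
		return 0;		/* nVMX allowed, VMXE may be stuffed */
	return EINVAL;			/* AMD, or Intel with nested=0 */
}
```

A real test would issue the ioctl on a vCPU fd and compare errno against
this table for each host configuration.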

> >>Then additional question.
> >>On old Intel CPUs we needed to set VMXE in guest to make it to work in
> >>nested-guest mode.
> >>Is it still needed even with your patches?
> >>Or the nested-guest mode will work now even on older Intel CPUs and KVM will
> >>set VMXE for us itself, when needed?
> >I'm struggling to even come up with a theory as to how setting VMXE from
> >userspace would have impacted KVM with unrestricted_guest=n, let alone fixed
> >anything.
> >
> >CR4.VMXE must always be 1 in _hardware_ when VMX is on, including when running
> >the guest.  But KVM forces vmcs.GUEST_CR4.VMXE=1 at all times, regardless of
> >the guest's actual value (the guest sees a shadow value when it reads CR4).
> >
> >And unless I grossly misunderstand dosemu2, it's not doing anything related to
> >nested virtualization, i.e. the stuffing VMXE=1 for the guest's shadow value
> >should have absolutely zero impact.
> >
> >More than likely, VMXE was a red herring.
> 
> Yes, it was. :( (as you can see from the end of the github thread)
> 
> 
> >   Given that the reporter is also
> >seeing the same bug on bare metal after moving to kernel 5.4, odds are good
> >the issue is related to unrestricted_guest=n and has nothing to do with nVMX.
> 
> But we do not use unrestricted guest.
> We use v86 under KVM.

Unrestricted guest can kick in even if CR0.PG=1 && CR0.PE=1, e.g. there are
segmentation checks that apply if and only if unrestricted_guest=0.  Long story
short, without a deep audit, it's basically impossible to rule out a dependency
on unrestricted guest since you're playing around with v86.
 
> The only other effect of setting VMXE was clearing VME. Which shouldn't
> affect anything either, right?

Hmm, clearing VME would mean that exceptions/interrupts within the guest would
trigger a switch out of v86 and into vanilla protected mode.  v86 and PM have
different consistency checks, particularly for segmentation, so it's plausible
that clearing CR4.VME inadvertently worked around the bug by avoiding invalid
guest state for v86.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-09 15:30           ` Sean Christopherson
@ 2020-10-09 15:48             ` stsp
  2020-10-09 16:11               ` Sean Christopherson
  2020-12-07 11:19             ` KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup) stsp
  2020-12-07 11:24             ` stsp
  2 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-10-09 15:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

09.10.2020 18:30, Sean Christopherson wrote:
> On Fri, Oct 09, 2020 at 05:11:51PM +0300, stsp wrote:
>> 09.10.2020 07:04, Sean Christopherson wrote:
>>>> Hmm. But at least it was lying
>>>> similarly on AMD and Intel CPUs. :)
>>>> So I was able to reproduce the problems
>>>> myself.
>>>> Do you mean, any AMD tests are now useless, and we need to proceed with Intel
>>>> tests only?
>>> For anything VMXE related, yes.
>> What would be the expected behaviour on Intel, if it is set? Any difference
>> with AMD?
> On Intel, userspace should be able to stuff CR4.VMXE=1 via KVM_SET_SREGS if
> the 'nested' module param is 1, e.g. if 'modprobe kvm_intel nested=1'.  Note,
> 'nested' is enabled by default on kernel 5.0 and later.

So if I understand you correctly, we
need to test that:
- with nested=0 VMXE gives EINVAL
- with nested=1 VMXE changes nothing
visible, except probably to allow guest
to read that value (we won't test guest
reading though).

Is this correct?


> With AMD, setting CR4.VMXE=1 is never allowed as AMD doesn't support VMX,

OK, for that I can give you a
Tested-by: Stas Sergeev <stsp@users.sourceforge.net>

because I confirm that on AMD it now
consistently returns EINVAL, whereas
without your patches it did random crap,
depending on whether or not it was the
first call to KVM_SET_SREGS.


>> But we do not use unrestricted guest.
>> We use v86 under KVM.
> Unrestricted guest can kick in even if CR0.PG=1 && CR0.PE=1, e.g. there are
> segmentation checks that apply if and only if unrestricted_guest=0.  Long story
> short, without a deep audit, it's basically impossible to rule out a dependency
> on unrestricted guest since you're playing around with v86.

You mean "unrestricted_guest" as a module
parameter, rather than the similarly named CPU
feature, right? So we may depend on the
unrestricted_guest parameter, but not on a
hardware feature, correct?


>> The only other effect of setting VMXE was clearing VME. Which shouldn't
>> affect anything either, right?
> Hmm, clearing VME would mean that exceptions/interrupts within the guest would
> trigger a switch out of v86 and into vanilla protected mode.  v86 and PM have
> different consistency checks, particularly for segmentation, so it's plausible
> that clearing CR4.VME inadvertently worked around the bug by avoiding invalid
> guest state for v86.

Let's assume that was the case.
With those github guys it's not possible
to do any consistent checks. :(


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-09 15:48             ` stsp
@ 2020-10-09 16:11               ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-10-09 16:11 UTC (permalink / raw)
  To: stsp
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On Fri, Oct 09, 2020 at 06:48:21PM +0300, stsp wrote:
> 09.10.2020 18:30, Sean Christopherson wrote:
> >On Fri, Oct 09, 2020 at 05:11:51PM +0300, stsp wrote:
> >>09.10.2020 07:04, Sean Christopherson wrote:
> >>>>Hmm. But at least it was lying
> >>>>similarly on AMD and Intel CPUs. :)
> >>>>So I was able to reproduce the problems
> >>>>myself.
> >>>>Do you mean, any AMD tests are now useless, and we need to proceed with Intel
> >>>>tests only?
> >>>For anything VMXE related, yes.
> >>What would be the expected behaviour on Intel, if it is set? Any difference
> >>with AMD?
> >On Intel, userspace should be able to stuff CR4.VMXE=1 via KVM_SET_SREGS if
> >the 'nested' module param is 1, e.g. if 'modprobe kvm_intel nested=1'.  Note,
> >'nested' is enabled by default on kernel 5.0 and later.
> 
> So if I understand you correctly, we
> need to test that:
> - with nested=0 VMXE gives EINVAL
> - with nested=1 VMXE changes nothing
> visible, except probably to allow guest
> to read that value (we won't test guest
> reading though).
> 
> Is this correct?

Yep, exactly!
 
> >With AMD, setting CR4.VMXE=1 is never allowed as AMD doesn't support VMX,
> 
> OK, for that I can give you a
> Tested-by: Stas Sergeev <stsp@users.sourceforge.net>
> 
> because I confirm that on AMD it now consistently returns EINVAL, whereas
> without your patches it did random crap, depending on whether it is a first
> call to KVM_SET_SREGS, or not first.
> 
> 
> >>But we do not use unrestricted guest.
> >>We use v86 under KVM.
> >Unrestricted guest can kick in even if CR0.PG=1 && CR0.PE=1, e.g. there are
> >segmentation checks that apply if and only if unrestricted_guest=0.  Long story
> >short, without a deep audit, it's basically impossible to rule out a dependency
> >on unrestricted guest since you're playing around with v86.
> 
> You mean "unrestricted_guest" as a module parameter, rather than the similar
> named CPU feature, right? So we may depend on unrestricted_guest parameter,
> but not on a hardware feature, correct?

The unrestricted_guest module param is tied directly to the hardware feature,
i.e. if kvm_intel.unrestricted_guest=0 then KVM will run guests with
unrestricted guest disabled.  That doesn't necessarily mean any of the
behavior that is allowed by unrestricted guest will be encountered, but if
it is encountered, then it will be handled by the CPU instead of causing a
VM-Exit and requiring KVM emulation.

The reporter is using an old CPU that doesn't support unrestricted guest,
so both the hardware feature and the module param will be off/0.

> >>The only other effect of setting VMXE was clearing VME. Which shouldn't
> >>affect anything either, right?
> >Hmm, clearing VME would mean that exceptions/interrupts within the guest would
> >trigger a switch out of v86 and into vanilla protected mode.  v86 and PM have
> >different consistency checks, particularly for segmentation, so it's plausible
> >that clearing CR4.VME inadvertently worked around the bug by avoiding invalid
> >guest state for v86.
> 
> Let's assume that was the case.  With those github guys it's not possible to do
> any consistent checks. :(

K.  If this is ever a problem in the future, having a relatively simple
reproducer, e.g. something we can run without having to build/install a
variety of tools, would make it easier to debug.  In theory, the bug should be
reproducible even on modern hardware by loading KVM with unrestricted_guest=0.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup
  2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
                   ` (6 preceding siblings ...)
  2020-10-08 16:00 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup stsp
@ 2020-11-13 11:36 ` Paolo Bonzini
  7 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2020-11-13 11:36 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Stas Sergeev

On 07/10/20 03:44, Sean Christopherson wrote:
> Two bug fixes to handle KVM_SET_SREGS without a preceding KVM_SET_CPUID2.
> 
> The overarching issue is that kvm_x86_ops.set_cr4() can fail, but its
> invocation from __set_sregs(), a.k.a. KVM_SET_SREGS, ignores the result.
> Fix the issue by moving all validity checks out of .set_cr4() in one way
> or another.
> 
> I intentionally omitted a Cc to stable.  The first bug fix in particular
> may break stable trees as it simply removes a check, and I don't know that
> stable trees have the generic CR4 reserved bit check that is needed to
> prevent the guest from setting VMXE when nVMX is not allowed.
> 
> Sean Christopherson (6):
>    KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4()
>    KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4()
>    KVM: SVM: Drop VMXE check from svm_set_cr4()
>    KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook
>    KVM: x86: Return bool instead of int for CR4 and SREGS validity checks
>    KVM: selftests: Verify supported CR4 bits can be set before
>      KVM_SET_CPUID2
> 
>   arch/x86/include/asm/kvm_host.h               |  3 +-
>   arch/x86/kvm/svm/nested.c                     |  2 +-
>   arch/x86/kvm/svm/svm.c                        | 12 ++-
>   arch/x86/kvm/svm/svm.h                        |  2 +-
>   arch/x86/kvm/vmx/nested.c                     |  2 +-
>   arch/x86/kvm/vmx/vmx.c                        | 35 +++----
>   arch/x86/kvm/vmx/vmx.h                        |  2 +-
>   arch/x86/kvm/x86.c                            | 28 +++---
>   arch/x86/kvm/x86.h                            |  2 +-
>   .../selftests/kvm/include/x86_64/processor.h  | 17 ++++
>   .../selftests/kvm/include/x86_64/vmx.h        |  4 -
>   .../selftests/kvm/x86_64/set_sregs_test.c     | 92 ++++++++++++++++++-
>   12 files changed, 153 insertions(+), 48 deletions(-)
> 

Queued, thanks.

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
  2020-10-09 15:30           ` Sean Christopherson
  2020-10-09 15:48             ` stsp
@ 2020-12-07 11:19             ` stsp
  2020-12-07 11:24             ` stsp
  2 siblings, 0 replies; 24+ messages in thread
From: stsp @ 2020-12-07 11:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

09.10.2020 18:30, Sean Christopherson wrote:
>> The only other effect of setting VMXE was clearing VME. Which shouldn't
>> affect anything either, right?
> Hmm, clearing VME would mean that exceptions/interrupts within the guest would
> trigger a switch out of v86 and into vanilla protected mode.  v86 and PM have
> different consistency checks, particularly for segmentation, so it's plausible
> that clearing CR4.VME inadvertently worked around the bug by avoiding invalid
> guest state for v86.

Almost.

So with your patch set (thanks!) and a
bit of further investigation, it has now become
clear where the problem is.
We have this code:
---

cpuid->nent = 2;
// Use the same values as in emu-i386/simx86/interp.c
// (Pentium 133-200MHz, "GenuineIntel")
cpuid->entries[0] = (struct kvm_cpuid_entry) {
    .function = 0, .eax = 1, .ebx = 0x756e6547,
    .ecx = 0x6c65746e, .edx = 0x49656e69 };
// family 5, model 2, stepping 12, fpu vme de pse tsc msr mce cx8
cpuid->entries[1] = (struct kvm_cpuid_entry) {
    .function = 1, .eax = 0x052c, .ebx = 0, .ecx = 0, .edx = 0x1bf };
ret = ioctl(vcpufd, KVM_SET_CPUID, cpuid);
free(cpuid);
if (ret == -1) {
    perror("KVM: KVM_SET_CPUID");
    return 0;
}

---

It tries to enable VME among other things.
qemu appears to disable VME by default,
unless you do "-cpu host". So we have a situation where
the host (which is qemu) doesn't have VME,
and guest (dosemu) is trying to enable it.
Now obviously KVM_SET_CPUID doesn't check anything
at all and returns success. That later turns
into an invalid guest state.

Question: should KVM_SET_CPUID check for
supported bits, and return an error if not everything
is supported?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
  2020-10-09 15:30           ` Sean Christopherson
  2020-10-09 15:48             ` stsp
  2020-12-07 11:19             ` KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup) stsp
@ 2020-12-07 11:24             ` stsp
  2020-12-07 11:29               ` Paolo Bonzini
  2 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-12-07 11:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

[re-send because of bad formatting]

09.10.2020 18:30, Sean Christopherson wrote:
>> The only other effect of setting VMXE was clearing VME. Which shouldn't
>> affect anything either, right?
> Hmm, clearing VME would mean that exceptions/interrupts within the guest would
> trigger a switch out of v86 and into vanilla protected mode.  v86 and PM have
> different consistency checks, particularly for segmentation, so it's plausible
> that clearing CR4.VME inadvertently worked around the bug by avoiding invalid
> guest state for v86.

Almost.

So with your patch set (thanks!) and a
bit of further investigation, it has now become
clear where the problem is.
We have this code:
---

cpuid->nent = 2;
// Use the same values as in emu-i386/simx86/interp.c
// (Pentium 133-200MHz, "GenuineIntel")
cpuid->entries[0] = (struct kvm_cpuid_entry) {
    .function = 0, .eax = 1, .ebx = 0x756e6547,
    .ecx = 0x6c65746e, .edx = 0x49656e69 };
// family 5, model 2, stepping 12, fpu vme de pse tsc msr mce cx8
cpuid->entries[1] = (struct kvm_cpuid_entry) {
    .function = 1, .eax = 0x052c, .ebx = 0, .ecx = 0, .edx = 0x1bf };
ret = ioctl(vcpufd, KVM_SET_CPUID, cpuid);
free(cpuid);
if (ret == -1) {
    perror("KVM: KVM_SET_CPUID");
    return 0;
}

---


It tries to enable VME among other things.
qemu appears to disable VME by default,
unless you do "-cpu host". So we have a situation where
the host (which is qemu) doesn't have VME,
and guest (dosemu) is trying to enable it.
Now obviously KVM_SET_CPUID doesn't check anything
at all and returns success. That later turns
into an invalid guest state.


Question: should KVM_SET_CPUID check for
supported bits, and return an error if not everything
is supported?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
  2020-12-07 11:24             ` stsp
@ 2020-12-07 11:29               ` Paolo Bonzini
  2020-12-07 11:47                 ` stsp
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2020-12-07 11:29 UTC (permalink / raw)
  To: stsp, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On 07/12/20 12:24, stsp wrote:
> It tries to enable VME among other things.
> qemu appears to disable VME by default,
> unless you do "-cpu host". So we have a situation where
> the host (which is qemu) doesn't have VME,
> and guest (dosemu) is trying to enable it.
> Now obviously KVM_SET_CPUID doesn't check anything
> at all and returns success. That later turns
> into an invalid guest state.
> 
> 
> Question: should KVM_SET_CPUID check for
> supported bits, and return an error if not everything
> is supported?

No, it is intentional.  Most bits of CPUID are never checked by KVM,
so userspace is supposed to set values that make sense or just copy the
value of KVM_GET_SUPPORTED_CPUID more or less blindly.

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
  2020-12-07 11:29               ` Paolo Bonzini
@ 2020-12-07 11:47                 ` stsp
       [not found]                   ` <CABgObfYS57_ez-t=eu9+3S2bhSXC_9DTj=64Sna2jnYEMYo2Ag@mail.gmail.com>
  2020-12-07 23:59                   ` Jim Mattson
  0 siblings, 2 replies; 24+ messages in thread
From: stsp @ 2020-12-07 11:47 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

07.12.2020 14:29, Paolo Bonzini wrote:
> On 07/12/20 12:24, stsp wrote:
>> It tries to enable VME among other things.
>> qemu appears to disable VME by default,
>> unless you do "-cpu host". So we have a situation where
>> the host (which is qemu) doesn't have VME,
>> and guest (dosemu) is trying to enable it.
>> Now obviously KVM_SET_CPUID doesn't check anything
>> at all and returns success. That later turns
>> into an invalid guest state.
>>
>>
>> Question: should KVM_SET_CPUID check for
>> supported bits, and return an error if not everything
>> is supported?
>
> No, it is intentional.  Most bits of CPUID are not ever checked by
> KVM, so userspace is supposed to set values that make sense
By "that make sense" you probably
meant to say "the bits that make sense, masked
with the ones returned by KVM_GET_SUPPORTED_CPUID"?

So am I right that KVM_SET_CPUID only "lowers"
the supported bits? In which case I don't need to
call it at all, but instead just call KVM_GET_SUPPORTED_CPUID
and see if the needed bits are supported, and
exit otherwise, right?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
       [not found]                   ` <CABgObfYS57_ez-t=eu9+3S2bhSXC_9DTj=64Sna2jnYEMYo2Ag@mail.gmail.com>
@ 2020-12-07 14:03                     ` stsp
       [not found]                       ` <CABgObfb_4r=k_qakd+48hPar8rzc-P50+dgdoYvQaL2H-po6+g@mail.gmail.com>
  0 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-12-07 14:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

07.12.2020 16:35, Paolo Bonzini wrote:
>
>
> On Mon, 7 Dec 2020, 12:47 stsp <stsp2@yandex.ru> wrote:
>
>     So am I right that KVM_SET_CPUID only "lowers"
>     the supported bits? In which case I don't need to
>     call it at all, but instead just call KVM_GET_SUPPORTED_CPUID
>     and see if the needed bits are supported, and
>     exit otherwise, right?
>
>
> You always have to call KVM_SET_CPUID2, but you can just pass in 
> whatever you got from KVM_GET_SUPPORTED_CPUID.
OK, done that, thanks.
(after checking that KVM_GET_SUPPORTED_CPUID
actually has the needed features itself, otherwise exit).
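
The check described above could look roughly like the following.  This is
only a sketch under stated assumptions: struct cpuid_leaf is a simplified
stand-in for struct kvm_cpuid_entry2 (index and flags omitted), the helper
name is hypothetical, and the array is assumed to hold what
KVM_GET_SUPPORTED_CPUID returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID1_EDX_VME (UINT32_C(1) << 1)	/* CPUID.01H:EDX bit 1 */

/* Simplified stand-in for struct kvm_cpuid_entry2 (assumption). */
struct cpuid_leaf {
	uint32_t function;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Return true if leaf 1 is present and its EDX contains every bit set
 * in required_edx.  A missing leaf 1 is treated as "not supported".
 */
static bool leaf1_has_features(const struct cpuid_leaf *leaves, size_t n,
			       uint32_t required_edx)
{
	assert(leaves != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (leaves[i].function == 1)
			return (leaves[i].edx & required_edx) == required_edx;
	}
	return false;
}
```

If the helper returns false for VME, the program exits; otherwise the
supported table is passed to KVM_SET_CPUID2 unchanged.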

Perhaps it would be good for guest cpuid to
default to the values of KVM_GET_SUPPORTED_CPUID,
so that the user doesn't have to make needless
calls just to copy host features to guest cpuid.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
       [not found]                       ` <CABgObfb_4r=k_qakd+48hPar8rzc-P50+dgdoYvQaL2H-po6+g@mail.gmail.com>
@ 2020-12-07 14:29                         ` stsp
       [not found]                           ` <CABgObfYN7Okdt+YfHtsd3M_00iuWf=UyKPmbQhhYBhoiMtdXuw@mail.gmail.com>
  0 siblings, 1 reply; 24+ messages in thread
From: stsp @ 2020-12-07 14:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

07.12.2020 17:09, Paolo Bonzini wrote:
>
>
> On Mon, 7 Dec 2020, 15:04 stsp <stsp2@yandex.ru> wrote:
>
>     Perhaps it would be good if guest cpuid to
>     have a default values of KVM_GET_SUPPORTED_CPUID,
>     so that the user doesn't have to do the needless
>     calls to just copy host features to guest cpuid.
>
>
> It is too late to change that aspect of the API, unfortunately. We 
> don't know how various userspaces would behave.
Which means some sensible behaviour
already exists if I don't call KVM_SET_CPUID2.
So what is it, #UD on CPUID?
Would be good to have that documented.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
       [not found]                           ` <CABgObfYN7Okdt+YfHtsd3M_00iuWf=UyKPmbQhhYBhoiMtdXuw@mail.gmail.com>
@ 2020-12-07 14:41                             ` stsp
  0 siblings, 0 replies; 24+ messages in thread
From: stsp @ 2020-12-07 14:41 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

07.12.2020 17:34, Paolo Bonzini wrote:
>
>     > It is too late to change that aspect of the API, unfortunately. We
>     > don't know how various userspaces would behave.
>     Which means some sensible behaviour
>     already exists if I don't call KVM_SET_CPUID2.
>     So what is it, #UD on CPUID?
>
>
> I would have to check but I think you always get zeroes; not entirely 
> sensible.
In that case I would argue that you can't
break anything by changing that to something
sensible. :)
But anyway, since my problem is solved,
this is just a potential improvement for the
future, or a case for documenting it.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup)
  2020-12-07 11:47                 ` stsp
       [not found]                   ` <CABgObfYS57_ez-t=eu9+3S2bhSXC_9DTj=64Sna2jnYEMYo2Ag@mail.gmail.com>
@ 2020-12-07 23:59                   ` Jim Mattson
  1 sibling, 0 replies; 24+ messages in thread
From: Jim Mattson @ 2020-12-07 23:59 UTC (permalink / raw)
  To: stsp
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, kvm list, LKML

On Mon, Dec 7, 2020 at 3:47 AM stsp <stsp2@yandex.ru> wrote:
>
> 07.12.2020 14:29, Paolo Bonzini wrote:
> > On 07/12/20 12:24, stsp wrote:
> >> It tries to enable VME among other things.
> >> qemu appears to disable VME by default,
> >> unless you do "-cpu host". So we have a situation where
> >> the host (which is qemu) doesn't have VME,
> >> and guest (dosemu) is trying to enable it.
> >> Now obviously KVM_SET_CPUID doesn't check anything
> >> at all and returns success. That later turns
> >> into an invalid guest state.
> >>
> >>
> >> Question: should KVM_SET_CPUID check for
> >> supported bits, and return an error if not everything
> >> is supported?
> >
> > No, it is intentional.  Most bits of CPUID are not ever checked by
> > KVM, so userspace is supposed to set values that make sense
> By "that make sense" you probably
> meant to say "bits_that_make_sense masked
> with the ones returned by KVM_GET_SUPPORTED_CPUID"?
>
> So am I right that KVM_SET_CPUID only "lowers"
> the supported bits? In which case I don't need to
> call it at all, but instead just call KVM_GET_SUPPORTED_CPUID
> and see if the needed bits are supported, and
> exit otherwise, right?

"Lowers" is a tricky concept for CPUID information. Some feature bits
report 0 for "present" and 1 for "not-present." Some multi-bit fields
are interpreted as numbers, which may be signed or unsigned. Some
multi-bit fields are strings. Some fields have dependencies on other
fields. Etc.
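
For the narrow case discussed above (a single positive-polarity feature
flag such as VME, which is CPUID.01H:EDX bit 1), the "see if the needed
bits are supported" check could look roughly like the sketch below. This
is only an illustration under stated assumptions: struct cpuid_entry is a
simplified stand-in for struct kvm_cpuid_entry2, and cpuid_find() and
cpuid_has_bits() are hypothetical helper names, not KVM API. A plain mask
test like this is not valid for inverted bits, numeric fields, strings,
or fields with dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for struct kvm_cpuid_entry2 (assumption: the real
 * structure in <linux/kvm.h> also carries flags and padding).
 */
struct cpuid_entry {
    uint32_t function;              /* CPUID leaf (EAX input) */
    uint32_t index;                 /* CPUID subleaf (ECX input) */
    uint32_t eax, ebx, ecx, edx;    /* register outputs */
};

enum cpuid_reg { REG_EAX, REG_EBX, REG_ECX, REG_EDX };

/* VME is reported in CPUID.01H:EDX bit 1. */
#define CPUID_1_EDX_VME (UINT32_C(1) << 1)

/* Hypothetical helper: find the entry for a given leaf/subleaf, or NULL. */
const struct cpuid_entry *cpuid_find(const struct cpuid_entry *entries,
                                     size_t n, uint32_t function,
                                     uint32_t index)
{
    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (entries[i].function == function && entries[i].index == index)
            return &entries[i];
    }
    return NULL;
}

/*
 * Hypothetical helper: true only if every bit in 'mask' is set in the
 * chosen output register of the given leaf. Only meaningful for flags
 * where 1 means "present".
 */
bool cpuid_has_bits(const struct cpuid_entry *entries, size_t n,
                    uint32_t function, uint32_t index,
                    enum cpuid_reg reg, uint32_t mask)
{
    const struct cpuid_entry *e = cpuid_find(entries, n, function, index);
    uint32_t value = 0;

    if (e == NULL)
        return false;   /* leaf not reported: treat as unsupported */

    switch (reg) {
    case REG_EAX: value = e->eax; break;
    case REG_EBX: value = e->ebx; break;
    case REG_ECX: value = e->ecx; break;
    case REG_EDX: value = e->edx; break;
    default:      return false;
    }
    return (value & mask) == mask;
}
```

In a real program the entries would be copied out of the array filled in
by the KVM_GET_SUPPORTED_CPUID ioctl on the /dev/kvm file descriptor, and
userspace would call
cpuid_has_bits(entries, n, 1, 0, REG_EDX, CPUID_1_EDX_VME) before letting
the guest enable VME.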

end of thread, other threads:[~2020-12-08  0:00 UTC | newest]

Thread overview: 24+ messages
2020-10-07  1:44 [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Sean Christopherson
2020-10-07  1:44 ` [PATCH 1/6] KVM: VMX: Drop guest CPUID check for VMXE in vmx_set_cr4() Sean Christopherson
2020-10-07  1:44 ` [PATCH 2/6] KVM: VMX: Drop explicit 'nested' check from vmx_set_cr4() Sean Christopherson
2020-10-07  1:44 ` [PATCH 3/6] KVM: SVM: Drop VMXE check from svm_set_cr4() Sean Christopherson
2020-10-07  1:44 ` [PATCH 4/6] KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook Sean Christopherson
2020-10-07  1:44 ` [PATCH 5/6] KVM: x86: Return bool instead of int for CR4 and SREGS validity checks Sean Christopherson
2020-10-07  1:44 ` [PATCH 6/6] KVM: selftests: Verify supported CR4 bits can be set before KVM_SET_CPUID2 Sean Christopherson
2020-10-08 16:00 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup stsp
2020-10-08 17:59   ` Sean Christopherson
2020-10-08 18:18     ` stsp
2020-10-09  4:04       ` Sean Christopherson
2020-10-09 14:11         ` stsp
2020-10-09 15:30           ` Sean Christopherson
2020-10-09 15:48             ` stsp
2020-10-09 16:11               ` Sean Christopherson
2020-12-07 11:19             ` KVM_SET_CPUID doesn't check supported bits (was Re: [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup) stsp
2020-12-07 11:24             ` stsp
2020-12-07 11:29               ` Paolo Bonzini
2020-12-07 11:47                 ` stsp
     [not found]                   ` <CABgObfYS57_ez-t=eu9+3S2bhSXC_9DTj=64Sna2jnYEMYo2Ag@mail.gmail.com>
2020-12-07 14:03                     ` stsp
     [not found]                       ` <CABgObfb_4r=k_qakd+48hPar8rzc-P50+dgdoYvQaL2H-po6+g@mail.gmail.com>
2020-12-07 14:29                         ` stsp
     [not found]                           ` <CABgObfYN7Okdt+YfHtsd3M_00iuWf=UyKPmbQhhYBhoiMtdXuw@mail.gmail.com>
2020-12-07 14:41                             ` stsp
2020-12-07 23:59                   ` Jim Mattson
2020-11-13 11:36 ` [PATCH 0/6] KVM: x86: KVM_SET_SREGS.CR4 bug fixes and cleanup Paolo Bonzini
