* [PATCH v3 0/4] Handle #GP for SVM execution instructions
@ 2021-01-26  8:18 Wei Huang
  2021-01-26  8:18 ` [PATCH v3 1/4] KVM: x86: Factor out x86 instruction emulation with decoding Wei Huang
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Wei Huang @ 2021-01-26  8:18 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, pbonzini, vkuznets, mlevitsk, seanjc, joro, bp,
	tglx, mingo, x86, jmattson, wanpengli, bsd, dgilbert, luto,
	wei.huang2

While running SVM-related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
before checking VMCB's instruction intercept. If EAX falls into such
memory areas, #GP is triggered before #VMEXIT. This causes unexpected #GP
under nested virtualization. To solve this problem, this patchset makes
KVM trap #GP and emulate these SVM instructions accordingly.

Also newer AMD CPUs will change this behavior by triggering #VMEXIT
before #GP. This change is indicated by CPUID_0x8000000A_EDX[28]. Under
this circumstance, #GP interception is not required. This patchset supports
the new feature.
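
(Illustration only, not part of the series: the new behavior could be
probed straight from CPUID roughly as below. The helper name is made up
for this sketch; the series instead defines X86_FEATURE_SVME_ADDR_CHK in
patch 3 and uses boot_cpu_has().)

	/* Sketch: raw probe of CPUID_0x8000000A_EDX[28] */
	static bool svme_addr_chk_supported(void)
	{
		unsigned int eax, ebx, ecx, edx;

		cpuid(0x8000000a, &eax, &ebx, &ecx, &edx);
		return !!(edx & BIT(28));
	}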

This patchset has been verified with vmrun_errata_test and vmware_backdoor
tests of kvm_unit_test on the following configs. Also it was verified that
vmware_backdoor can be turned on under nested on nested.
  * Current CPU: nested, nested on nested
  * New CPU with X86_FEATURE_SVME_ADDR_CHK: nested, nested on nested

v2->v3:
  * Change the decode function name to x86_decode_emulated_instruction()
  * Add a new variable, svm_gp_erratum_intercept, to control interception
  * Turn on VM's X86_FEATURE_SVME_ADDR_CHK feature in svm_set_cpu_caps()
  * Fix instruction emulation for vmware_backdoor under nested-on-nested
  * Minor comment fixes

v1->v2:
  * Factor out instruction decode for sharing
  * Re-org gp_interception() handling for both #GP and vmware_backdoor
  * Use kvm_cpu_cap for X86_FEATURE_SVME_ADDR_CHK feature support
  * Add nested on nested support

Thanks,
-Wei

Wei Huang (4):
  KVM: x86: Factor out x86 instruction emulation with decoding
  KVM: SVM: Add emulation support for #GP triggered by SVM instructions
  KVM: SVM: Add support for SVM instruction address check change
  KVM: SVM: Support #GP handling for the case of nested on nested

 arch/x86/include/asm/cpufeatures.h |   1 +
 arch/x86/kvm/svm/svm.c             | 128 +++++++++++++++++++++++++----
 arch/x86/kvm/x86.c                 |  62 ++++++++------
 arch/x86/kvm/x86.h                 |   2 +
 4 files changed, 152 insertions(+), 41 deletions(-)

-- 
2.27.0



* [PATCH v3 1/4] KVM: x86: Factor out x86 instruction emulation with decoding
  2021-01-26  8:18 [PATCH v3 0/4] Handle #GP for SVM execution instructions Wei Huang
@ 2021-01-26  8:18 ` Wei Huang
  2021-01-26  8:18 ` [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions Wei Huang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Wei Huang @ 2021-01-26  8:18 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, pbonzini, vkuznets, mlevitsk, seanjc, joro, bp,
	tglx, mingo, x86, jmattson, wanpengli, bsd, dgilbert, luto,
	wei.huang2

Move the instruction decode part out of x86_emulate_instruction() so it
can be used in other places. Also move kvm_clear_exception_queue() inside
the if-statement, as it doesn't apply when KVM is coming back from
userspace.
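
As a rough usage sketch (illustration only; this mirrors how a later
patch in this series calls the new helper):

	/* Decode only, then inspect the emulation context */
	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
		return 1;
	ctxt = vcpu->arch.emulate_ctxt;
	/* e.g. examine ctxt->b, ctxt->opcode_len and ctxt->modrm */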

Co-developed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
---
 arch/x86/kvm/x86.c | 62 +++++++++++++++++++++++++++++-----------------
 arch/x86/kvm/x86.h |  2 ++
 2 files changed, 41 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9a8969a6dd06..a1c83cd43c1a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7298,6 +7298,42 @@ static bool is_vmware_backdoor_opcode(struct x86_emulate_ctxt *ctxt)
 	return false;
 }
 
+/*
+ * Decode the instruction to be emulated. Return EMULATION_OK on success.
+ */
+int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
+				    void *insn, int insn_len)
+{
+	int r = EMULATION_OK;
+	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+
+	init_emulate_ctxt(vcpu);
+
+	/*
+	 * We will reenter on the same instruction since we do not set
+	 * complete_userspace_io. This does not handle watchpoints yet,
+	 * those would be handled in the emulate_ops.
+	 */
+	if (!(emulation_type & EMULTYPE_SKIP) &&
+	    kvm_vcpu_check_breakpoint(vcpu, &r))
+		return r;
+
+	ctxt->interruptibility = 0;
+	ctxt->have_exception = false;
+	ctxt->exception.vector = -1;
+	ctxt->perm_ok = false;
+
+	ctxt->ud = emulation_type & EMULTYPE_TRAP_UD;
+
+	r = x86_decode_insn(ctxt, insn, insn_len);
+
+	trace_kvm_emulate_insn_start(vcpu);
+	++vcpu->stat.insn_emulation;
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(x86_decode_emulated_instruction);
+
 int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 			    int emulation_type, void *insn, int insn_len)
 {
@@ -7317,32 +7353,12 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	 */
 	write_fault_to_spt = vcpu->arch.write_fault_to_shadow_pgtable;
 	vcpu->arch.write_fault_to_shadow_pgtable = false;
-	kvm_clear_exception_queue(vcpu);
 
 	if (!(emulation_type & EMULTYPE_NO_DECODE)) {
-		init_emulate_ctxt(vcpu);
-
-		/*
-		 * We will reenter on the same instruction since
-		 * we do not set complete_userspace_io.  This does not
-		 * handle watchpoints yet, those would be handled in
-		 * the emulate_ops.
-		 */
-		if (!(emulation_type & EMULTYPE_SKIP) &&
-		    kvm_vcpu_check_breakpoint(vcpu, &r))
-			return r;
-
-		ctxt->interruptibility = 0;
-		ctxt->have_exception = false;
-		ctxt->exception.vector = -1;
-		ctxt->perm_ok = false;
-
-		ctxt->ud = emulation_type & EMULTYPE_TRAP_UD;
-
-		r = x86_decode_insn(ctxt, insn, insn_len);
+		kvm_clear_exception_queue(vcpu);
 
-		trace_kvm_emulate_insn_start(vcpu);
-		++vcpu->stat.insn_emulation;
+		r = x86_decode_emulated_instruction(vcpu, emulation_type,
+						    insn, insn_len);
 		if (r != EMULATION_OK)  {
 			if ((emulation_type & EMULTYPE_TRAP_UD) ||
 			    (emulation_type & EMULTYPE_TRAP_UD_FORCED)) {
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c5ee0f5ce0f1..482e7f24801e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -273,6 +273,8 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 					  int page_num);
 bool kvm_vector_hashing_enabled(void);
 void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 error_code);
+int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
+				    void *insn, int insn_len);
 int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 			    int emulation_type, void *insn, int insn_len);
 fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu);
-- 
2.27.0



* [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions
  2021-01-26  8:18 [PATCH v3 0/4] Handle #GP for SVM execution instructions Wei Huang
  2021-01-26  8:18 ` [PATCH v3 1/4] KVM: x86: Factor out x86 instruction emulation with decoding Wei Huang
@ 2021-01-26  8:18 ` Wei Huang
  2021-01-26 11:34   ` Paolo Bonzini
  2021-01-26 11:50   ` Maxim Levitsky
  2021-01-26  8:18 ` [PATCH v3 3/4] KVM: SVM: Add support for SVM instruction address check change Wei Huang
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Wei Huang @ 2021-01-26  8:18 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, pbonzini, vkuznets, mlevitsk, seanjc, joro, bp,
	tglx, mingo, x86, jmattson, wanpengli, bsd, dgilbert, luto,
	wei.huang2

From: Bandan Das <bsd@redhat.com>

While running SVM-related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
before checking VMCB's instruction intercept. If EAX falls into such
memory areas, #GP is triggered before #VMEXIT. This causes problems under
nested virtualization. To solve this problem, KVM needs to trap #GP and
check the instruction triggering #GP. For SVM execution instructions,
KVM emulates them.

Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/svm/svm.c | 109 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7ef171790d02..e5ca01e25e89 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -200,6 +200,8 @@ module_param(sev_es, int, 0444);
 bool __read_mostly dump_invalid_vmcb;
 module_param(dump_invalid_vmcb, bool, 0644);
 
+bool svm_gp_erratum_intercept = true;
+
 static u8 rsm_ins_bytes[] = "\x0f\xaa";
 
 static void svm_complete_interrupts(struct vcpu_svm *svm);
@@ -288,6 +290,9 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 		if (!(efer & EFER_SVME)) {
 			svm_leave_nested(svm);
 			svm_set_gif(svm, true);
+			/* #GP intercept is still needed in vmware_backdoor */
+			if (!enable_vmware_backdoor)
+				clr_exception_intercept(svm, GP_VECTOR);
 
 			/*
 			 * Free the nested guest state, unless we are in SMM.
@@ -309,6 +314,10 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 
 	svm->vmcb->save.efer = efer | EFER_SVME;
 	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
+	/* Enable #GP interception for SVM instructions */
+	if (svm_gp_erratum_intercept)
+		set_exception_intercept(svm, GP_VECTOR);
+
 	return 0;
 }
 
@@ -1957,24 +1966,6 @@ static int ac_interception(struct vcpu_svm *svm)
 	return 1;
 }
 
-static int gp_interception(struct vcpu_svm *svm)
-{
-	struct kvm_vcpu *vcpu = &svm->vcpu;
-	u32 error_code = svm->vmcb->control.exit_info_1;
-
-	WARN_ON_ONCE(!enable_vmware_backdoor);
-
-	/*
-	 * VMware backdoor emulation on #GP interception only handles IN{S},
-	 * OUT{S}, and RDPMC, none of which generate a non-zero error code.
-	 */
-	if (error_code) {
-		kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
-		return 1;
-	}
-	return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP);
-}
-
 static bool is_erratum_383(void)
 {
 	int err, i;
@@ -2173,6 +2164,88 @@ static int vmrun_interception(struct vcpu_svm *svm)
 	return nested_svm_vmrun(svm);
 }
 
+enum {
+	NONE_SVM_INSTR,
+	SVM_INSTR_VMRUN,
+	SVM_INSTR_VMLOAD,
+	SVM_INSTR_VMSAVE,
+};
+
+/* Return NONE_SVM_INSTR if not SVM instrs, otherwise return decode result */
+static int svm_instr_opcode(struct kvm_vcpu *vcpu)
+{
+	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+
+	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
+		return NONE_SVM_INSTR;
+
+	switch (ctxt->modrm) {
+	case 0xd8: /* VMRUN */
+		return SVM_INSTR_VMRUN;
+	case 0xda: /* VMLOAD */
+		return SVM_INSTR_VMLOAD;
+	case 0xdb: /* VMSAVE */
+		return SVM_INSTR_VMSAVE;
+	default:
+		break;
+	}
+
+	return NONE_SVM_INSTR;
+}
+
+static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
+{
+	int (*const svm_instr_handlers[])(struct vcpu_svm *svm) = {
+		[SVM_INSTR_VMRUN] = vmrun_interception,
+		[SVM_INSTR_VMLOAD] = vmload_interception,
+		[SVM_INSTR_VMSAVE] = vmsave_interception,
+	};
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	return svm_instr_handlers[opcode](svm);
+}
+
+/*
+ * #GP handling code. Note that #GP can be triggered under the following two
+ * cases:
+ *   1) SVM VM-related instructions (VMRUN/VMSAVE/VMLOAD) that trigger #GP on
+ *      some AMD CPUs when EAX of these instructions is in the reserved memory
+ *      regions (e.g. SMM memory on host).
+ *   2) VMware backdoor
+ */
+static int gp_interception(struct vcpu_svm *svm)
+{
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	u32 error_code = svm->vmcb->control.exit_info_1;
+	int opcode;
+
+	/* Both #GP cases have zero error_code */
+	if (error_code)
+		goto reinject;
+
+	/* Decode the instruction for usage later */
+	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
+		goto reinject;
+
+	opcode = svm_instr_opcode(vcpu);
+
+	if (opcode == NONE_SVM_INSTR) {
+		WARN_ON_ONCE(!enable_vmware_backdoor);
+
+		/*
+		 * VMware backdoor emulation on #GP interception only handles
+		 * IN{S}, OUT{S}, and RDPMC.
+		 */
+		return kvm_emulate_instruction(vcpu,
+				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
+	} else
+		return emulate_svm_instr(vcpu, opcode);
+
+reinject:
+	kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+	return 1;
+}
+
 void svm_set_gif(struct vcpu_svm *svm, bool value)
 {
 	if (value) {
-- 
2.27.0



* [PATCH v3 3/4] KVM: SVM: Add support for SVM instruction address check change
  2021-01-26  8:18 [PATCH v3 0/4] Handle #GP for SVM execution instructions Wei Huang
  2021-01-26  8:18 ` [PATCH v3 1/4] KVM: x86: Factor out x86 instruction emulation with decoding Wei Huang
  2021-01-26  8:18 ` [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions Wei Huang
@ 2021-01-26  8:18 ` Wei Huang
  2021-01-26 11:52   ` Maxim Levitsky
  2021-01-26  8:18 ` [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested Wei Huang
  2021-01-26 11:39 ` [PATCH v3 0/4] Handle #GP for SVM execution instructions Paolo Bonzini
  4 siblings, 1 reply; 13+ messages in thread
From: Wei Huang @ 2021-01-26  8:18 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, pbonzini, vkuznets, mlevitsk, seanjc, joro, bp,
	tglx, mingo, x86, jmattson, wanpengli, bsd, dgilbert, luto,
	wei.huang2

New AMD CPUs have a change that checks the #VMEXIT intercept on special
SVM instructions before checking their EAX against reserved memory
regions. This change is indicated by CPUID_0x8000000A_EDX[28]. If it is
1, #VMEXIT is triggered before #GP, so KVM doesn't need to intercept and
emulate #GP faults for this erratum.

Co-developed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kvm/svm/svm.c             | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 84b887825f12..ea89d6fdd79a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -337,6 +337,7 @@
 #define X86_FEATURE_AVIC		(15*32+13) /* Virtual Interrupt Controller */
 #define X86_FEATURE_V_VMSAVE_VMLOAD	(15*32+15) /* Virtual VMSAVE VMLOAD */
 #define X86_FEATURE_VGIF		(15*32+16) /* Virtual GIF */
+#define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* "" SVME addr check */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
 #define X86_FEATURE_AVX512VBMI		(16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e5ca01e25e89..f9233c79265b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1036,6 +1036,9 @@ static __init int svm_hardware_setup(void)
 		}
 	}
 
+	if (boot_cpu_has(X86_FEATURE_SVME_ADDR_CHK))
+		svm_gp_erratum_intercept = false;
+
 	if (vgif) {
 		if (!boot_cpu_has(X86_FEATURE_VGIF))
 			vgif = false;
-- 
2.27.0



* [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested
  2021-01-26  8:18 [PATCH v3 0/4] Handle #GP for SVM execution instructions Wei Huang
                   ` (2 preceding siblings ...)
  2021-01-26  8:18 ` [PATCH v3 3/4] KVM: SVM: Add support for SVM instruction address check change Wei Huang
@ 2021-01-26  8:18 ` Wei Huang
  2021-01-26 11:39   ` Paolo Bonzini
  2021-01-26 11:59   ` Maxim Levitsky
  2021-01-26 11:39 ` [PATCH v3 0/4] Handle #GP for SVM execution instructions Paolo Bonzini
  4 siblings, 2 replies; 13+ messages in thread
From: Wei Huang @ 2021-01-26  8:18 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, pbonzini, vkuznets, mlevitsk, seanjc, joro, bp,
	tglx, mingo, x86, jmattson, wanpengli, bsd, dgilbert, luto,
	wei.huang2

Under the case of nested on nested (L0->L1->L2->L3), #GP triggered by
SVM instructions can be hidden from L1. Instead the hypervisor can
inject the proper #VMEXIT to inform L1 of what is happening. Thus L1
can avoid invoking the #GP workaround. For this reason we turn on the
guest VM's X86_FEATURE_SVME_ADDR_CHK bit for KVM running inside a VM to
receive the notification and change behavior.

Similarly we check if the vcpu is in guest mode before emulating the
VMware backdoor instructions. For the case of nested on nested, we
let the guest handle it.

Co-developed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/svm.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f9233c79265b..83c401d2709f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -929,6 +929,9 @@ static __init void svm_set_cpu_caps(void)
 
 		if (npt_enabled)
 			kvm_cpu_cap_set(X86_FEATURE_NPT);
+
+		/* Nested VM can receive #VMEXIT instead of triggering #GP */
+		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
 	}
 
 	/* CPUID 0x80000008 */
@@ -2198,6 +2201,11 @@ static int svm_instr_opcode(struct kvm_vcpu *vcpu)
 
 static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
 {
+	const int guest_mode_exit_codes[] = {
+		[SVM_INSTR_VMRUN] = SVM_EXIT_VMRUN,
+		[SVM_INSTR_VMLOAD] = SVM_EXIT_VMLOAD,
+		[SVM_INSTR_VMSAVE] = SVM_EXIT_VMSAVE,
+	};
 	int (*const svm_instr_handlers[])(struct vcpu_svm *svm) = {
 		[SVM_INSTR_VMRUN] = vmrun_interception,
 		[SVM_INSTR_VMLOAD] = vmload_interception,
@@ -2205,7 +2213,14 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
 	};
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	return svm_instr_handlers[opcode](svm);
+	if (is_guest_mode(vcpu)) {
+		svm->vmcb->control.exit_code = guest_mode_exit_codes[opcode];
+		svm->vmcb->control.exit_info_1 = 0;
+		svm->vmcb->control.exit_info_2 = 0;
+
+		return nested_svm_vmexit(svm);
+	} else
+		return svm_instr_handlers[opcode](svm);
 }
 
 /*
@@ -2239,7 +2254,8 @@ static int gp_interception(struct vcpu_svm *svm)
 		 * VMware backdoor emulation on #GP interception only handles
 		 * IN{S}, OUT{S}, and RDPMC.
 		 */
-		return kvm_emulate_instruction(vcpu,
+		if (!is_guest_mode(vcpu))
+			return kvm_emulate_instruction(vcpu,
 				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
 	} else
 		return emulate_svm_instr(vcpu, opcode);
-- 
2.27.0



* Re: [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions
  2021-01-26  8:18 ` [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions Wei Huang
@ 2021-01-26 11:34   ` Paolo Bonzini
  2021-01-26 11:50   ` Maxim Levitsky
  1 sibling, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-26 11:34 UTC (permalink / raw)
  To: Wei Huang, kvm
  Cc: linux-kernel, vkuznets, mlevitsk, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto

On 26/01/21 09:18, Wei Huang wrote:
> 
> @@ -288,6 +290,9 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
>  		if (!(efer & EFER_SVME)) {
>  			svm_leave_nested(svm);
>  			svm_set_gif(svm, true);
> +			/* #GP intercept is still needed in vmware_backdoor */
> +			if (!enable_vmware_backdoor)
> +				clr_exception_intercept(svm, GP_VECTOR);
>  
>  			/*
>  			 * Free the nested guest state, unless we are in SMM.
> @@ -309,6 +314,10 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
>  
>  	svm->vmcb->save.efer = efer | EFER_SVME;
>  	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
> +	/* Enable #GP interception for SVM instructions */
> +	if (svm_gp_erratum_intercept)
> +		set_exception_intercept(svm, GP_VECTOR);
> +
>  	return 0;
>  }
>  

This should be in the else branch of "if (!(efer & EFER_SVME))".  I'll
fix it up myself.
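
For reference, a sketch of what the fixed-up placement would look like
(assuming no other changes to svm_set_efer()):

	if (!(efer & EFER_SVME)) {
		svm_leave_nested(svm);
		svm_set_gif(svm, true);
		/* #GP intercept is still needed in vmware_backdoor */
		if (!enable_vmware_backdoor)
			clr_exception_intercept(svm, GP_VECTOR);
		...
	} else {
		/* Enable #GP interception for SVM instructions */
		if (svm_gp_erratum_intercept)
			set_exception_intercept(svm, GP_VECTOR);
	}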

Paolo



* Re: [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested
  2021-01-26  8:18 ` [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested Wei Huang
@ 2021-01-26 11:39   ` Paolo Bonzini
  2021-01-26 11:59   ` Maxim Levitsky
  1 sibling, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-26 11:39 UTC (permalink / raw)
  To: Wei Huang, kvm
  Cc: linux-kernel, vkuznets, mlevitsk, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto

On 26/01/21 09:18, Wei Huang wrote:
> Under the case of nested on nested (L0->L1->L2->L3), #GP triggered by
> SVM instructions can be hidden from L1. Instead the hypervisor can inject
> the proper #VMEXIT to inform L1 of what is happening. Thus L1 can avoid
> invoking the #GP workaround. For this reason we turn on the guest VM's
> X86_FEATURE_SVME_ADDR_CHK bit for KVM running inside a VM to receive the
> notification and change behavior.

Slightly reworked commit message:

KVM: SVM: Fix #GP handling for doubly-nested virtualization

Under the case of nested on nested (L0, L1, L2 are all hypervisors),
#GP triggered by SVM instructions can be hidden from L1.  Because
we do not support emulation of the vVMLOAD/VMSAVE feature, the
L0 hypervisor can inject the proper #VMEXIT to inform L1 of what is
happening and L1 can avoid invoking the #GP workaround.

Thanks,

Paolo



* Re: [PATCH v3 0/4] Handle #GP for SVM execution instructions
  2021-01-26  8:18 [PATCH v3 0/4] Handle #GP for SVM execution instructions Wei Huang
                   ` (3 preceding siblings ...)
  2021-01-26  8:18 ` [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested Wei Huang
@ 2021-01-26 11:39 ` Paolo Bonzini
  2021-01-26 15:05   ` Wei Huang
  4 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2021-01-26 11:39 UTC (permalink / raw)
  To: Wei Huang, kvm
  Cc: linux-kernel, vkuznets, mlevitsk, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto

On 26/01/21 09:18, Wei Huang wrote:
> While running SVM-related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
> CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
> before checking VMCB's instruction intercept. If EAX falls into such
> memory areas, #GP is triggered before #VMEXIT. This causes unexpected #GP
> under nested virtualization. To solve this problem, this patchset makes
> KVM trap #GP and emulate these SVM instructions accordingly.
> 
> Also newer AMD CPUs will change this behavior by triggering #VMEXIT
> before #GP. This change is indicated by CPUID_0x8000000A_EDX[28]. Under
> this circumstance, #GP interception is not required. This patchset supports
> the new feature.
> 
> This patchset has been verified with vmrun_errata_test and vmware_backdoor
> tests of kvm_unit_test on the following configs. Also it was verified that
> vmware_backdoor can be turned on under nested on nested.
>    * Current CPU: nested, nested on nested
>    * New CPU with X86_FEATURE_SVME_ADDR_CHK: nested, nested on nested
> 
> v2->v3:
>    * Change the decode function name to x86_decode_emulated_instruction()
>    * Add a new variable, svm_gp_erratum_intercept, to control interception
>    * Turn on VM's X86_FEATURE_SVME_ADDR_CHK feature in svm_set_cpu_caps()
>    * Fix instruction emulation for vmware_backdoor under nested-on-nested
>    * Minor comment fixes
> 
> v1->v2:
>    * Factor out instruction decode for sharing
>    * Re-org gp_interception() handling for both #GP and vmware_backdoor
>    * Use kvm_cpu_cap for X86_FEATURE_SVME_ADDR_CHK feature support
>    * Add nested on nested support
> 
> Thanks,
> -Wei
> 
> Wei Huang (4):
>    KVM: x86: Factor out x86 instruction emulation with decoding
>    KVM: SVM: Add emulation support for #GP triggered by SVM instructions
>    KVM: SVM: Add support for SVM instruction address check change
>    KVM: SVM: Support #GP handling for the case of nested on nested
> 
>   arch/x86/include/asm/cpufeatures.h |   1 +
>   arch/x86/kvm/svm/svm.c             | 128 +++++++++++++++++++++++++----
>   arch/x86/kvm/x86.c                 |  62 ++++++++------
>   arch/x86/kvm/x86.h                 |   2 +
>   4 files changed, 152 insertions(+), 41 deletions(-)
> 

Queued, thanks.

Paolo



* Re: [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions
  2021-01-26  8:18 ` [PATCH v3 2/4] KVM: SVM: Add emulation support for #GP triggered by SVM instructions Wei Huang
  2021-01-26 11:34   ` Paolo Bonzini
@ 2021-01-26 11:50   ` Maxim Levitsky
  1 sibling, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-26 11:50 UTC (permalink / raw)
  To: Wei Huang, kvm
  Cc: linux-kernel, pbonzini, vkuznets, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto

On Tue, 2021-01-26 at 03:18 -0500, Wei Huang wrote:
> From: Bandan Das <bsd@redhat.com>
> 
> While running SVM-related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
> CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
> before checking VMCB's instruction intercept. If EAX falls into such
> memory areas, #GP is triggered before #VMEXIT. This causes problems under
> nested virtualization. To solve this problem, KVM needs to trap #GP and
> check the instruction triggering #GP. For SVM execution instructions,
> KVM emulates them.
> 
> Co-developed-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/svm/svm.c | 109 ++++++++++++++++++++++++++++++++++-------
>  1 file changed, 91 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 7ef171790d02..e5ca01e25e89 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -200,6 +200,8 @@ module_param(sev_es, int, 0444);
>  bool __read_mostly dump_invalid_vmcb;
>  module_param(dump_invalid_vmcb, bool, 0644);
>  
> +bool svm_gp_erratum_intercept = true;
I'd expect this to be a module param, so that the user could override
it, just like enable_vmware_backdoor.
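
For illustration, the minimal form of that (a hypothetical sketch using
the standard module_param() plumbing, read-only at runtime):

	bool svm_gp_erratum_intercept = true;
	module_param(svm_gp_erratum_intercept, bool, 0444);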

> +
>  static u8 rsm_ins_bytes[] = "\x0f\xaa";
>  
>  static void svm_complete_interrupts(struct vcpu_svm *svm);
> @@ -288,6 +290,9 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
>  		if (!(efer & EFER_SVME)) {
>  			svm_leave_nested(svm);
>  			svm_set_gif(svm, true);
> +			/* #GP intercept is still needed in vmware_backdoor */
> +			if (!enable_vmware_backdoor)
I would use if (svm_gp_erratum_intercept && !enable_vmware_backdoor) to document
this.

> +				clr_exception_intercept(svm, GP_VECTOR);
>  
>  			/*
>  			 * Free the nested guest state, unless we are in SMM.
> @@ -309,6 +314,10 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
>  
>  	svm->vmcb->save.efer = efer | EFER_SVME;
>  	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
> +	/* Enable #GP interception for SVM instructions */
> +	if (svm_gp_erratum_intercept)
> +		set_exception_intercept(svm, GP_VECTOR);
> +
>  	return 0;
>  }
>  
> @@ -1957,24 +1966,6 @@ static int ac_interception(struct vcpu_svm *svm)
>  	return 1;
>  }
>  
> -static int gp_interception(struct vcpu_svm *svm)
> -{
> -	struct kvm_vcpu *vcpu = &svm->vcpu;
> -	u32 error_code = svm->vmcb->control.exit_info_1;
> -
> -	WARN_ON_ONCE(!enable_vmware_backdoor);	
> -
> -	/*
> -	 * VMware backdoor emulation on #GP interception only handles IN{S},
> -	 * OUT{S}, and RDPMC, none of which generate a non-zero error code.
> -	 */
> -	if (error_code) {
> -		kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
> -		return 1;
> -	}
> -	return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP);
> -}
> -
>  static bool is_erratum_383(void)
>  {
>  	int err, i;
> @@ -2173,6 +2164,88 @@ static int vmrun_interception(struct vcpu_svm *svm)
>  	return nested_svm_vmrun(svm);
>  }
>  
> +enum {
> +	NONE_SVM_INSTR,
> +	SVM_INSTR_VMRUN,
> +	SVM_INSTR_VMLOAD,
> +	SVM_INSTR_VMSAVE,
> +};
> +
> +/* Return NONE_SVM_INSTR if not SVM instrs, otherwise return decode result */
> +static int svm_instr_opcode(struct kvm_vcpu *vcpu)
> +{
> +	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
> +
> +	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
> +		return NONE_SVM_INSTR;
> +
> +	switch (ctxt->modrm) {
> +	case 0xd8: /* VMRUN */
> +		return SVM_INSTR_VMRUN;
> +	case 0xda: /* VMLOAD */
> +		return SVM_INSTR_VMLOAD;
> +	case 0xdb: /* VMSAVE */
> +		return SVM_INSTR_VMSAVE;
> +	default:
> +		break;
> +	}
> +
> +	return NONE_SVM_INSTR;
> +}
> +
> +static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
> +{
> +	int (*const svm_instr_handlers[])(struct vcpu_svm *svm) = {
> +		[SVM_INSTR_VMRUN] = vmrun_interception,
> +		[SVM_INSTR_VMLOAD] = vmload_interception,
> +		[SVM_INSTR_VMSAVE] = vmsave_interception,
> +	};
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	return svm_instr_handlers[opcode](svm);
> +}
> +
> +/*
> + * #GP handling code. Note that #GP can be triggered under the following two
> + * cases:
> + *   1) SVM VM-related instructions (VMRUN/VMSAVE/VMLOAD) that trigger #GP on
> + *      some AMD CPUs when EAX of these instructions is in the reserved memory
> + *      regions (e.g. SMM memory on host).
> + *   2) VMware backdoor
> + */
> +static int gp_interception(struct vcpu_svm *svm)
> +{
> +	struct kvm_vcpu *vcpu = &svm->vcpu;
> +	u32 error_code = svm->vmcb->control.exit_info_1;
> +	int opcode;
> +
> +	/* Both #GP cases have zero error_code */
> +	if (error_code)
> +		goto reinject;
> +
> +	/* Decode the instruction for usage later */
> +	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
> +		goto reinject;
> +
> +	opcode = svm_instr_opcode(vcpu);
> +
> +	if (opcode == NONE_SVM_INSTR) {
> +		WARN_ON_ONCE(!enable_vmware_backdoor);
> +
> +		/*
> +		 * VMware backdoor emulation on #GP interception only handles
> +		 * IN{S}, OUT{S}, and RDPMC.
> +		 */
> +		return kvm_emulate_instruction(vcpu,
> +				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> +	} else

I would check svm_gp_erratum_intercept here: if it is not set, skip the
emulation and print a warning.
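
Something like this, perhaps (hypothetical sketch):

	if (!svm_gp_erratum_intercept) {
		pr_warn_once("unexpected #GP for SVM instruction\n");
		goto reinject;
	}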

> +		return emulate_svm_instr(vcpu, opcode);
> +
> +reinject:
> +	kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
> +	return 1;
> +}
> +
>  void svm_set_gif(struct vcpu_svm *svm, bool value)
>  {
>  	if (value) {


Best regards,
	Maxim Levitsky



* Re: [PATCH v3 3/4] KVM: SVM: Add support for SVM instruction address check change
  2021-01-26  8:18 ` [PATCH v3 3/4] KVM: SVM: Add support for SVM instruction address check change Wei Huang
@ 2021-01-26 11:52   ` Maxim Levitsky
  2021-01-26 15:39     ` Wei Huang
  0 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-26 11:52 UTC (permalink / raw)
  To: Wei Huang, kvm
  Cc: linux-kernel, pbonzini, vkuznets, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto

On Tue, 2021-01-26 at 03:18 -0500, Wei Huang wrote:
> New AMD CPUs have a change that checks the #VMEXIT intercept on special
> SVM instructions before checking their EAX against reserved memory
> regions. This change is indicated by CPUID_0x8000000A_EDX[28]. If it is
> 1, #VMEXIT is triggered before #GP, so KVM doesn't need to intercept and
> emulate #GP faults for this erratum.
> 
> Co-developed-by: Bandan Das <bsd@redhat.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kvm/svm/svm.c             | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 84b887825f12..ea89d6fdd79a 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -337,6 +337,7 @@
>  #define X86_FEATURE_AVIC		(15*32+13) /* Virtual Interrupt Controller */
>  #define X86_FEATURE_V_VMSAVE_VMLOAD	(15*32+15) /* Virtual VMSAVE VMLOAD */
>  #define X86_FEATURE_VGIF		(15*32+16) /* Virtual GIF */
> +#define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* "" SVME addr check */
>  
>  /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
>  #define X86_FEATURE_AVX512VBMI		(16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e5ca01e25e89..f9233c79265b 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1036,6 +1036,9 @@ static __init int svm_hardware_setup(void)
>  		}
>  	}
>  
> +	if (boot_cpu_has(X86_FEATURE_SVME_ADDR_CHK))
> +		svm_gp_erratum_intercept = false;
> +
Again, I would make svm_gp_erratum_intercept a tri-state module param,
and here, if it is in the 'auto' state, do this.

I would also make this code fail if X86_FEATURE_SVME_ADDR_CHK is set but
the user insists on svm_gp_erratum_intercept = true.
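
A sketch of that tri-state idea (hypothetical; -1 meaning 'auto'):

	static int svm_gp_erratum_intercept = -1;
	module_param(svm_gp_erratum_intercept, int, 0444);

	/* in svm_hardware_setup(): resolve 'auto' from the CPUID bit */
	if (svm_gp_erratum_intercept < 0)
		svm_gp_erratum_intercept = !boot_cpu_has(X86_FEATURE_SVME_ADDR_CHK);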

>  	if (vgif) {
>  		if (!boot_cpu_has(X86_FEATURE_VGIF))
>  			vgif = false;


Best regards,
	Maxim Levitsky



* Re: [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested
  2021-01-26  8:18 ` [PATCH v3 4/4] KVM: SVM: Support #GP handling for the case of nested on nested Wei Huang
  2021-01-26 11:39   ` Paolo Bonzini
@ 2021-01-26 11:59   ` Maxim Levitsky
  1 sibling, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-01-26 11:59 UTC (permalink / raw)
  To: Wei Huang, kvm
  Cc: linux-kernel, pbonzini, vkuznets, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto

On Tue, 2021-01-26 at 03:18 -0500, Wei Huang wrote:
> Under the case of nested on nested (L0->L1->L2->L3), #GP triggered by
> SVM instructions can be hidden from L1. Instead the hypervisor can
> inject the proper #VMEXIT to inform L1 of what is happening. Thus L1
> can avoid invoking the #GP workaround. For this reason we turn on the
> guest VM's X86_FEATURE_SVME_ADDR_CHK bit for KVM running inside a VM to
> receive the notification and change behavior.
> 
> Similarly we check if the vcpu is in guest mode before emulating the
> VMware backdoor instructions. For the case of nested on nested, we
> let the guest handle it.
> 
> Co-developed-by: Bandan Das <bsd@redhat.com>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/svm.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index f9233c79265b..83c401d2709f 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -929,6 +929,9 @@ static __init void svm_set_cpu_caps(void)
>  
>  		if (npt_enabled)
>  			kvm_cpu_cap_set(X86_FEATURE_NPT);
> +
> +		/* Nested VM can receive #VMEXIT instead of triggering #GP */
> +		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>  	}
>  
>  	/* CPUID 0x80000008 */
> @@ -2198,6 +2201,11 @@ static int svm_instr_opcode(struct kvm_vcpu *vcpu)
>  
>  static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
>  {
> +	const int guest_mode_exit_codes[] = {
> +		[SVM_INSTR_VMRUN] = SVM_EXIT_VMRUN,
> +		[SVM_INSTR_VMLOAD] = SVM_EXIT_VMLOAD,
> +		[SVM_INSTR_VMSAVE] = SVM_EXIT_VMSAVE,
> +	};
>  	int (*const svm_instr_handlers[])(struct vcpu_svm *svm) = {
>  		[SVM_INSTR_VMRUN] = vmrun_interception,
>  		[SVM_INSTR_VMLOAD] = vmload_interception,
> @@ -2205,7 +2213,14 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
>  	};
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
> -	return svm_instr_handlers[opcode](svm);
> +	if (is_guest_mode(vcpu)) {
> +		svm->vmcb->control.exit_code = guest_mode_exit_codes[opcode];
> +		svm->vmcb->control.exit_info_1 = 0;
> +		svm->vmcb->control.exit_info_2 = 0;
> +
> +		return nested_svm_vmexit(svm);
> +	} else
> +		return svm_instr_handlers[opcode](svm);
>  }
>  
>  /*
> @@ -2239,7 +2254,8 @@ static int gp_interception(struct vcpu_svm *svm)
>  		 * VMware backdoor emulation on #GP interception only handles
>  		 * IN{S}, OUT{S}, and RDPMC.
>  		 */
> -		return kvm_emulate_instruction(vcpu,
> +		if (!is_guest_mode(vcpu))
> +			return kvm_emulate_instruction(vcpu,
>  				EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
>  	} else
>  		return emulate_svm_instr(vcpu, opcode);

To be honest I expected the vmware backdoor fix to be in a separate patch,
but I see that Paolo already took these patches, so I guess it is too late.

Anyway I am very happy to see this workaround merged, and see that bug
disappear forever.

Best regards,
	Maxim Levitsky



* Re: [PATCH v3 0/4] Handle #GP for SVM execution instructions
  2021-01-26 11:39 ` [PATCH v3 0/4] Handle #GP for SVM execution instructions Paolo Bonzini
@ 2021-01-26 15:05   ` Wei Huang
  0 siblings, 0 replies; 13+ messages in thread
From: Wei Huang @ 2021-01-26 15:05 UTC (permalink / raw)
  To: Paolo Bonzini, kvm
  Cc: linux-kernel, vkuznets, mlevitsk, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto



On 1/26/21 5:39 AM, Paolo Bonzini wrote:
> On 26/01/21 09:18, Wei Huang wrote:
>> While running SVM-related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
>> CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
>> before checking VMCB's instruction intercept. If EAX falls into such
>> memory areas, #GP is triggered before #VMEXIT. This causes unexpected #GP
>> under nested virtualization. To solve this problem, this patchset makes
>> KVM trap #GP and emulate these SVM instructions accordingly.
>>
>> Also newer AMD CPUs will change this behavior by triggering #VMEXIT
>> before #GP. This change is indicated by CPUID_0x8000000A_EDX[28]. Under
>> this circumstance, #GP interception is not required. This patchset 
>> supports
>> the new feature.
>>
>> This patchset has been verified with vmrun_errata_test and 
>> vmware_backdoor
>> tests of kvm_unit_test on the following configs. Also it was verified 
>> that
>> vmware_backdoor can be turned on under nested on nested.
>>    * Current CPU: nested, nested on nested
>>    * New CPU with X86_FEATURE_SVME_ADDR_CHK: nested, nested on nested
>>
>> v2->v3:
>>    * Change the decode function name to x86_decode_emulated_instruction()
>>    * Add a new variable, svm_gp_erratum_intercept, to control 
>> interception
>>    * Turn on VM's X86_FEATURE_SVME_ADDR_CHK feature in svm_set_cpu_caps()
>>    * Fix instruction emulation for vmware_backdoor under nested-on-nested
>>    * Minor comment fixes
>>
>> v1->v2:
>>    * Factor out instruction decode for sharing
>>    * Re-org gp_interception() handling for both #GP and vmware_backdoor
>>    * Use kvm_cpu_cap for X86_FEATURE_SVME_ADDR_CHK feature support
>>    * Add nested on nested support
>>
>> Thanks,
>> -Wei
>>
>> Wei Huang (4):
>>    KVM: x86: Factor out x86 instruction emulation with decoding
>>    KVM: SVM: Add emulation support for #GP triggered by SVM instructions
>>    KVM: SVM: Add support for SVM instruction address check change
>>    KVM: SVM: Support #GP handling for the case of nested on nested
>>
>>   arch/x86/include/asm/cpufeatures.h |   1 +
>>   arch/x86/kvm/svm/svm.c             | 128 +++++++++++++++++++++++++----
>>   arch/x86/kvm/x86.c                 |  62 ++++++++------
>>   arch/x86/kvm/x86.h                 |   2 +
>>   4 files changed, 152 insertions(+), 41 deletions(-)
>>
> 
> Queued, thanks.

Thanks. BTW, because we use kvm_cpu_cap_set() in svm_set_cpu_caps(), this
will be reflected in the CPUID received by QEMU. QEMU needs a one-line
patch to declare the new feature. I will send it out this morning.

-Wei

> 
> Paolo
> 



* Re: [PATCH v3 3/4] KVM: SVM: Add support for SVM instruction address check change
  2021-01-26 11:52   ` Maxim Levitsky
@ 2021-01-26 15:39     ` Wei Huang
  0 siblings, 0 replies; 13+ messages in thread
From: Wei Huang @ 2021-01-26 15:39 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: linux-kernel, pbonzini, vkuznets, seanjc, joro, bp, tglx, mingo,
	x86, jmattson, wanpengli, bsd, dgilbert, luto



On 1/26/21 5:52 AM, Maxim Levitsky wrote:
> On Tue, 2021-01-26 at 03:18 -0500, Wei Huang wrote:
>> New AMD CPUs have a change that checks the #VMEXIT intercept on special
>> SVM instructions before checking their EAX against reserved memory
>> regions. This change is indicated by CPUID_0x8000000A_EDX[28]. If it is
>> 1, #VMEXIT is triggered before #GP, so KVM doesn't need to intercept and
>> emulate #GP faults for this erratum.
>>
>> Co-developed-by: Bandan Das <bsd@redhat.com>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> Signed-off-by: Wei Huang <wei.huang2@amd.com>
>> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>> ---
>>   arch/x86/include/asm/cpufeatures.h | 1 +
>>   arch/x86/kvm/svm/svm.c             | 3 +++
>>   2 files changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>> index 84b887825f12..ea89d6fdd79a 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -337,6 +337,7 @@
>>   #define X86_FEATURE_AVIC		(15*32+13) /* Virtual Interrupt Controller */
>>   #define X86_FEATURE_V_VMSAVE_VMLOAD	(15*32+15) /* Virtual VMSAVE VMLOAD */
>>   #define X86_FEATURE_VGIF		(15*32+16) /* Virtual GIF */
>> +#define X86_FEATURE_SVME_ADDR_CHK	(15*32+28) /* "" SVME addr check */
>>   
>>   /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
>>   #define X86_FEATURE_AVX512VBMI		(16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e5ca01e25e89..f9233c79265b 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -1036,6 +1036,9 @@ static __init int svm_hardware_setup(void)
>>   		}
>>   	}
>>   
>> +	if (boot_cpu_has(X86_FEATURE_SVME_ADDR_CHK))
>> +		svm_gp_erratum_intercept = false;
>> +
> Again, I would make svm_gp_erratum_intercept a tri-state module param,
> and here if it is in 'auto' state do this.
> 

I will try to craft a param patch and see if it flies...


