* [PATCH v2 0/6] KVM: nested virt SMM fixes
@ 2017-09-19 15:37 Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 1/6] KVM: x86: introduce ISA specific SMM entry/exit callbacks Ladi Prosek
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

Windows Server 2016 with Hyper-V enabled fails to boot on OVMF with SMM
(OVMF_CODE-need-smm.fd). It turns out that the SMM emulation code in KVM
does not handle nested virtualization very well, leading to a whole bunch
of issues.

For example, Hyper-V uses descriptor table exiting (SECONDARY_EXEC_DESC),
so when the SMM handler tries to switch out of real mode, a VM exit occurs
and is forwarded to a clueless L1.

This series fixes it by switching the vcpu to !guest_mode, i.e. to the L1
state, before entering SMM and then switching back to L2 after the RSM
instruction is emulated.
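
To illustrate the idea, here is a purely illustrative userspace C model of
the intended transitions. All names are made up for the sketch; the real
logic lives in the new per-ISA callbacks introduced by this series:

/* sketch_smm.c - toy model of the SMI/RSM handling described above */
#include <stdbool.h>
#include <stdio.h>

struct toy_vcpu {
	bool guest_mode;         /* currently running L2? */
	bool smm;                /* currently in SMM? */
	bool smm_was_guest_mode; /* saved across SMM, invisible to L1 */
};

static void toy_enter_smm(struct toy_vcpu *v)
{
	/* remember whether the SMI interrupted L2, then drop to the L1 state */
	v->smm_was_guest_mode = v->guest_mode;
	v->guest_mode = false;   /* "nested vmexit", L1 is not given control */
	v->smm = true;
}

static void toy_rsm(struct toy_vcpu *v)
{
	v->smm = false;
	if (v->smm_was_guest_mode) {
		v->guest_mode = true;    /* re-enter L2 */
		v->smm_was_guest_mode = false;
	}
}

int main(void)
{
	struct toy_vcpu v = { .guest_mode = true };

	toy_enter_smm(&v);
	printf("in SMM:    guest_mode=%d\n", v.guest_mode); /* 0: SMM runs in the L1 state */
	toy_rsm(&v);
	printf("after RSM: guest_mode=%d\n", v.guest_mode); /* 1: back in L2 */
	return 0;
}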

Patches 1 and 2 are common to both Intel and AMD, patch 3 fixes Intel, and
patches 5-6 fix AMD. Patch 4 adds more state to the SMRAM save area as
prescribed by the Intel SDM; it is, however, not required to make Windows
work.

v1->v2:
* Moved left_smm detection to emulator_set_hflags (couldn't quite get rid
  of the field despite my original claim) (Paolo)
* Moved the kvm_x86_ops->post_leave_smm() call a few statements down so
  it really runs after all state has been synced.
* Added the smi_allowed callback (new patch 2) to avoid running into
  WARN_ON_ONCE(vmx->nested.nested_run_pending) on Intel.

Ladi Prosek (6):
      KVM: x86: introduce ISA specific SMM entry/exit callbacks
      KVM: x86: introduce ISA specific smi_allowed callback
      KVM: nVMX: fix SMI injection in guest mode
      KVM: nVMX: save nested EPT information in SMRAM state save map
      KVM: nSVM: refactor nested_svm_vmrun
      KVM: nSVM: fix SMI injection in guest mode

 arch/x86/include/asm/kvm_emulate.h |   1 +
 arch/x86/include/asm/kvm_host.h    |   9 ++
 arch/x86/kvm/svm.c                 | 186 ++++++++++++++++++++++++-------------
 arch/x86/kvm/vmx.c                 |  91 ++++++++++++++++--
 arch/x86/kvm/x86.c                 |  17 +++-
 5 files changed, 227 insertions(+), 77 deletions(-)


* [PATCH v2 1/6] KVM: x86: introduce ISA specific SMM entry/exit callbacks
  2017-09-19 15:37 [PATCH v2 0/6] KVM: nested virt SMM fixes Ladi Prosek
@ 2017-09-19 15:37 ` Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 2/6] KVM: x86: introduce ISA specific smi_allowed callback Ladi Prosek
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

Entering and exiting SMM may require ISA specific handling under certain
circumstances. This commit adds two new callbacks with empty implementations.
Actual functionality will be added in following commits.

* prep_enter_smm() is to be called when injecting an SMI, before any
  SMM related vcpu state has been changed
* post_leave_smm() is to be called when emulating the RSM instruction,
  after all SMM related vcpu state has been restored

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  3 +++
 arch/x86/kvm/svm.c                 | 15 +++++++++++++++
 arch/x86/kvm/vmx.c                 | 15 +++++++++++++++
 arch/x86/kvm/x86.c                 | 12 +++++++++++-
 5 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index fde36f189836..8d56703d9e2e 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -298,6 +298,7 @@ struct x86_emulate_ctxt {
 	bool perm_ok; /* do not check permissions if true */
 	bool ud;	/* inject an #UD if host doesn't support insn */
 	bool tf;	/* TF value before instruction (after for syscall/sysret) */
+	bool left_smm;  /* post_leave_smm() needs to be called after emulation */
 
 	bool have_exception;
 	struct x86_exception exception;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 92c9032502d8..26acdb3c9a8f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1058,6 +1058,9 @@ struct kvm_x86_ops {
 	void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
 
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
+
+	int (*prep_enter_smm)(struct kvm_vcpu *vcpu, char *smstate);
+	int (*post_leave_smm)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index af256b786a70..0c5a5999556b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5357,6 +5357,18 @@ static void svm_setup_mce(struct kvm_vcpu *vcpu)
 	vcpu->arch.mcg_cap &= 0x1ff;
 }
 
+static int svm_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
+{
+	/* TODO: Implement */
+	return 0;
+}
+
+static int svm_post_leave_smm(struct kvm_vcpu *vcpu)
+{
+	/* TODO: Implement */
+	return 0;
+}
+
 static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -5467,6 +5479,9 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
 	.update_pi_irte = svm_update_pi_irte,
 	.setup_mce = svm_setup_mce,
+
+	.prep_enter_smm = svm_prep_enter_smm,
+	.post_leave_smm = svm_post_leave_smm,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c6ef2940119b..d56a528ae15f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11629,6 +11629,18 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
 			~FEATURE_CONTROL_LMCE;
 }
 
+static int vmx_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
+{
+	/* TODO: Implement */
+	return 0;
+}
+
+static int vmx_post_leave_smm(struct kvm_vcpu *vcpu)
+{
+	/* TODO: Implement */
+	return 0;
+}
+
 static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -11754,6 +11766,9 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 #endif
 
 	.setup_mce = vmx_setup_mce,
+
+	.prep_enter_smm = vmx_prep_enter_smm,
+	.post_leave_smm = vmx_post_leave_smm,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 272320eb328c..dad9110e5a1c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5254,7 +5254,11 @@ static unsigned emulator_get_hflags(struct x86_emulate_ctxt *ctxt)
 
 static void emulator_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned emul_flags)
 {
-	kvm_set_hflags(emul_to_vcpu(ctxt), emul_flags);
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+	if ((vcpu->arch.hflags & HF_SMM_MASK) && !(emul_flags & HF_SMM_MASK))
+		ctxt->left_smm = true;
+	kvm_set_hflags(vcpu, emul_flags);
 }
 
 static const struct x86_emulate_ops emulate_ops = {
@@ -5674,6 +5678,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		ctxt->have_exception = false;
 		ctxt->exception.vector = -1;
 		ctxt->perm_ok = false;
+		ctxt->left_smm = false;
 
 		ctxt->ud = emulation_type & EMULTYPE_TRAP_UD;
 
@@ -5761,6 +5766,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		if (!ctxt->have_exception ||
 		    exception_type(ctxt->exception.vector) == EXCPT_TRAP)
 			__kvm_set_rflags(vcpu, ctxt->eflags);
+		if (r == EMULATE_DONE && ctxt->left_smm)
+			kvm_x86_ops->post_leave_smm(vcpu);
 
 		/*
 		 * For STI, interrupts are shadowed; so KVM_REQ_EVENT will
@@ -6614,6 +6621,9 @@ static void enter_smm(struct kvm_vcpu *vcpu)
 	trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, true);
 	vcpu->arch.hflags |= HF_SMM_MASK;
 	memset(buf, 0, 512);
+
+	kvm_x86_ops->prep_enter_smm(vcpu, buf);
+
 	if (guest_cpuid_has_longmode(vcpu))
 		enter_smm_save_state_64(vcpu, buf);
 	else
-- 
2.13.5


* [PATCH v2 2/6] KVM: x86: introduce ISA specific smi_allowed callback
  2017-09-19 15:37 [PATCH v2 0/6] KVM: nested virt SMM fixes Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 1/6] KVM: x86: introduce ISA specific SMM entry/exit callbacks Ladi Prosek
@ 2017-09-19 15:37 ` Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode Ladi Prosek
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

Similar to NMI, there may be ISA specific reasons why an SMI cannot be
injected into the guest. This commit adds a new smi_allowed callback to
be implemented in following commits.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/svm.c              | 6 ++++++
 arch/x86/kvm/vmx.c              | 6 ++++++
 arch/x86/kvm/x86.c              | 2 +-
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 26acdb3c9a8f..9719347da965 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1059,6 +1059,7 @@ struct kvm_x86_ops {
 
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
 
+	int (*smi_allowed)(struct kvm_vcpu *vcpu);
 	int (*prep_enter_smm)(struct kvm_vcpu *vcpu, char *smstate);
 	int (*post_leave_smm)(struct kvm_vcpu *vcpu);
 };
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0c5a5999556b..f57dd362f1b7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5357,6 +5357,11 @@ static void svm_setup_mce(struct kvm_vcpu *vcpu)
 	vcpu->arch.mcg_cap &= 0x1ff;
 }
 
+static int svm_smi_allowed(struct kvm_vcpu *vcpu)
+{
+	return 1;
+}
+
 static int svm_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
 	/* TODO: Implement */
@@ -5480,6 +5485,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.update_pi_irte = svm_update_pi_irte,
 	.setup_mce = svm_setup_mce,
 
+	.smi_allowed = svm_smi_allowed,
 	.prep_enter_smm = svm_prep_enter_smm,
 	.post_leave_smm = svm_post_leave_smm,
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d56a528ae15f..8b2687ec8623 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11629,6 +11629,11 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
 			~FEATURE_CONTROL_LMCE;
 }
 
+static int vmx_smi_allowed(struct kvm_vcpu *vcpu)
+{
+	return 1;
+}
+
 static int vmx_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
 	/* TODO: Implement */
@@ -11767,6 +11772,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 
 	.setup_mce = vmx_setup_mce,
 
+	.smi_allowed = vmx_smi_allowed,
 	.prep_enter_smm = vmx_prep_enter_smm,
 	.post_leave_smm = vmx_post_leave_smm,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dad9110e5a1c..dad293ed2868 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6404,7 +6404,7 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
 	}
 
 	/* try to inject new event if pending */
-	if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
+	if (vcpu->arch.smi_pending && !is_smm(vcpu) && kvm_x86_ops->smi_allowed(vcpu)) {
 		vcpu->arch.smi_pending = false;
 		enter_smm(vcpu);
 	} else if (vcpu->arch.nmi_pending && kvm_x86_ops->nmi_allowed(vcpu)) {
-- 
2.13.5


* [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-19 15:37 [PATCH v2 0/6] KVM: nested virt SMM fixes Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 1/6] KVM: x86: introduce ISA specific SMM entry/exit callbacks Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 2/6] KVM: x86: introduce ISA specific smi_allowed callback Ladi Prosek
@ 2017-09-19 15:37 ` Ladi Prosek
  2017-09-20  7:53   ` Paolo Bonzini
  2017-09-19 15:37 ` [PATCH v2 4/6] KVM: nVMX: save nested EPT information in SMRAM state save map Ladi Prosek
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

Entering SMM while running in guest mode wasn't working very well because several
pieces of the vcpu state were left set up for nested operation.

Some of the issues observed:

* L1 was getting unexpected VM exits (using L1 interception controls but running
  in SMM execution environment)
* SMM handler couldn't write CR4 because of incorrect validity checks in
  vmx_set_cr4, predicated on nested.vmxon
* MMU was confused (walk_mmu was still set to nested_mmu)

The Intel SDM actually prescribes that the logical processor "leave VMX operation"
upon entering SMM in 34.14.1 Default Treatment of SMI Delivery. What we need to do
is basically get out of guest mode and set nested.vmxon to false for the duration
of SMM. All of this is completely transparent to L1, i.e. L1 is not given control
and no L1-observable state changes occur.

To avoid code duplication this commit takes advantage of the existing nested
vmexit and run functionality, perhaps at the cost of efficiency. To get out of
guest mode, nested_vmx_vmexit with exit_reason == -1 is called, a trick already
used in vmx_leave_nested. Re-entering is cleaner, using enter_vmx_non_root_mode.

This commit fixes running Windows Server 2016 with Hyper-V enabled in a VM with
OVMF firmware (OVMF_CODE-need-smm.fd).

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 arch/x86/kvm/vmx.c | 58 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 47 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8b2687ec8623..f37514f1d9cb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -481,6 +481,14 @@ struct nested_vmx {
 	u64 nested_vmx_cr4_fixed0;
 	u64 nested_vmx_cr4_fixed1;
 	u64 nested_vmx_vmcs_enum;
+
+	/* SMM related state */
+	struct {
+		/* in VMX operation on SMM entry? */
+		bool vmxon;
+		/* in guest mode on SMM entry? */
+		bool guest_mode;
+	} smm;
 };
 
 #define POSTED_INTR_ON  0
@@ -11112,8 +11120,11 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	WARN_ON_ONCE(vmx->nested.nested_run_pending);
 
 	leave_guest_mode(vcpu);
-	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
-		       exit_qualification);
+	if (exit_reason == -1)
+		sync_vmcs12(vcpu, vmcs12);
+	else
+		prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
+			       exit_qualification);
 
 	if (nested_vmx_store_msr(vcpu, vmcs12->vm_exit_msr_store_addr,
 				 vmcs12->vm_exit_msr_store_count))
@@ -11139,12 +11150,13 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
 	}
 
-	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
-				       vmcs12->exit_qualification,
-				       vmcs12->idt_vectoring_info_field,
-				       vmcs12->vm_exit_intr_info,
-				       vmcs12->vm_exit_intr_error_code,
-				       KVM_ISA_VMX);
+	if (exit_reason != -1)
+		trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
+					       vmcs12->exit_qualification,
+					       vmcs12->idt_vectoring_info_field,
+					       vmcs12->vm_exit_intr_info,
+					       vmcs12->vm_exit_intr_error_code,
+					       KVM_ISA_VMX);
 
 	vm_entry_controls_reset_shadow(vmx);
 	vm_exit_controls_reset_shadow(vmx);
@@ -11214,7 +11226,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 		nested_vmx_failValid(vcpu, vm_inst_error);
 	} else
 		nested_vmx_succeed(vcpu);
-	if (enable_shadow_vmcs)
+	if (enable_shadow_vmcs && exit_reason != -1)
 		vmx->nested.sync_shadow_vmcs = true;
 
 	/* in case we halted in L2 */
@@ -11631,18 +11643,42 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
 
 static int vmx_smi_allowed(struct kvm_vcpu *vcpu)
 {
+	/* we need a nested vmexit to enter SMM, postpone if run is pending */
+	if (to_vmx(vcpu)->nested.nested_run_pending)
+		return 0;
 	return 1;
 }
 
 static int vmx_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
-	/* TODO: Implement */
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	vmx->nested.smm.guest_mode = is_guest_mode(vcpu);
+	if (vmx->nested.smm.guest_mode)
+		nested_vmx_vmexit(vcpu, -1, 0, 0);
+
+	vmx->nested.smm.vmxon = vmx->nested.vmxon;
+	vmx->nested.vmxon = false;
 	return 0;
 }
 
 static int vmx_post_leave_smm(struct kvm_vcpu *vcpu)
 {
-	/* TODO: Implement */
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	int ret;
+
+	if (vmx->nested.smm.vmxon) {
+		vmx->nested.vmxon = true;
+		vmx->nested.smm.vmxon = false;
+	}
+
+	if (vmx->nested.smm.guest_mode) {
+		ret = enter_vmx_non_root_mode(vcpu, false);
+		if (ret)
+			return ret;
+
+		vmx->nested.smm.guest_mode = false;
+	}
 	return 0;
 }
 
-- 
2.13.5


* [PATCH v2 4/6] KVM: nVMX: save nested EPT information in SMRAM state save map
  2017-09-19 15:37 [PATCH v2 0/6] KVM: nested virt SMM fixes Ladi Prosek
                   ` (2 preceding siblings ...)
  2017-09-19 15:37 ` [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode Ladi Prosek
@ 2017-09-19 15:37 ` Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 5/6] KVM: nSVM: refactor nested_svm_vmrun Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 6/6] KVM: nSVM: fix SMI injection in guest mode Ladi Prosek
  5 siblings, 0 replies; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

According to Intel SDM 34.14.1 "Default Treatment of SMI Delivery", if the cpu
is in VMX non-root operation and EPT is enabled at the time of an SMI, this
fact is recorded in the SMRAM state save area.
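
As a side note for reviewers: the offsets below (7ED8H, 7EE0H) are relative to
SMBASE + 8000H, and the get_smstate/put_smstate helpers (moved to kvm_host.h by
this patch) subtract 0x7e00 to index into the 512-byte state save buffer. A
standalone illustration of that arithmetic (purely for clarity, not kernel code;
the EPT pointer value is made up):

/* smstate_demo.c - demo of the offset arithmetic used by the patch below */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define get_smstate(type, buf, offset) \
	(*(type *)((buf) + ((offset) - 0x7e00)))
#define put_smstate(type, buf, offset, val) \
	(*(type *)((buf) + ((offset) - 0x7e00)) = (val))

int main(void)
{
	/* models the 512-byte SMRAM state save buffer; 8-byte aligned so the
	 * u64 store below is well-defined */
	_Alignas(8) char buf[512];
	uint64_t eptp = 0x12345000ULL;	/* made-up EPT pointer value */

	memset(buf, 0, sizeof(buf));

	/* bit 0 at SMBASE + 8000H + 7EE0H: was in VMX non-root operation with EPT */
	put_smstate(uint32_t, buf, 0x7ee0, get_smstate(uint32_t, buf, 0x7ee0) | 1);
	/* 64-bit field at SMBASE + 8000H + 7ED8H: the EPT pointer */
	put_smstate(uint64_t, buf, 0x7ed8, eptp);

	printf("flags @ 7ee0h = %#x\n", get_smstate(uint32_t, buf, 0x7ee0));
	printf("eptp  @ 7ed8h = %#llx\n",
	       (unsigned long long)get_smstate(uint64_t, buf, 0x7ed8));
	return 0;
}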

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/vmx.c              | 18 +++++++++++++++++-
 arch/x86/kvm/x86.c              |  3 ---
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9719347da965..414d362d9fe9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1441,4 +1441,9 @@ static inline int kvm_cpu_get_apicid(int mps_cpu)
 #endif
 }
 
+#define get_smstate(type, buf, offset)                           \
+	*(type *)((buf) + (offset) - 0x7e00)
+#define put_smstate(type, buf, offset, val)                      \
+	*(type *)((buf) + (offset) - 0x7e00) = val
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f37514f1d9cb..6d1c8ffc02ee 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11652,10 +11652,26 @@ static int vmx_smi_allowed(struct kvm_vcpu *vcpu)
 static int vmx_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vmcs12 *vmcs12;
 
 	vmx->nested.smm.guest_mode = is_guest_mode(vcpu);
-	if (vmx->nested.smm.guest_mode)
+	if (vmx->nested.smm.guest_mode) {
+		vmcs12 = get_vmcs12(vcpu);
+		if (nested_cpu_has_ept(vmcs12)) {
+			/*
+			 * 34.14.1 Default Treatment of SMI Delivery
+			 * Bit 0 of the 32-bit field at offset SMBASE + 8000H + 7EE0H
+			 * indicates whether the cpu was in VMX non-root operation with
+			 * EPT enabled.
+			 * The 64-bit field at offset SMBASE + 8000H + 7ED8H holds the
+			 * value of the EPT pointer.
+			 */
+			put_smstate(u32, smstate, 0x7ee0,
+				    get_smstate(u32, smstate, 0x7ee0) | 1);
+			put_smstate(u64, smstate, 0x7ed8, vmcs12->ept_pointer);
+		}
 		nested_vmx_vmexit(vcpu, -1, 0, 0);
+	}
 
 	vmx->nested.smm.vmxon = vmx->nested.vmxon;
 	vmx->nested.vmxon = false;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dad293ed2868..99cd41f1b1ee 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6451,9 +6451,6 @@ static void process_nmi(struct kvm_vcpu *vcpu)
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 }
 
-#define put_smstate(type, buf, offset, val)			  \
-	*(type *)((buf) + (offset) - 0x7e00) = val
-
 static u32 enter_smm_get_segment_flags(struct kvm_segment *seg)
 {
 	u32 flags = 0;
-- 
2.13.5


* [PATCH v2 5/6] KVM: nSVM: refactor nested_svm_vmrun
  2017-09-19 15:37 [PATCH v2 0/6] KVM: nested virt SMM fixes Ladi Prosek
                   ` (3 preceding siblings ...)
  2017-09-19 15:37 ` [PATCH v2 4/6] KVM: nVMX: save nested EPT information in SMRAM state save map Ladi Prosek
@ 2017-09-19 15:37 ` Ladi Prosek
  2017-09-19 15:37 ` [PATCH v2 6/6] KVM: nSVM: fix SMI injection in guest mode Ladi Prosek
  5 siblings, 0 replies; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

Analogous to 858e25c06fb0 ("kvm: nVMX: Refactor nested_vmx_run()"), this commit splits
nested_svm_vmrun into two parts. The newly introduced enter_svm_guest_mode modifies the
vcpu state to transition from L1 to L2, while the code left in nested_svm_vmrun handles
the VMRUN instruction.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 arch/x86/kvm/svm.c | 132 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 69 insertions(+), 63 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f57dd362f1b7..45dcd29ff744 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2892,70 +2892,9 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
 	return true;
 }
 
-static bool nested_svm_vmrun(struct vcpu_svm *svm)
+static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
+				 struct vmcb *nested_vmcb, struct page *page)
 {
-	struct vmcb *nested_vmcb;
-	struct vmcb *hsave = svm->nested.hsave;
-	struct vmcb *vmcb = svm->vmcb;
-	struct page *page;
-	u64 vmcb_gpa;
-
-	vmcb_gpa = svm->vmcb->save.rax;
-
-	nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, &page);
-	if (!nested_vmcb)
-		return false;
-
-	if (!nested_vmcb_checks(nested_vmcb)) {
-		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
-		nested_vmcb->control.exit_code_hi = 0;
-		nested_vmcb->control.exit_info_1  = 0;
-		nested_vmcb->control.exit_info_2  = 0;
-
-		nested_svm_unmap(page);
-
-		return false;
-	}
-
-	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
-			       nested_vmcb->save.rip,
-			       nested_vmcb->control.int_ctl,
-			       nested_vmcb->control.event_inj,
-			       nested_vmcb->control.nested_ctl);
-
-	trace_kvm_nested_intercepts(nested_vmcb->control.intercept_cr & 0xffff,
-				    nested_vmcb->control.intercept_cr >> 16,
-				    nested_vmcb->control.intercept_exceptions,
-				    nested_vmcb->control.intercept);
-
-	/* Clear internal status */
-	kvm_clear_exception_queue(&svm->vcpu);
-	kvm_clear_interrupt_queue(&svm->vcpu);
-
-	/*
-	 * Save the old vmcb, so we don't need to pick what we save, but can
-	 * restore everything when a VMEXIT occurs
-	 */
-	hsave->save.es     = vmcb->save.es;
-	hsave->save.cs     = vmcb->save.cs;
-	hsave->save.ss     = vmcb->save.ss;
-	hsave->save.ds     = vmcb->save.ds;
-	hsave->save.gdtr   = vmcb->save.gdtr;
-	hsave->save.idtr   = vmcb->save.idtr;
-	hsave->save.efer   = svm->vcpu.arch.efer;
-	hsave->save.cr0    = kvm_read_cr0(&svm->vcpu);
-	hsave->save.cr4    = svm->vcpu.arch.cr4;
-	hsave->save.rflags = kvm_get_rflags(&svm->vcpu);
-	hsave->save.rip    = kvm_rip_read(&svm->vcpu);
-	hsave->save.rsp    = vmcb->save.rsp;
-	hsave->save.rax    = vmcb->save.rax;
-	if (npt_enabled)
-		hsave->save.cr3    = vmcb->save.cr3;
-	else
-		hsave->save.cr3    = kvm_read_cr3(&svm->vcpu);
-
-	copy_vmcb_control_area(hsave, vmcb);
-
 	if (kvm_get_rflags(&svm->vcpu) & X86_EFLAGS_IF)
 		svm->vcpu.arch.hflags |= HF_HIF_MASK;
 	else
@@ -3048,6 +2987,73 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	enable_gif(svm);
 
 	mark_all_dirty(svm->vmcb);
+}
+
+static bool nested_svm_vmrun(struct vcpu_svm *svm)
+{
+	struct vmcb *nested_vmcb;
+	struct vmcb *hsave = svm->nested.hsave;
+	struct vmcb *vmcb = svm->vmcb;
+	struct page *page;
+	u64 vmcb_gpa;
+
+	vmcb_gpa = svm->vmcb->save.rax;
+
+	nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, &page);
+	if (!nested_vmcb)
+		return false;
+
+	if (!nested_vmcb_checks(nested_vmcb)) {
+		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
+		nested_vmcb->control.exit_code_hi = 0;
+		nested_vmcb->control.exit_info_1  = 0;
+		nested_vmcb->control.exit_info_2  = 0;
+
+		nested_svm_unmap(page);
+
+		return false;
+	}
+
+	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
+			       nested_vmcb->save.rip,
+			       nested_vmcb->control.int_ctl,
+			       nested_vmcb->control.event_inj,
+			       nested_vmcb->control.nested_ctl);
+
+	trace_kvm_nested_intercepts(nested_vmcb->control.intercept_cr & 0xffff,
+				    nested_vmcb->control.intercept_cr >> 16,
+				    nested_vmcb->control.intercept_exceptions,
+				    nested_vmcb->control.intercept);
+
+	/* Clear internal status */
+	kvm_clear_exception_queue(&svm->vcpu);
+	kvm_clear_interrupt_queue(&svm->vcpu);
+
+	/*
+	 * Save the old vmcb, so we don't need to pick what we save, but can
+	 * restore everything when a VMEXIT occurs
+	 */
+	hsave->save.es     = vmcb->save.es;
+	hsave->save.cs     = vmcb->save.cs;
+	hsave->save.ss     = vmcb->save.ss;
+	hsave->save.ds     = vmcb->save.ds;
+	hsave->save.gdtr   = vmcb->save.gdtr;
+	hsave->save.idtr   = vmcb->save.idtr;
+	hsave->save.efer   = svm->vcpu.arch.efer;
+	hsave->save.cr0    = kvm_read_cr0(&svm->vcpu);
+	hsave->save.cr4    = svm->vcpu.arch.cr4;
+	hsave->save.rflags = kvm_get_rflags(&svm->vcpu);
+	hsave->save.rip    = kvm_rip_read(&svm->vcpu);
+	hsave->save.rsp    = vmcb->save.rsp;
+	hsave->save.rax    = vmcb->save.rax;
+	if (npt_enabled)
+		hsave->save.cr3    = vmcb->save.cr3;
+	else
+		hsave->save.cr3    = kvm_read_cr3(&svm->vcpu);
+
+	copy_vmcb_control_area(hsave, vmcb);
+
+	enter_svm_guest_mode(svm, vmcb_gpa, nested_vmcb, page);
 
 	return true;
 }
-- 
2.13.5


* [PATCH v2 6/6] KVM: nSVM: fix SMI injection in guest mode
  2017-09-19 15:37 [PATCH v2 0/6] KVM: nested virt SMM fixes Ladi Prosek
                   ` (4 preceding siblings ...)
  2017-09-19 15:37 ` [PATCH v2 5/6] KVM: nSVM: refactor nested_svm_vmrun Ladi Prosek
@ 2017-09-19 15:37 ` Ladi Prosek
  2017-09-20  7:41   ` Paolo Bonzini
  5 siblings, 1 reply; 15+ messages in thread
From: Ladi Prosek @ 2017-09-19 15:37 UTC (permalink / raw)
  To: kvm; +Cc: rkrcmar, pbonzini

Entering SMM while running in guest mode wasn't working very well because several
pieces of the vcpu state were left set up for nested operation.

Some of the issues observed:

* L1 was getting unexpected VM exits (using L1 interception controls but running
  in SMM execution environment)
* MMU was confused (walk_mmu was still set to nested_mmu)

The Intel SDM actually prescribes that the logical processor "leave VMX operation"
upon entering SMM in 34.14.1 Default Treatment of SMI Delivery. AMD doesn't seem to
document this but handling SMM in the same fashion is a safe bet. What we need
to do is basically get out of guest mode for the duration of SMM. All of this is
completely transparent to L1, i.e. L1 is not given control and no L1-observable
state changes occur.

To avoid code duplication this commit takes advantage of the existing nested
vmexit and run functionality, perhaps at the cost of efficiency. To get out of
guest mode, nested_svm_vmexit is called, unchanged. Re-entering is performed using
enter_svm_guest_mode.

This commit fixes running Windows Server 2016 with Hyper-V enabled in a VM with
OVMF firmware (OVMF_CODE-need-smm.fd).

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 arch/x86/kvm/svm.c | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 45dcd29ff744..f592f1c271fb 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -152,6 +152,14 @@ struct nested_state {
 
 	/* Nested Paging related state */
 	u64 nested_cr3;
+
+	/* SMM related state */
+	struct {
+		/* in guest mode on SMM entry? */
+		bool guest_mode;
+		/* current nested VMCB on SMM entry */
+		u64 vmcb;
+	} smm;
 };
 
 #define MSRPM_OFFSETS	16
@@ -5370,13 +5378,38 @@ static int svm_smi_allowed(struct kvm_vcpu *vcpu)
 
 static int svm_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 {
-	/* TODO: Implement */
+	struct vcpu_svm *svm = to_svm(vcpu);
+	int ret;
+
+	svm->nested.smm.guest_mode = is_guest_mode(vcpu);
+	if (svm->nested.smm.guest_mode) {
+		svm->nested.smm.vmcb = svm->nested.vmcb;
+
+		svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
+		svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
+		svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
+
+		ret = nested_svm_vmexit(svm);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
 static int svm_post_leave_smm(struct kvm_vcpu *vcpu)
 {
-	/* TODO: Implement */
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb *nested_vmcb;
+	struct page *page;
+
+	if (svm->nested.smm.guest_mode) {
+		nested_vmcb = nested_svm_map(svm, svm->nested.smm.vmcb, &page);
+		if (!nested_vmcb)
+			return 1;
+		enter_svm_guest_mode(svm, svm->nested.smm.vmcb, nested_vmcb, page);
+
+		svm->nested.smm.guest_mode = false;
+	}
 	return 0;
 }
 
-- 
2.13.5


* Re: [PATCH v2 6/6] KVM: nSVM: fix SMI injection in guest mode
  2017-09-19 15:37 ` [PATCH v2 6/6] KVM: nSVM: fix SMI injection in guest mode Ladi Prosek
@ 2017-09-20  7:41   ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-09-20  7:41 UTC (permalink / raw)
  To: Ladi Prosek, kvm; +Cc: rkrcmar

On 19/09/2017 17:37, Ladi Prosek wrote:
> Entering SMM while running in guest mode wasn't working very well because several
> pieces of the vcpu state were left set up for nested operation.
> 
> Some of the issues observed:
> 
> * L1 was getting unexpected VM exits (using L1 interception controls but running
>   in SMM execution environment)
> * MMU was confused (walk_mmu was still set to nested_mmu)
> 
> Intel SDM actually prescribes the logical processor to "leave VMX operation" upon
> entering SMM in 34.14.1 Default Treatment of SMI Delivery. AMD doesn't seem to
> document this but handling SMM in the same fashion is a safe bet. What we need
> to do is basically get out of guest mode for the duration of SMM. All this
> completely transparent to L1, i.e. L1 is not given control and no L1 observable
> state changes.

I agree with this; however, AMD does document some fields in the SMM
state save area that are related to SVM.  At least the following are
worth using, because they remove the need for svm->nested.smm:

- "SVM Guest" at 0xFED8h

- "SVM Guest VMCB Physical Address" at 0xFEE0h

Both are quadwords (8 bytes).
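
Assuming those offsets are relative to SMBASE (so they map to put_smstate
offsets 0x7ed8 and 0x7ee0), something along these lines should do; untested,
and my reading of the flag semantics is that nonzero means "SMI interrupted
a guest":

	/* untested sketch only */
	put_smstate(u64, smstate, 0x7ed8, 1);                /* "SVM Guest" */
	put_smstate(u64, smstate, 0x7ee0, svm->nested.vmcb); /* guest VMCB physical address */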

Paolo

> To avoid code duplication this commit takes advantage of the existing nested
> vmexit and run functionality, perhaps at the cost of efficiency. To get out of
> guest mode, nested_svm_vmexit is called, unchanged. Re-entering is performed using
> enter_svm_guest_mode.
> 
> This commit fixes running Windows Server 2016 with Hyper-V enabled in a VM with
> OVMF firmware (OVMF_CODE-need-smm.fd).
> 
> Signed-off-by: Ladi Prosek <lprosek@redhat.com>
> ---
>  arch/x86/kvm/svm.c | 37 +++++++++++++++++++++++++++++++++++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 45dcd29ff744..f592f1c271fb 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -152,6 +152,14 @@ struct nested_state {
>  
>  	/* Nested Paging related state */
>  	u64 nested_cr3;
> +
> +	/* SMM related state */
> +	struct {
> +		/* in guest mode on SMM entry? */
> +		bool guest_mode;
> +		/* current nested VMCB on SMM entry */
> +		u64 vmcb;
> +	} smm;
>  };
>  
>  #define MSRPM_OFFSETS	16
> @@ -5370,13 +5378,38 @@ static int svm_smi_allowed(struct kvm_vcpu *vcpu)
>  
>  static int svm_prep_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>  {
> -	/* TODO: Implement */
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +	int ret;
> +
> +	svm->nested.smm.guest_mode = is_guest_mode(vcpu);
> +	if (svm->nested.smm.guest_mode) {
> +		svm->nested.smm.vmcb = svm->nested.vmcb;
> +
> +		svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
> +		svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
> +		svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
> +
> +		ret = nested_svm_vmexit(svm);
> +		if (ret)
> +			return ret;
> +	}
>  	return 0;
>  }
>  
>  static int svm_post_leave_smm(struct kvm_vcpu *vcpu)
>  {
> -	/* TODO: Implement */
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +	struct vmcb *nested_vmcb;
> +	struct page *page;
> +
> +	if (svm->nested.smm.guest_mode) {
> +		nested_vmcb = nested_svm_map(svm, svm->nested.smm.vmcb, &page);
> +		if (!nested_vmcb)
> +			return 1;
> +		enter_svm_guest_mode(svm, svm->nested.smm.vmcb, nested_vmcb, page);
> +
> +		svm->nested.smm.guest_mode = false;
> +	}
>  	return 0;
>  }
>  
> 


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-19 15:37 ` [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode Ladi Prosek
@ 2017-09-20  7:53   ` Paolo Bonzini
  2017-09-20  8:13     ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-09-20  7:53 UTC (permalink / raw)
  To: Ladi Prosek, kvm; +Cc: rkrcmar

On 19/09/2017 17:37, Ladi Prosek wrote:
> +		if (nested_cpu_has_ept(vmcs12)) {
> +			/*
> +			 * 34.14.1 Default Treatment of SMI Delivery
> +			 * Bit 0 of the 32-bit field at offset SMBASE + 8000H + 7EE0H
> +			 * indicates whether the cpu was in VMX non-root operation with
> +			 * EPT enabled.
> +			 * The 64-bit field at offset SMBASE + 8000H + 7ED8H holds the
> +			 * value of the EPT pointer.
> +			 */
> +			put_smstate(u32, smstate, 0x7ee0,
> +				    get_smstate(u32, smstate, 0x7ee0) | 1);

This is zero, so it should be enough to just write 1 here.

But, it is not clear to me why this is needed.  Since we use the AMD
format for the SMM state save area anyway, I'm inclined to omit it...

Paolo

> +			put_smstate(u64, smstate, 0x7ed8, vmcs12->ept_pointer);
> +		}


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-20  7:53   ` Paolo Bonzini
@ 2017-09-20  8:13     ` Paolo Bonzini
  2017-09-20  8:52       ` Ladi Prosek
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-09-20  8:13 UTC (permalink / raw)
  To: Ladi Prosek, kvm; +Cc: rkrcmar

On 20/09/2017 09:53, Paolo Bonzini wrote:
> On 19/09/2017 17:37, Ladi Prosek wrote:
>> +		if (nested_cpu_has_ept(vmcs12)) {
>> +			/*
>> +			 * 34.14.1 Default Treatment of SMI Delivery
>> +			 * Bit 0 of the 32-bit field at offset SMBASE + 8000H + 7EE0H
>> +			 * indicates whether the cpu was in VMX non-root operation with
>> +			 * EPT enabled.
>> +			 * The 64-bit field at offset SMBASE + 8000H + 7ED8H holds the
>> +			 * value of the EPT pointer.
>> +			 */
>> +			put_smstate(u32, smstate, 0x7ee0,
>> +				    get_smstate(u32, smstate, 0x7ee0) | 1);
> 
> This is zero, so it should be enough to just write 1 here.
> 
> But, it is not clear to me why this is needed.  Since we use the AMD
> format for the SMM state save area anyway, I'm inclined to omit it...

To clarify more what I mean, my guess is that the EPT state is meant for
SMI state save handlers that want to emulate e.g. REP INS or REP OUTS,
and therefore need to walk the caller's page tables.  We do not have
such magic SMI state save handlers because we don't use SMM for crazy
things such as emulating the 8042 on top of a USB keyboard.  Therefore,
since we document our SMM state save map version as the AMD format, we
can omit this information.

I probably should also clarify my different opinion on vmx->nested.smm
versus svm->nested.smm.  AMD lets you use SVM in system management mode,
without all the complication of dual-monitor treatment and the like.
They do it simply by storing the SVM state (which is not much) in the
state save map.

On the other hand, Intel's VMX state (VMXON, VMXON area, current VMCS
pointer) is inaccessible to SMM under the default treatment of SMIs.
That's because CR4.VMXE is reserved for SMM under default treatment (KVM
currently does not implement this and it should be changed---not
necessarily by you, though I won't complain if it is part of v3 :)).
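
Concretely, I'm thinking of something along these lines (untested sketch,
wherever the CR4 checks end up living):

	/* untested: CR4.VMXE is reserved under the default treatment of SMIs */
	if (is_smm(vcpu) && (cr4 & X86_CR4_VMXE))
		return 1;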

Therefore, storing such hidden state or a subset of it in
vmx->nested.smm is merely a convenience that lets you keep
vmx->nested.vmxon == false while in SMM.  This is different from nSVM,
and I think it's an acceptable difference for the sake of convenience.

Paolo


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-20  8:13     ` Paolo Bonzini
@ 2017-09-20  8:52       ` Ladi Prosek
  2017-09-20 10:14         ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Ladi Prosek @ 2017-09-20  8:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: KVM list, Radim Krcmar

On Wed, Sep 20, 2017 at 10:13 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 20/09/2017 09:53, Paolo Bonzini wrote:
>> On 19/09/2017 17:37, Ladi Prosek wrote:
>>> +            if (nested_cpu_has_ept(vmcs12)) {
>>> +                    /*
>>> +                     * 34.14.1 Default Treatment of SMI Delivery
>>> +                     * Bit 0 of the 32-bit field at offset SMBASE + 8000H + 7EE0H
>>> +                     * indicates whether the cpu was in VMX non-root operation with
>>> +                     * EPT enabled.
>>> +                     * The 64-bit field at offset SMBASE + 8000H + 7ED8H holds the
>>> +                     * value of the EPT pointer.
>>> +                     */
>>> +                    put_smstate(u32, smstate, 0x7ee0,
>>> +                                get_smstate(u32, smstate, 0x7ee0) | 1);
>>
>> This is zero, so it should be enough to just write 1 here.
>>
>> But, it is not clear to me why this is needed.  Since we use the AMD
>> format for the SMM state save area anyway, I'm inclined to omit it...
>
> To clarify more what I mean, my guess is that the EPT state is meant for
> SMI state save handlers that want to emulate e.g. REP INS or REP OUTS,
> and therefore need to walk the caller's page tables.  We do not have
> such magic SMI state save handlers because we don't use SMM for crazy
> things such as emulating the 8042 on top of a USB keyboard.  Therefore,
> since we document our SMM state save map version as the AMD format, we
> can omit this information.

Sure, this was just for completeness since I noticed it in the manual.
Can be omitted.

> I probably should also clarify my different opinion on vmx->nested.smm
> versus svm->nested.smm.  AMD lets you use SVM in system management mode,
> without all the complication of dual-monitor treatment and the like.
> They do it simply by storing the SVM state (which is not much) in the
> state save map.
>
> On the other hand, Intel's VMX state (VMXON, VMXON area, current VMCS
> pointer) is inaccessible to SMM under the default treatment of SMIs.
> That's because CR4.VMXE is reserved for SMM under default treatment (KVM
> currenty does not implement this and it should be changed---not
> necessarily by you, though I won't complain if it is part of v3 :)).

Will do :)

> Therefore, storing such hidden state or a subset of it in
> vmx->nested.smm is merely a convenience that lets you keep
> vmx->nested.vmxon == false while in SMM.  This is different from nSVM,
> and I think it's an acceptable difference for the sake of convenience.

So for nSVM, you'd save to and restore from the state save area? I can
certainly do that but it would be hard to guarantee that KVM won't
blow up if the SMM handler pokes at these fields. It feels safer to
keep it internal or just save, not restore from it. I know that this
comment belongs to the nSVM patch but it looks similar to your
suggestion to protect CR4.VMXE on Intel. If we want to prevent SMM
from wreaking havoc by trying to enable VMX, shouldn't we use the same
measures on AMD and prevent it from switching VMCB while the nested
guest is running? Thanks!


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-20  8:52       ` Ladi Prosek
@ 2017-09-20 10:14         ` Paolo Bonzini
  2017-09-20 13:05           ` Ladi Prosek
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-09-20 10:14 UTC (permalink / raw)
  To: Ladi Prosek; +Cc: KVM list, Radim Krcmar

On 20/09/2017 10:52, Ladi Prosek wrote:
>> Therefore, storing such hidden state or a subset of it in
>> vmx->nested.smm is merely a convenience that lets you keep
>> vmx->nested.vmxon == false while in SMM.  This is different from nSVM,
>> and I think it's an acceptable difference for the sake of convenience.
> So for nSVM, you'd save to and restore from the state save area? I can
> certainly do that but it would be hard to guarantee that KVM won't
> blow up if the SMM handler pokes at these fields. It feels safer to
> keep it internal or just save, not restore from it. I know that this
> comment belongs to the nSVM patch but it looks similar to your
> suggestion to protect CR4.VMXE on Intel. If we want to prevent SMM
> from wreaking havoc by trying to enable VMX, shouldn't we use the same
> measures on AMD and prevent it from switching VMCB while the nested
> guest is running. Thanks!

The difference is that on AMD you're actually allowed to do that.  The
SMM handler can also poke the general purpose register or even the
descriptor cache, so I think we should stay close to what the manual
says---even if it does two opposite things on Intel vs. AMD.

Thanks!

Paolo


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-20 10:14         ` Paolo Bonzini
@ 2017-09-20 13:05           ` Ladi Prosek
  2017-09-20 17:02             ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Ladi Prosek @ 2017-09-20 13:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: KVM list, Radim Krcmar

On Wed, Sep 20, 2017 at 12:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 20/09/2017 10:52, Ladi Prosek wrote:
>>> Therefore, storing such hidden state or a subset of it in
>>> vmx->nested.smm is merely a convenience that lets you keep
>>> vmx->nested.vmxon == false while in SMM.  This is different from nSVM,
>>> and I think it's an acceptable difference for the sake of convenience.
>> So for nSVM, you'd save to and restore from the state save area? I can
>> certainly do that but it would be hard to guarantee that KVM won't
>> blow up if the SMM handler pokes at these fields. It feels safer to
>> keep it internal or just save, not restore from it. I know that this
>> comment belongs to the nSVM patch but it looks similar to your
>> suggestion to protect CR4.VMXE on Intel. If we want to prevent SMM
>> from wreaking havoc by trying to enable VMX, shouldn't we use the same
>> measures on AMD and prevent it from switching VMCB while the nested
>> guest is running. Thanks!
>
> The difference is that on AMD you're actually allowed to do that.  The
> SMM handler can also poke the general purpose register or even the
> descriptor cache, so I think we should stay close to what the manual
> says---even if it does two opposite things on Intel vs. AMD.

The AMD manual seems to list the SVM fields as Read-Only.

"Software should not modify offsets specified as read-only or
reserved, otherwise unpredictable results can occur."

Would this convince you to keep patch 6 as is? Saving and restoring
from the save state area eliminates a bit of internal state but it
adds complexity to other places (need to remember smbase before em_rsm
overwrites it, for example). It's also a bit faster to restore from
internal than from guest memory and it would match VMX.


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-20 13:05           ` Ladi Prosek
@ 2017-09-20 17:02             ` Paolo Bonzini
  2017-09-21  6:51               ` Ladi Prosek
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-09-20 17:02 UTC (permalink / raw)
  To: Ladi Prosek; +Cc: KVM list, Radim Krcmar

On 20/09/2017 15:05, Ladi Prosek wrote:
> On Wed, Sep 20, 2017 at 12:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 20/09/2017 10:52, Ladi Prosek wrote:
>>>> Therefore, storing such hidden state or a subset of it in
>>>> vmx->nested.smm is merely a convenience that lets you keep
>>>> vmx->nested.vmxon == false while in SMM.  This is different from nSVM,
>>>> and I think it's an acceptable difference for the sake of convenience.
>>> So for nSVM, you'd save to and restore from the state save area? I can
>>> certainly do that but it would be hard to guarantee that KVM won't
>>> blow up if the SMM handler pokes at these fields. It feels safer to
>>> keep it internal or just save, not restore from it. I know that this
>>> comment belongs to the nSVM patch but it looks similar to your
>>> suggestion to protect CR4.VMXE on Intel. If we want to prevent SMM
>>> from wreaking havoc by trying to enable VMX, shouldn't we use the same
>>> measures on AMD and prevent it from switching VMCB while the nested
>>> guest is running. Thanks!
>>
>> The difference is that on AMD you're actually allowed to do that.  The
>> SMM handler can also poke the general purpose register or even the
>> descriptor cache, so I think we should stay close to what the manual
>> says---even if it does two opposite things on Intel vs. AMD.
> 
> The AMD manual seems to list the SVM fields as Read-Only.
> "Software should not modify offsets specified as read-only or
> reserved, otherwise unpredictable results can occur."

In practice, that means "otherwise you'll get exactly what you think
you'll get, with possibly entertaining effects".  However, this is not
relevant: the point is that on AMD the SMM handler can do VMRUN, while
on VMX it cannot even do VMXON.

So on AMD the handler can change the active VMCB (via VMRUN, which
writes to svm->nested.vmcb).  On RSM however you must get back to L2
using the VMCB stored in the SMM state save area.  This is exactly what
patch 6 does but, because AMD doesn't have hidden registers outside
guest mode, I think restoring from the SMM state save area is more in
line with the spirit of AMD's extensions.

> Would this convince you to keep patch 6 as is? Saving and restoring
> from the save state area eliminates a bit of internal state but it
> adds complexity to other places (need to remember smbase before em_rsm
> overwrites it, for example). It's also a bit faster to restore from
> internal than from guest memory and it would match VMX.

What if you instead reached the kvm_x86_ops post_leave_smm callback from
rsm_load_state_64 (via a new emulator op)?

Paolo


* Re: [PATCH v2 3/6] KVM: nVMX: fix SMI injection in guest mode
  2017-09-20 17:02             ` Paolo Bonzini
@ 2017-09-21  6:51               ` Ladi Prosek
  0 siblings, 0 replies; 15+ messages in thread
From: Ladi Prosek @ 2017-09-21  6:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: KVM list, Radim Krcmar

On Wed, Sep 20, 2017 at 7:02 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 20/09/2017 15:05, Ladi Prosek wrote:
>> On Wed, Sep 20, 2017 at 12:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 20/09/2017 10:52, Ladi Prosek wrote:
>>>>> Therefore, storing such hidden state or a subset of it in
>>>>> vmx->nested.smm is merely a convenience that lets you keep
>>>>> vmx->nested.vmxon == false while in SMM.  This is different from nSVM,
>>>>> and I think it's an acceptable difference for the sake of convenience.
>>>> So for nSVM, you'd save to and restore from the state save area? I can
>>>> certainly do that but it would be hard to guarantee that KVM won't
>>>> blow up if the SMM handler pokes at these fields. It feels safer to
>>>> keep it internal or just save, not restore from it. I know that this
>>>> comment belongs to the nSVM patch but it looks similar to your
>>>> suggestion to protect CR4.VMXE on Intel. If we want to prevent SMM
>>>> from wreaking havoc by trying to enable VMX, shouldn't we use the same
>>>> measures on AMD and prevent it from switching VMCB while the nested
>>>> guest is running. Thanks!
>>>
>>> The difference is that on AMD you're actually allowed to do that.  The
>>> SMM handler can also poke the general purpose register or even the
>>> descriptor cache, so I think we should stay close to what the manual
>>> says---even if it does two opposite things on Intel vs. AMD.
>>
>> The AMD manual seems to list the SVM fields as Read-Only.
>> "Software should not modify offsets specified as read-only or
>> reserved, otherwise unpredictable results can occur."
>
> In practice, that means "otherwise you'll get exactly what you think
> you'll get, with possibly entertaining effects".  However, this is not
> relevant: the point is that on AMD the SMM handler can do VMRUN, while
> on VMX it cannot even do VMXON.

Understood.

> So on AMD the handler can change the active VMCB (via VMRUN, which
> writes to svm->nested.vmcb).  On RSM however you must get back to L2
> using the VMCB stored in the SMM state save area.  This is exactly what
> patch 6 does but, because AMD doesn't have hidden registers outside
> guest mode, I think restoring from the SMM state save area is more in
> line with the spirit of AMD's extensions.

Agreed that it's more in line with the spirit. In general, I would
lean toward being defensive, in that if supporting a state transition
(like changing the current VMCB from an SMM handler) is not mandated by
the CPU spec, I wouldn't add it: less attack surface, less testing,
less maintenance, etc.

>> Would this convince you to keep patch 6 as is? Saving and restoring
>> from the save state area eliminates a bit of internal state but it
>> adds complexity to other places (need to remember smbase before em_rsm
>> overwrites it, for example). It's also a bit faster to restore from
>> internal than from guest memory and it would match VMX.
>
> What if you instead reached the kvm_x86_ops post_leave_smm callback from
> rsm_load_state_64 (via a new emulator op)?

That's too early for the post_leave_smm callback to run, unfortunately.
I'll either split post_leave_smm into two parts, one running within
rsm_load_state_64 and the other at the end of emulation, or I'll just
keep a bit of state in x86_emulate_ctxt and keep one callback. Thanks!

