* [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work
@ 2013-03-24 18:44 Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 1/5] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-03-24 18:44 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

This version addresses the comment on patch 2, simplifying it
significantly by dropping everything that assumed an L2 vmentry on
vmlaunch/vmresume could be canceled by an emulated vmexit to L1.

Jan Kiszka (5):
  KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to
    L1
  KVM: nVMX: Rework event injection and recovery
  KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_mask
  KVM: nVMX: Fix conditions for interrupt injection
  KVM: nVMX: Fix conditions for NMI injection

 arch/x86/kvm/vmx.c |  182 +++++++++++++++++++++++++++++++++++----------------
 1 files changed, 125 insertions(+), 57 deletions(-)

-- 
1.7.3.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/5] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1
  2013-03-24 18:44 [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
@ 2013-03-24 18:44 ` Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery Jan Kiszka
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-03-24 18:44 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

From: Jan Kiszka <jan.kiszka@siemens.com>

Check whether an interrupt or NMI window exit is meant for L1 by testing
if L1 has the corresponding controls enabled. This is required when we
allow direct injection from L0 to L2.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/vmx.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 03f5746..8827b3b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6112,14 +6112,9 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 	case EXIT_REASON_TRIPLE_FAULT:
 		return 1;
 	case EXIT_REASON_PENDING_INTERRUPT:
+		return nested_cpu_has(vmcs12, CPU_BASED_VIRTUAL_INTR_PENDING);
 	case EXIT_REASON_NMI_WINDOW:
-		/*
-		 * prepare_vmcs02() set the CPU_BASED_VIRTUAL_INTR_PENDING bit
-		 * (aka Interrupt Window Exiting) only when L1 turned it on,
-		 * so if we got a PENDING_INTERRUPT exit, this must be for L1.
-		 * Same for NMI Window Exiting.
-		 */
-		return 1;
+		return nested_cpu_has(vmcs12, CPU_BASED_VIRTUAL_NMI_PENDING);
 	case EXIT_REASON_TASK_SWITCH:
 		return 1;
 	case EXIT_REASON_CPUID:
-- 
1.7.3.4



* [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery
  2013-03-24 18:44 [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 1/5] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
@ 2013-03-24 18:44 ` Jan Kiszka
  2013-04-10 13:42   ` Gleb Natapov
  2013-04-11 11:22   ` Gleb Natapov
  2013-03-24 18:44 ` [PATCH v3 3/5] KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_mask Jan Kiszka
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-03-24 18:44 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

From: Jan Kiszka <jan.kiszka@siemens.com>

The basic idea is to always transfer the pending event injection on
vmexit into the architectural state of the VCPU and then drop it from
there if it turns out that we left L2 to enter L1, i.e. if we enter
prepare_vmcs12.

vmcs12_save_pending_event takes care to transfer pending L0 events into
the queue of L1. That is mandatory as L1 may decide to switch the guest
state completely, invalidating or preserving the pending events for
later injection (including on a different node, once we support
migration).

This concept is based on the rule that a pending vmlaunch/vmresume is
not canceled. Otherwise, we would risk losing injected events or leaking
them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the
entry of nested_vmx_vmexit.
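
As a side note for readers, the idt_vectoring_info_field encoding that
vmcs12_save_pending_event fills in can be illustrated with a short,
self-contained sketch. It assumes the standard VMX event-information layout
(vector in bits 7:0, event type in bits 10:8, error-code-valid in bit 11,
valid in bit 31); the constants mirror the kernel's names, but
encode_hard_exception is a hypothetical helper written for illustration, not
code from the patch.

```c
#include <stdint.h>

/* VMX interruption/vectoring-info layout, restated from the architecture
 * manual; constant names follow the kernel's conventions. */
#define INTR_TYPE_NMI_INTR                (2u << 8)  /* bits 10:8: event type */
#define INTR_TYPE_HARD_EXCEPTION          (3u << 8)
#define VECTORING_INFO_DELIVER_CODE_MASK  (1u << 11) /* error code is valid */
#define VECTORING_INFO_VALID_MASK         (1u << 31) /* field holds an event */

/* Hypothetical helper: build the field for a pending hardware exception,
 * as the exception.pending branch of vmcs12_save_pending_event does. */
uint32_t encode_hard_exception(uint8_t vector, int has_error_code)
{
	uint32_t info = vector | INTR_TYPE_HARD_EXCEPTION |
			VECTORING_INFO_VALID_MASK;

	if (has_error_code)
		info |= VECTORING_INFO_DELIVER_CODE_MASK;
	return info;
}
```

With this layout, a page fault (vector 14) carrying an error code encodes as
0x80000b0e: valid bit, hard-exception type, error-code-valid bit, vector 14.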

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |   90 +++++++++++++++++++++++++++++++++------------------
 1 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8827b3b..9d9ff74 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6493,8 +6493,6 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
 
 static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 {
-	if (is_guest_mode(&vmx->vcpu))
-		return;
 	__vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
 				  VM_EXIT_INSTRUCTION_LEN,
 				  IDT_VECTORING_ERROR_CODE);
@@ -6502,8 +6500,6 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 
 static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu))
-		return;
 	__vmx_complete_interrupts(vcpu,
 				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
 				  VM_ENTRY_INSTRUCTION_LEN,
@@ -6535,21 +6531,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned long debugctlmsr;
 
-	if (is_guest_mode(vcpu) && !vmx->nested.nested_run_pending) {
-		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		if (vmcs12->idt_vectoring_info_field &
-				VECTORING_INFO_VALID_MASK) {
-			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-				vmcs12->idt_vectoring_info_field);
-			vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
-				vmcs12->vm_exit_instruction_len);
-			if (vmcs12->idt_vectoring_info_field &
-					VECTORING_INFO_DELIVER_CODE_MASK)
-				vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
-					vmcs12->idt_vectoring_error_code);
-		}
-	}
-
 	/* Record the guest's net vcpu time for enforced NMI injections. */
 	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
 		vmx->entry_time = ktime_get();
@@ -6708,17 +6689,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	if (is_guest_mode(vcpu)) {
-		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		vmcs12->idt_vectoring_info_field = vmx->idt_vectoring_info;
-		if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
-			vmcs12->idt_vectoring_error_code =
-				vmcs_read32(IDT_VECTORING_ERROR_CODE);
-			vmcs12->vm_exit_instruction_len =
-				vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
-		}
-	}
-
 	vmx->loaded_vmcs->launched = 1;
 
 	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
@@ -7325,6 +7295,48 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 			vcpu->arch.cr4_guest_owned_bits));
 }
 
+static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
+				       struct vmcs12 *vmcs12)
+{
+	u32 idt_vectoring;
+	unsigned int nr;
+
+	if (vcpu->arch.exception.pending) {
+		nr = vcpu->arch.exception.nr;
+		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
+
+		if (kvm_exception_is_soft(nr)) {
+			vmcs12->vm_exit_instruction_len =
+				vcpu->arch.event_exit_inst_len;
+			idt_vectoring |= INTR_TYPE_SOFT_EXCEPTION;
+		} else
+			idt_vectoring |= INTR_TYPE_HARD_EXCEPTION;
+
+		if (vcpu->arch.exception.has_error_code) {
+			idt_vectoring |= VECTORING_INFO_DELIVER_CODE_MASK;
+			vmcs12->idt_vectoring_error_code =
+				vcpu->arch.exception.error_code;
+		}
+
+		vmcs12->idt_vectoring_info_field = idt_vectoring;
+	} else if (vcpu->arch.nmi_pending) {
+		vmcs12->idt_vectoring_info_field =
+			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR;
+	} else if (vcpu->arch.interrupt.pending) {
+		nr = vcpu->arch.interrupt.nr;
+		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
+
+		if (vcpu->arch.interrupt.soft) {
+			idt_vectoring |= INTR_TYPE_SOFT_INTR;
+			vmcs12->vm_entry_instruction_len =
+				vcpu->arch.event_exit_inst_len;
+		} else
+			idt_vectoring |= INTR_TYPE_EXT_INTR;
+
+		vmcs12->idt_vectoring_info_field = idt_vectoring;
+	}
+}
+
 /*
  * prepare_vmcs12 is part of what we need to do when the nested L2 guest exits
  * and we want to prepare to run its L1 parent. L1 keeps a vmcs for L2 (vmcs12),
@@ -7416,9 +7428,20 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
 	vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
 
-	/* clear vm-entry fields which are to be cleared on exit */
 	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
-		vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
+		/*
+		 * Transfer the event that L0 or L1 may have wanted to inject into
+		 * L2 to IDT_VECTORING_INFO_FIELD.
+		 */
+		vmcs12_save_pending_event(vcpu, vmcs12);
+
+	/*
+	 * Drop what we picked up for L2 via vmx_complete_interrupts. It is
+	 * preserved above and would only end up incorrectly in L1.
+	 */
+	vcpu->arch.nmi_injected = false;
+	kvm_clear_exception_queue(vcpu);
+	kvm_clear_interrupt_queue(vcpu);
 }
 
 /*
@@ -7518,6 +7541,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu)
 	int cpu;
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 
+	/* trying to cancel vmlaunch/vmresume is a bug */
+	WARN_ON_ONCE(vmx->nested.nested_run_pending);
+
 	leave_guest_mode(vcpu);
 	prepare_vmcs12(vcpu, vmcs12);
 
-- 
1.7.3.4



* [PATCH v3 3/5] KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_mask
  2013-03-24 18:44 [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 1/5] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery Jan Kiszka
@ 2013-03-24 18:44 ` Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection Jan Kiszka
  2013-03-24 18:44 ` [PATCH v3 5/5] KVM: nVMX: Fix conditions for NMI injection Jan Kiszka
  4 siblings, 0 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-03-24 18:44 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

From: Jan Kiszka <jan.kiszka@siemens.com>

vmx_set_nmi_mask will soon be used by vmx_nmi_allowed. No functional
changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9d9ff74..d1bc834 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4284,16 +4284,6 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
 }
 
-static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
-{
-	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
-		return 0;
-
-	return	!(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
-		  (GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI
-		   | GUEST_INTR_STATE_NMI));
-}
-
 static bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu)
 {
 	if (!cpu_has_virtual_nmis())
@@ -4323,6 +4313,16 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 	}
 }
 
+static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
+{
+	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
+		return 0;
+
+	return	!(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
+		  (GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI
+		   | GUEST_INTR_STATE_NMI));
+}
+
 static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
 	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
-- 
1.7.3.4



* [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection
  2013-03-24 18:44 [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
                   ` (2 preceding siblings ...)
  2013-03-24 18:44 ` [PATCH v3 3/5] KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_mask Jan Kiszka
@ 2013-03-24 18:44 ` Jan Kiszka
  2013-04-11 11:20   ` Gleb Natapov
  2013-03-24 18:44 ` [PATCH v3 5/5] KVM: nVMX: Fix conditions for NMI injection Jan Kiszka
  4 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2013-03-24 18:44 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

From: Jan Kiszka <jan.kiszka@siemens.com>

If we are in guest mode, L0 can only inject events into L2 if L1 has
nothing pending. Otherwise, L0 would overwrite L1's events and they
would get lost. But even if no injection from L1 is pending, we do not
want L0 to unnecessarily interrupt an ongoing vmentry, with all its side
effects on the vmcs. Therefore, injection shall be disallowed during
L1->L2 transitions. This check is conceptually independent of
nested_exit_on_intr.

If L1 traps external interrupts, then we also need to look at L1's
idt_vectoring_info_field. If it is empty, we can kick the guest from L2
to L1, just as the previous code did.
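
The decision sequence described above can be sketched with plain flags in
place of the real KVM/VMX state. All names here are hypothetical (none exist
in the kernel), and the emulated vmexit to L1 with its side effects is
reduced to a comment; this only models the ordering of the checks.

```c
/*
 * Hypothetical sketch of the interrupt-injection rules, with KVM/VMX state
 * collapsed into plain flags.
 */
int l0_may_inject_irq(int in_guest_mode, int nested_run_pending,
		      int l1_exits_on_intr, int l1_event_pending)
{
	if (in_guest_mode) {
		/* Never disturb an ongoing L1->L2 vmentry. */
		if (nested_run_pending)
			return 0;
		if (l1_exits_on_intr) {
			/*
			 * idt_vectoring_info_field occupied: we cannot raise
			 * EXIT_REASON_EXTERNAL_INTERRUPT for L1 right now.
			 */
			if (l1_event_pending)
				return 0;
			/* Otherwise: emulated vmexit to L1, inject there. */
		}
	}
	return 1;	/* still subject to the usual RFLAGS.IF checks */
}
```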

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |   28 ++++++++++++++++++++--------
 1 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d1bc834..30aa198 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4325,16 +4325,28 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 
 static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
+	if (is_guest_mode(vcpu)) {
 		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		if (to_vmx(vcpu)->nested.nested_run_pending ||
-		    (vmcs12->idt_vectoring_info_field &
-		     VECTORING_INFO_VALID_MASK))
+
+		if (to_vmx(vcpu)->nested.nested_run_pending)
 			return 0;
-		nested_vmx_vmexit(vcpu);
-		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
-		vmcs12->vm_exit_intr_info = 0;
-		/* fall through to normal code, but now in L1, not L2 */
+		if (nested_exit_on_intr(vcpu)) {
+			/*
+			 * Check if the idt_vectoring_info_field is free. We
+			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
+			 * isn't.
+			 */
+			if (vmcs12->idt_vectoring_info_field &
+			    VECTORING_INFO_VALID_MASK)
+				return 0;
+			nested_vmx_vmexit(vcpu);
+			vmcs12->vm_exit_reason =
+				EXIT_REASON_EXTERNAL_INTERRUPT;
+			vmcs12->vm_exit_intr_info = 0;
+			/*
+			 * fall through to normal code, but now in L1, not L2
+			 */
+		}
 	}
 
 	return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
-- 
1.7.3.4



* [PATCH v3 5/5] KVM: nVMX: Fix conditions for NMI injection
  2013-03-24 18:44 [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
                   ` (3 preceding siblings ...)
  2013-03-24 18:44 ` [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection Jan Kiszka
@ 2013-03-24 18:44 ` Jan Kiszka
  4 siblings, 0 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-03-24 18:44 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

From: Jan Kiszka <jan.kiszka@siemens.com>

The logic for checking whether interrupts can be injected also has to be
applied to NMIs. The difference is that, if NMI interception is on, these
events are consumed and blocked by the VM exit.
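
The asymmetry with the interrupt case can be modeled minimally (hypothetical
types, not kernel code): an NMI-intercepting vmexit to L1 delivers the event
itself, so the pending NMI is consumed and further NMIs stay blocked until L1
unmasks them.

```c
/* Hypothetical model of the NMI case: the EXCEPTION_NMI vmexit counts as
 * the injection, unlike an external-interrupt exit, which leaves the
 * interrupt pending for L1 to handle. */
struct nmi_model {
	int nmi_pending;
	int nmi_masked;
};

void take_nmi_vmexit_to_l1(struct nmi_model *m)
{
	m->nmi_pending = 0;	/* the vmexit delivered the event to L1 */
	m->nmi_masked = 1;	/* block further NMIs, as vmx_set_nmi_mask does */
}
```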

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 30aa198..c01d487 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4190,6 +4190,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
 		PIN_BASED_EXT_INTR_MASK;
 }
 
+static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
+{
+	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
+		PIN_BASED_NMI_EXITING;
+}
+
 static void enable_irq_window(struct kvm_vcpu *vcpu)
 {
 	u32 cpu_based_vm_exec_control;
@@ -4315,6 +4321,35 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 
 static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 {
+	if (is_guest_mode(vcpu)) {
+		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+
+		if (to_vmx(vcpu)->nested.nested_run_pending ||
+		    vmcs_read32(GUEST_ACTIVITY_STATE) ==
+			   GUEST_ACTIVITY_WAIT_SIPI)
+			return 0;
+		if (nested_exit_on_nmi(vcpu)) {
+			/*
+			 * Check if the idt_vectoring_info_field is free. We
+			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
+			 */
+			if (vmcs12->idt_vectoring_info_field &
+			    VECTORING_INFO_VALID_MASK)
+				return 0;
+			nested_vmx_vmexit(vcpu);
+			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
+			vmcs12->vm_exit_intr_info = NMI_VECTOR |
+				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
+			/*
+			 * The NMI-triggered VM exit counts as injection:
+			 * clear this one and block further NMIs.
+			 */
+			vcpu->arch.nmi_pending = 0;
+			vmx_set_nmi_mask(vcpu, true);
+			return 0;
+		}
+	}
+
 	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
 		return 0;
 
-- 
1.7.3.4



* Re: [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery
  2013-03-24 18:44 ` [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery Jan Kiszka
@ 2013-04-10 13:42   ` Gleb Natapov
  2013-04-10 13:49     ` Jan Kiszka
  2013-04-11 11:22   ` Gleb Natapov
  1 sibling, 1 reply; 13+ messages in thread
From: Gleb Natapov @ 2013-04-10 13:42 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Sun, Mar 24, 2013 at 07:44:45PM +0100, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> The basic idea is to always transfer the pending event injection on
> vmexit into the architectural state of the VCPU and then drop it from
> there if it turns out that we left L2 to enter L1, i.e. if we enter
> prepare_vmcs12.
> 
> vmcs12_save_pending_event takes care to transfer pending L0 events into
> the queue of L1. That is mandatory as L1 may decide to switch the guest
> state completely, invalidating or preserving the pending events for
> later injection (including on a different node, once we support
> migration).
> 
> This concept is based on the rule that a pending vmlaunch/vmresume is
> not canceled. Otherwise, we would risk losing injected events or leaking
> them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the
> entry of nested_vmx_vmexit.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/x86/kvm/vmx.c |   90 +++++++++++++++++++++++++++++++++------------------
>  1 files changed, 58 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 8827b3b..9d9ff74 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6493,8 +6493,6 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
>  
>  static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>  {
> -	if (is_guest_mode(&vmx->vcpu))
> -		return;
>  	__vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
>  				  VM_EXIT_INSTRUCTION_LEN,
>  				  IDT_VECTORING_ERROR_CODE);
> @@ -6502,8 +6500,6 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>  
>  static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
>  {
> -	if (is_guest_mode(vcpu))
> -		return;
>  	__vmx_complete_interrupts(vcpu,
>  				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
>  				  VM_ENTRY_INSTRUCTION_LEN,
> @@ -6535,21 +6531,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	unsigned long debugctlmsr;
>  
> -	if (is_guest_mode(vcpu) && !vmx->nested.nested_run_pending) {
> -		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		if (vmcs12->idt_vectoring_info_field &
> -				VECTORING_INFO_VALID_MASK) {
> -			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
> -				vmcs12->idt_vectoring_info_field);
> -			vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
> -				vmcs12->vm_exit_instruction_len);
> -			if (vmcs12->idt_vectoring_info_field &
> -					VECTORING_INFO_DELIVER_CODE_MASK)
> -				vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
> -					vmcs12->idt_vectoring_error_code);
> -		}
> -	}
> -
>  	/* Record the guest's net vcpu time for enforced NMI injections. */
>  	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
>  		vmx->entry_time = ktime_get();
> @@ -6708,17 +6689,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
>  
> -	if (is_guest_mode(vcpu)) {
> -		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		vmcs12->idt_vectoring_info_field = vmx->idt_vectoring_info;
> -		if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
> -			vmcs12->idt_vectoring_error_code =
> -				vmcs_read32(IDT_VECTORING_ERROR_CODE);
> -			vmcs12->vm_exit_instruction_len =
> -				vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
> -		}
> -	}
> -
>  	vmx->loaded_vmcs->launched = 1;
>  
>  	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
> @@ -7325,6 +7295,48 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  			vcpu->arch.cr4_guest_owned_bits));
>  }
>  
> +static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
> +				       struct vmcs12 *vmcs12)
> +{
> +	u32 idt_vectoring;
> +	unsigned int nr;
> +
> +	if (vcpu->arch.exception.pending) {
> +		nr = vcpu->arch.exception.nr;
> +		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
> +
> +		if (kvm_exception_is_soft(nr)) {
> +			vmcs12->vm_exit_instruction_len =
> +				vcpu->arch.event_exit_inst_len;
> +			idt_vectoring |= INTR_TYPE_SOFT_EXCEPTION;
> +		} else
> +			idt_vectoring |= INTR_TYPE_HARD_EXCEPTION;
> +
> +		if (vcpu->arch.exception.has_error_code) {
> +			idt_vectoring |= VECTORING_INFO_DELIVER_CODE_MASK;
> +			vmcs12->idt_vectoring_error_code =
> +				vcpu->arch.exception.error_code;
> +		}
> +
> +		vmcs12->idt_vectoring_info_field = idt_vectoring;
> +	} else if (vcpu->arch.nmi_pending) {
> +		vmcs12->idt_vectoring_info_field =
> +			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR;
> +	} else if (vcpu->arch.interrupt.pending) {
> +		nr = vcpu->arch.interrupt.nr;
> +		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
> +
> +		if (vcpu->arch.interrupt.soft) {
> +			idt_vectoring |= INTR_TYPE_SOFT_INTR;
> +			vmcs12->vm_entry_instruction_len =
> +				vcpu->arch.event_exit_inst_len;
> +		} else
> +			idt_vectoring |= INTR_TYPE_EXT_INTR;
> +
> +		vmcs12->idt_vectoring_info_field = idt_vectoring;
> +	}
> +}
> +
>  /*
>   * prepare_vmcs12 is part of what we need to do when the nested L2 guest exits
>   * and we want to prepare to run its L1 parent. L1 keeps a vmcs for L2 (vmcs12),
> @@ -7416,9 +7428,20 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
>  	vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
>  
> -	/* clear vm-entry fields which are to be cleared on exit */
>  	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
> -		vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
Why have you dropped this? Where is it cleared now?

> +		/*
> +		 * Transfer the event that L0 or L1 may have wanted to inject into
> +		 * L2 to IDT_VECTORING_INFO_FIELD.
> +		 */
> +		vmcs12_save_pending_event(vcpu, vmcs12);
> +
> +	/*
> +	 * Drop what we picked up for L2 via vmx_complete_interrupts. It is
> +	 * preserved above and would only end up incorrectly in L1.
> +	 */
> +	vcpu->arch.nmi_injected = false;
> +	kvm_clear_exception_queue(vcpu);
> +	kvm_clear_interrupt_queue(vcpu);
>  }
>  
>  /*
> @@ -7518,6 +7541,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu)
>  	int cpu;
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>  
> +	/* trying to cancel vmlaunch/vmresume is a bug */
> +	WARN_ON_ONCE(vmx->nested.nested_run_pending);
> +
>  	leave_guest_mode(vcpu);
>  	prepare_vmcs12(vcpu, vmcs12);
>  
> -- 
> 1.7.3.4

--
			Gleb.


* Re: [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery
  2013-04-10 13:42   ` Gleb Natapov
@ 2013-04-10 13:49     ` Jan Kiszka
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-04-10 13:49 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El


On 2013-04-10 15:42, Gleb Natapov wrote:
> On Sun, Mar 24, 2013 at 07:44:45PM +0100, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> The basic idea is to always transfer the pending event injection on
>> vmexit into the architectural state of the VCPU and then drop it from
>> there if it turns out that we left L2 to enter L1, i.e. if we enter
>> prepare_vmcs12.
>>
>> vmcs12_save_pending_event takes care to transfer pending L0 events into
>> the queue of L1. That is mandatory as L1 may decide to switch the guest
>> state completely, invalidating or preserving the pending events for
>> later injection (including on a different node, once we support
>> migration).
>>
>> This concept is based on the rule that a pending vmlaunch/vmresume is
>> not canceled. Otherwise, we would risk losing injected events or leaking
>> them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the
>> entry of nested_vmx_vmexit.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>  arch/x86/kvm/vmx.c |   90 +++++++++++++++++++++++++++++++++------------------
>>  1 files changed, 58 insertions(+), 32 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 8827b3b..9d9ff74 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6493,8 +6493,6 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
>>  
>>  static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>>  {
>> -	if (is_guest_mode(&vmx->vcpu))
>> -		return;
>>  	__vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
>>  				  VM_EXIT_INSTRUCTION_LEN,
>>  				  IDT_VECTORING_ERROR_CODE);
>> @@ -6502,8 +6500,6 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>>  
>>  static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
>>  {
>> -	if (is_guest_mode(vcpu))
>> -		return;
>>  	__vmx_complete_interrupts(vcpu,
>>  				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
>>  				  VM_ENTRY_INSTRUCTION_LEN,
>> @@ -6535,21 +6531,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>  	unsigned long debugctlmsr;
>>  
>> -	if (is_guest_mode(vcpu) && !vmx->nested.nested_run_pending) {
>> -		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> -		if (vmcs12->idt_vectoring_info_field &
>> -				VECTORING_INFO_VALID_MASK) {
>> -			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
>> -				vmcs12->idt_vectoring_info_field);
>> -			vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
>> -				vmcs12->vm_exit_instruction_len);
>> -			if (vmcs12->idt_vectoring_info_field &
>> -					VECTORING_INFO_DELIVER_CODE_MASK)
>> -				vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
>> -					vmcs12->idt_vectoring_error_code);
>> -		}
>> -	}
>> -
>>  	/* Record the guest's net vcpu time for enforced NMI injections. */
>>  	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
>>  		vmx->entry_time = ktime_get();
>> @@ -6708,17 +6689,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>  
>>  	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
>>  
>> -	if (is_guest_mode(vcpu)) {
>> -		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> -		vmcs12->idt_vectoring_info_field = vmx->idt_vectoring_info;
>> -		if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
>> -			vmcs12->idt_vectoring_error_code =
>> -				vmcs_read32(IDT_VECTORING_ERROR_CODE);
>> -			vmcs12->vm_exit_instruction_len =
>> -				vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
>> -		}
>> -	}
>> -
>>  	vmx->loaded_vmcs->launched = 1;
>>  
>>  	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
>> @@ -7325,6 +7295,48 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>  			vcpu->arch.cr4_guest_owned_bits));
>>  }
>>  
>> +static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
>> +				       struct vmcs12 *vmcs12)
>> +{
>> +	u32 idt_vectoring;
>> +	unsigned int nr;
>> +
>> +	if (vcpu->arch.exception.pending) {
>> +		nr = vcpu->arch.exception.nr;
>> +		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
>> +
>> +		if (kvm_exception_is_soft(nr)) {
>> +			vmcs12->vm_exit_instruction_len =
>> +				vcpu->arch.event_exit_inst_len;
>> +			idt_vectoring |= INTR_TYPE_SOFT_EXCEPTION;
>> +		} else
>> +			idt_vectoring |= INTR_TYPE_HARD_EXCEPTION;
>> +
>> +		if (vcpu->arch.exception.has_error_code) {
>> +			idt_vectoring |= VECTORING_INFO_DELIVER_CODE_MASK;
>> +			vmcs12->idt_vectoring_error_code =
>> +				vcpu->arch.exception.error_code;
>> +		}
>> +
>> +		vmcs12->idt_vectoring_info_field = idt_vectoring;
>> +	} else if (vcpu->arch.nmi_pending) {
>> +		vmcs12->idt_vectoring_info_field =
>> +			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR;
>> +	} else if (vcpu->arch.interrupt.pending) {
>> +		nr = vcpu->arch.interrupt.nr;
>> +		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
>> +
>> +		if (vcpu->arch.interrupt.soft) {
>> +			idt_vectoring |= INTR_TYPE_SOFT_INTR;
>> +			vmcs12->vm_entry_instruction_len =
>> +				vcpu->arch.event_exit_inst_len;
>> +		} else
>> +			idt_vectoring |= INTR_TYPE_EXT_INTR;
>> +
>> +		vmcs12->idt_vectoring_info_field = idt_vectoring;
>> +	}
>> +}
>> +
>>  /*
>>   * prepare_vmcs12 is part of what we need to do when the nested L2 guest exits
>>   * and we want to prepare to run its L1 parent. L1 keeps a vmcs for L2 (vmcs12),
>> @@ -7416,9 +7428,20 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>  	vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
>>  	vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
>>  
>> -	/* clear vm-entry fields which are to be cleared on exit */
>>  	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
>> -		vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
> Why have you dropped this? Where is it cleared now?

Hmm, looks like I read something like "vm_exit_intr_info". Will restore
and just improve the comment.

Jan





* Re: [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection
  2013-03-24 18:44 ` [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection Jan Kiszka
@ 2013-04-11 11:20   ` Gleb Natapov
  2013-04-11 14:27     ` Jan Kiszka
  0 siblings, 1 reply; 13+ messages in thread
From: Gleb Natapov @ 2013-04-11 11:20 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Sun, Mar 24, 2013 at 07:44:47PM +0100, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> If we are in guest mode, L0 can only inject events into L2 if L1 has
> nothing pending. Otherwise, L0 would overwrite L1's events and they
> would get lost. But even if no injection from L1 is pending, we do not
> want L0 to unnecessarily interrupt an ongoing vmentry, with all its side
> effects on the vmcs. Therefore, injection shall be disallowed during
> L1->L2 transitions. This check is conceptually independent of
> nested_exit_on_intr.
> 
> If L1 traps external interrupts, then we also need to look at L1's
> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
> to L1, just as the previous code did.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/x86/kvm/vmx.c |   28 ++++++++++++++++++++--------
>  1 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d1bc834..30aa198 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4325,16 +4325,28 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>  
>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>  {
> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
> +	if (is_guest_mode(vcpu)) {
>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
> -		    (vmcs12->idt_vectoring_info_field &
> -		     VECTORING_INFO_VALID_MASK))
> +
> +		if (to_vmx(vcpu)->nested.nested_run_pending)
>  			return 0;
> -		nested_vmx_vmexit(vcpu);
> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
> -		vmcs12->vm_exit_intr_info = 0;
> -		/* fall through to normal code, but now in L1, not L2 */
> +		if (nested_exit_on_intr(vcpu)) {
> +			/*
> +			 * Check if the idt_vectoring_info_field is free. We
> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
> +			 * isn't.
> +			 */
> +			if (vmcs12->idt_vectoring_info_field &
> +			    VECTORING_INFO_VALID_MASK)
> +				return 0;
After patch 2 I do not see how this can be true. Now this case is
handled by the common code: since the event queue is not empty, the code
will not get here.

> +			nested_vmx_vmexit(vcpu);
> +			vmcs12->vm_exit_reason =
> +				EXIT_REASON_EXTERNAL_INTERRUPT;
> +			vmcs12->vm_exit_intr_info = 0;
> +			/*
> +			 * fall through to normal code, but now in L1, not L2
> +			 */
> +		}
>  	}
>  
>  	return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
> -- 
> 1.7.3.4

--
			Gleb.

^ permalink raw reply	[flat|nested] 13+ messages in thread
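[Editorial sketch] The decision chain that the patched vmx_interrupt_allowed() implements can be condensed into a small standalone function. This is a simplified model with made-up stub types and flags (`struct vcpu_model`, `emulated_vmexit`), not the real KVM structures; it only mirrors the ordering of the checks in the hunk above:

```c
#include <assert.h>
#include <stdbool.h>

/* Stub model of the vcpu state consulted by the patched
 * vmx_interrupt_allowed(); field names are illustrative only. */
struct vcpu_model {
	bool guest_mode;          /* is_guest_mode(vcpu) */
	bool nested_run_pending;  /* L1->L2 vmentry in flight */
	bool exit_on_intr;        /* L1 traps external interrupts */
	bool idt_vectoring_valid; /* vmcs12 idt_vectoring_info VALID bit */
	bool rflags_if;           /* GUEST_RFLAGS.IF */
	bool intr_blocked;        /* STI/MOV-SS interruptibility shadow */
};

/* Returns nonzero if L0 may inject an external interrupt now.
 * Sets *emulated_vmexit when the real code would call
 * nested_vmx_vmexit() and fall through "now in L1, not L2". */
static int interrupt_allowed(struct vcpu_model *v, bool *emulated_vmexit)
{
	*emulated_vmexit = false;
	if (v->guest_mode) {
		if (v->nested_run_pending)
			return 0;                /* never cancel a pending vmentry */
		if (v->exit_on_intr) {
			if (v->idt_vectoring_valid)
				return 0;        /* vectoring slot already in use */
			*emulated_vmexit = true; /* emulated L2->L1 vmexit */
		}
	}
	return v->rflags_if && !v->intr_blocked;
}
```

The three early-return cases match the three `return 0` paths discussed in this subthread.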

* Re: [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery
  2013-03-24 18:44 ` [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery Jan Kiszka
  2013-04-10 13:42   ` Gleb Natapov
@ 2013-04-11 11:22   ` Gleb Natapov
  1 sibling, 0 replies; 13+ messages in thread
From: Gleb Natapov @ 2013-04-11 11:22 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Sun, Mar 24, 2013 at 07:44:45PM +0100, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> The basic idea is to always transfer the pending event injection on
> vmexit into the architectural state of the VCPU and then drop it from
> there if it turns out that we left L2 to enter L1, i.e. if we enter
> prepare_vmcs12.
> 
> vmcs12_save_pending_events takes care to transfer pending L0 events into
> the queue of L1. That is mandatory as L1 may decide to switch the guest
> state completely, invalidating or preserving the pending events for
> later injection (including on a different node, once we support
> migration).
> 
> This concept is based on the rule that a pending vmlaunch/vmresume is
> not canceled. Otherwise, we would risk losing injected events or leaking
> them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the
> entry of nested_vmx_vmexit.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/x86/kvm/vmx.c |   90 +++++++++++++++++++++++++++++++++------------------
>  1 files changed, 58 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 8827b3b..9d9ff74 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6493,8 +6493,6 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
>  
>  static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>  {
> -	if (is_guest_mode(&vmx->vcpu))
> -		return;
>  	__vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
>  				  VM_EXIT_INSTRUCTION_LEN,
>  				  IDT_VECTORING_ERROR_CODE);
> @@ -6502,8 +6500,6 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>  
>  static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
>  {
> -	if (is_guest_mode(vcpu))
> -		return;
>  	__vmx_complete_interrupts(vcpu,
>  				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
>  				  VM_ENTRY_INSTRUCTION_LEN,
> @@ -6535,21 +6531,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	unsigned long debugctlmsr;
>  
> -	if (is_guest_mode(vcpu) && !vmx->nested.nested_run_pending) {
> -		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		if (vmcs12->idt_vectoring_info_field &
> -				VECTORING_INFO_VALID_MASK) {
> -			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
> -				vmcs12->idt_vectoring_info_field);
> -			vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
> -				vmcs12->vm_exit_instruction_len);
> -			if (vmcs12->idt_vectoring_info_field &
> -					VECTORING_INFO_DELIVER_CODE_MASK)
> -				vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
> -					vmcs12->idt_vectoring_error_code);
> -		}
> -	}
> -
>  	/* Record the guest's net vcpu time for enforced NMI injections. */
>  	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
>  		vmx->entry_time = ktime_get();
> @@ -6708,17 +6689,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
>  
> -	if (is_guest_mode(vcpu)) {
> -		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		vmcs12->idt_vectoring_info_field = vmx->idt_vectoring_info;
> -		if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
> -			vmcs12->idt_vectoring_error_code =
> -				vmcs_read32(IDT_VECTORING_ERROR_CODE);
> -			vmcs12->vm_exit_instruction_len =
> -				vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
> -		}
> -	}
> -
>  	vmx->loaded_vmcs->launched = 1;
>  
>  	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
> @@ -7325,6 +7295,48 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  			vcpu->arch.cr4_guest_owned_bits));
>  }
>  
> +static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
> +				       struct vmcs12 *vmcs12)
> +{
> +	u32 idt_vectoring;
> +	unsigned int nr;
> +
> +	if (vcpu->arch.exception.pending) {
> +		nr = vcpu->arch.exception.nr;
> +		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
> +
> +		if (kvm_exception_is_soft(nr)) {
> +			vmcs12->vm_exit_instruction_len =
> +				vcpu->arch.event_exit_inst_len;
> +			idt_vectoring |= INTR_TYPE_SOFT_EXCEPTION;
> +		} else
> +			idt_vectoring |= INTR_TYPE_HARD_EXCEPTION;
> +
> +		if (vcpu->arch.exception.has_error_code) {
> +			idt_vectoring |= VECTORING_INFO_DELIVER_CODE_MASK;
> +			vmcs12->idt_vectoring_error_code =
> +				vcpu->arch.exception.error_code;
> +		}
> +
> +		vmcs12->idt_vectoring_info_field = idt_vectoring;
> +	} else if (vcpu->arch.nmi_pending) {
> +		vmcs12->idt_vectoring_info_field =
> +			INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR;
> +	} else if (vcpu->arch.interrupt.pending) {
> +		nr = vcpu->arch.interrupt.nr;
> +		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
> +
> +		if (vcpu->arch.interrupt.soft) {
> +			idt_vectoring |= INTR_TYPE_SOFT_INTR;
> +			vmcs12->vm_entry_instruction_len =
> +				vcpu->arch.event_exit_inst_len;
> +		} else
> +			idt_vectoring |= INTR_TYPE_EXT_INTR;
> +
> +		vmcs12->idt_vectoring_info_field = idt_vectoring;
> +	}
else
	vmcs12->idt_vectoring_info_field = 0;

Also you can drop
vmcs12->idt_vectoring_info_field = to_vmx(vcpu)->idt_vectoring_info;
from prepare_vmcs12().

> +}
> +
>  /*
>   * prepare_vmcs12 is part of what we need to do when the nested L2 guest exits
>   * and we want to prepare to run its L1 parent. L1 keeps a vmcs for L2 (vmcs12),
> @@ -7416,9 +7428,20 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  	vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
>  	vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
>  
> -	/* clear vm-entry fields which are to be cleared on exit */
>  	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
> -		vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
> +		/*
> +		 * Transfer the event that L0 or L1 may have wanted to inject into
> +		 * L2 to IDT_VECTORING_INFO_FIELD.
> +		 */
> +		vmcs12_save_pending_event(vcpu, vmcs12);
> +
> +	/*
> +	 * Drop what we picked up for L2 via vmx_complete_interrupts. It is
> +	 * preserved above and would only end up incorrectly in L1.
> +	 */
> +	vcpu->arch.nmi_injected = false;
> +	kvm_clear_exception_queue(vcpu);
> +	kvm_clear_interrupt_queue(vcpu);
>  }
>  
>  /*
> @@ -7518,6 +7541,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu)
>  	int cpu;
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>  
> +	/* trying to cancel vmlaunch/vmresume is a bug */
> +	WARN_ON_ONCE(vmx->nested.nested_run_pending);
> +
>  	leave_guest_mode(vcpu);
>  	prepare_vmcs12(vcpu, vmcs12);
>  
> -- 
> 1.7.3.4

--
			Gleb.

^ permalink raw reply	[flat|nested] 13+ messages in thread
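[Editorial sketch] Gleb's remark about the missing else branch concerns the priority order in vmcs12_save_pending_event(): exception before NMI before external interrupt, and the field must be zeroed when nothing is pending. The following is a simplified, self-contained model of that encoding; soft events and error codes are omitted, and the constants are redefined here for illustration (the bit layout follows the VMX idt-vectoring format: vector in bits 7:0, type in bits 10:8, valid in bit 31):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VECTORING_INFO_VALID_MASK (1u << 31)
#define INTR_TYPE_EXT_INTR        (0u << 8)
#define INTR_TYPE_NMI_INTR        (2u << 8)
#define INTR_TYPE_HARD_EXCEPTION  (3u << 8)
#define NMI_VECTOR                2u

/* Stub for the vcpu->arch event queues consulted by the patch. */
struct event_state {
	bool exception_pending; uint8_t exception_nr;
	bool nmi_pending;
	bool interrupt_pending; uint8_t interrupt_nr;
};

/* Encodes the highest-priority pending event the way
 * vmcs12_save_pending_event() does, with the zeroing else branch
 * Gleb asked for. */
static uint32_t save_pending_event(const struct event_state *e)
{
	if (e->exception_pending)
		return e->exception_nr | INTR_TYPE_HARD_EXCEPTION |
		       VECTORING_INFO_VALID_MASK;
	if (e->nmi_pending)
		return NMI_VECTOR | INTR_TYPE_NMI_INTR |
		       VECTORING_INFO_VALID_MASK;
	if (e->interrupt_pending)
		return e->interrupt_nr | INTR_TYPE_EXT_INTR |
		       VECTORING_INFO_VALID_MASK;
	return 0; /* nothing pending: field cleared, VALID bit off */
}
```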

* Re: [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection
  2013-04-11 11:20   ` Gleb Natapov
@ 2013-04-11 14:27     ` Jan Kiszka
  2013-04-11 14:29       ` Gleb Natapov
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2013-04-11 14:27 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On 2013-04-11 13:20, Gleb Natapov wrote:
> On Sun, Mar 24, 2013 at 07:44:47PM +0100, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> If we are in guest mode, L0 can only inject events into L2 if L1 has
>> nothing pending. Otherwise, L0 would overwrite L1's events and they
>> would get lost. But even if no injection of L1 is pending, we do not
>> want L0 to unnecessarily interrupt an ongoing vmentry with all its side
>> effects on the vmcs. Therefore, injection shall be disallowed during
>> L1->L2 transitions. This check is conceptually independent of
>> nested_exit_on_intr.
>>
>> If L1 traps external interrupts, then we also need to look at L1's
>> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
>> to L1, just like the previous code worked.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>  arch/x86/kvm/vmx.c |   28 ++++++++++++++++++++--------
>>  1 files changed, 20 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index d1bc834..30aa198 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4325,16 +4325,28 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>>  
>>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>>  {
>> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
>> +	if (is_guest_mode(vcpu)) {
>>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
>> -		    (vmcs12->idt_vectoring_info_field &
>> -		     VECTORING_INFO_VALID_MASK))
>> +
>> +		if (to_vmx(vcpu)->nested.nested_run_pending)
>>  			return 0;
>> -		nested_vmx_vmexit(vcpu);
>> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
>> -		vmcs12->vm_exit_intr_info = 0;
>> -		/* fall through to normal code, but now in L1, not L2 */
>> +		if (nested_exit_on_intr(vcpu)) {
>> +			/*
>> +			 * Check if the idt_vectoring_info_field is free. We
>> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
>> +			 * isn't.
>> +			 */
>> +			if (vmcs12->idt_vectoring_info_field &
>> +			    VECTORING_INFO_VALID_MASK)
>> +				return 0;
> After patch 2 I do not see how this can be true. Now this case is
> handled by the common code: since the event queue is not empty, the code
> will not get here.

The event queue is unconditionally cleared (after being migrated to
vmcs12) in patch 2.

Jan




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection
  2013-04-11 14:27     ` Jan Kiszka
@ 2013-04-11 14:29       ` Gleb Natapov
  2013-04-12  9:00         ` Jan Kiszka
  0 siblings, 1 reply; 13+ messages in thread
From: Gleb Natapov @ 2013-04-11 14:29 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Thu, Apr 11, 2013 at 04:27:23PM +0200, Jan Kiszka wrote:
> On 2013-04-11 13:20, Gleb Natapov wrote:
> > On Sun, Mar 24, 2013 at 07:44:47PM +0100, Jan Kiszka wrote:
> >> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>
> >> If we are in guest mode, L0 can only inject events into L2 if L1 has
> >> nothing pending. Otherwise, L0 would overwrite L1's events and they
> >> would get lost. But even if no injection of L1 is pending, we do not
> >> want L0 to unnecessarily interrupt an ongoing vmentry with all its side
> >> effects on the vmcs. Therefore, injection shall be disallowed during
> >> L1->L2 transitions. This check is conceptually independent of
> >> nested_exit_on_intr.
> >>
> >> If L1 traps external interrupts, then we also need to look at L1's
> >> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
> >> to L1, just like the previous code worked.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >> ---
> >>  arch/x86/kvm/vmx.c |   28 ++++++++++++++++++++--------
> >>  1 files changed, 20 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >> index d1bc834..30aa198 100644
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -4325,16 +4325,28 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
> >>  
> >>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
> >>  {
> >> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
> >> +	if (is_guest_mode(vcpu)) {
> >>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
> >> -		    (vmcs12->idt_vectoring_info_field &
> >> -		     VECTORING_INFO_VALID_MASK))
> >> +
> >> +		if (to_vmx(vcpu)->nested.nested_run_pending)
> >>  			return 0;
> >> -		nested_vmx_vmexit(vcpu);
> >> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
> >> -		vmcs12->vm_exit_intr_info = 0;
> >> -		/* fall through to normal code, but now in L1, not L2 */
> >> +		if (nested_exit_on_intr(vcpu)) {
> >> +			/*
> >> +			 * Check if the idt_vectoring_info_field is free. We
> >> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
> >> +			 * isn't.
> >> +			 */
> >> +			if (vmcs12->idt_vectoring_info_field &
> >> +			    VECTORING_INFO_VALID_MASK)
> >> +				return 0;
> > After patch 2 I do not see how this can be true. Now this case is
> > handled by the common code: since the event queue is not empty, the code
> > will not get here.
> 
> The event queue is unconditionally cleared (after being migrated to
> vmcs12) in patch 2.
> 
During vmexit, yes. But here we are in if(is_guest_mode(vcpu)).

--
			Gleb.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection
  2013-04-11 14:29       ` Gleb Natapov
@ 2013-04-12  9:00         ` Jan Kiszka
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kiszka @ 2013-04-12  9:00 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On 2013-04-11 16:29, Gleb Natapov wrote:
> On Thu, Apr 11, 2013 at 04:27:23PM +0200, Jan Kiszka wrote:
>> On 2013-04-11 13:20, Gleb Natapov wrote:
>>> On Sun, Mar 24, 2013 at 07:44:47PM +0100, Jan Kiszka wrote:
>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>
>>>> If we are in guest mode, L0 can only inject events into L2 if L1 has
>>>> nothing pending. Otherwise, L0 would overwrite L1's events and they
>>>> would get lost. But even if no injection of L1 is pending, we do not
>>>> want L0 to unnecessarily interrupt an ongoing vmentry with all its side
>>>> effects on the vmcs. Therefore, injection shall be disallowed during
>>>> L1->L2 transitions. This check is conceptually independent of
>>>> nested_exit_on_intr.
>>>>
>>>> If L1 traps external interrupts, then we also need to look at L1's
>>>> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
>>>> to L1, just like the previous code worked.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>  arch/x86/kvm/vmx.c |   28 ++++++++++++++++++++--------
>>>>  1 files changed, 20 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index d1bc834..30aa198 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -4325,16 +4325,28 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>>>>  
>>>>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>>>>  {
>>>> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
>>>> +	if (is_guest_mode(vcpu)) {
>>>>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>>>> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
>>>> -		    (vmcs12->idt_vectoring_info_field &
>>>> -		     VECTORING_INFO_VALID_MASK))
>>>> +
>>>> +		if (to_vmx(vcpu)->nested.nested_run_pending)
>>>>  			return 0;
>>>> -		nested_vmx_vmexit(vcpu);
>>>> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
>>>> -		vmcs12->vm_exit_intr_info = 0;
>>>> -		/* fall through to normal code, but now in L1, not L2 */
>>>> +		if (nested_exit_on_intr(vcpu)) {
>>>> +			/*
>>>> +			 * Check if the idt_vectoring_info_field is free. We
>>>> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
>>>> +			 * isn't.
>>>> +			 */
>>>> +			if (vmcs12->idt_vectoring_info_field &
>>>> +			    VECTORING_INFO_VALID_MASK)
>>>> +				return 0;
>>> After patch 2 I do not see how this can be true. Now this case is
>>> handled by the common code: since the event queue is not empty, the code
>>> will not get here.
>>
>> The event queue is unconditionally cleared (after being migrated to
>> vmcs12) in patch 2.
>>
> During vmexit, yes. But here we are in if(is_guest_mode(vcpu)).

Hmm, looks like it: we leave L2, transfer the real vectoring info into
the queue, and then consider injecting something in addition. That
should actually be avoided at a higher level. OK, will drop this test.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 13+ messages in thread
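[Editorial sketch] The invariant the thread converges on (on an emulated L2->L1 vmexit, pending events are first saved into vmcs12 and the architectural queues are then cleared so nothing leaks into L1, and a pending vmlaunch/vmresume is never canceled) can be captured in a toy model; all types and names here are illustrative, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the vcpu event queue and the vmcs12 field. */
struct toy_vcpu {
	bool nested_run_pending;
	uint32_t pending_event; /* architectural queue, 0 = empty */
};
struct toy_vmcs12 {
	uint32_t idt_vectoring_info_field;
};

/* Models the ordering patch 2 establishes in nested_vmx_vmexit() /
 * prepare_vmcs12(): save the event, then clear the queue. */
static void nested_vmexit(struct toy_vcpu *v, struct toy_vmcs12 *vmcs12)
{
	/* trying to cancel vmlaunch/vmresume is a bug (WARN_ON_ONCE) */
	assert(!v->nested_run_pending);
	vmcs12->idt_vectoring_info_field = v->pending_event; /* save */
	v->pending_event = 0; /* kvm_clear_*_queue(): nothing leaks to L1 */
}
```

After the call, the event lives only in vmcs12, which is why the idt_vectoring_info_field check in vmx_interrupt_allowed() was deemed redundant above.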

end of thread, other threads:[~2013-04-12  9:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-24 18:44 [PATCH v3 0/5] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
2013-03-24 18:44 ` [PATCH v3 1/5] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
2013-03-24 18:44 ` [PATCH v3 2/5] KVM: nVMX: Rework event injection and recovery Jan Kiszka
2013-04-10 13:42   ` Gleb Natapov
2013-04-10 13:49     ` Jan Kiszka
2013-04-11 11:22   ` Gleb Natapov
2013-03-24 18:44 ` [PATCH v3 3/5] KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_mask Jan Kiszka
2013-03-24 18:44 ` [PATCH v3 4/5] KVM: nVMX: Fix conditions for interrupt injection Jan Kiszka
2013-04-11 11:20   ` Gleb Natapov
2013-04-11 14:27     ` Jan Kiszka
2013-04-11 14:29       ` Gleb Natapov
2013-04-12  9:00         ` Jan Kiszka
2013-03-24 18:44 ` [PATCH v3 5/5] KVM: nVMX: Fix conditions for NMI injection Jan Kiszka
