* [PATCH 0/2] nVMX injection corrections
@ 2011-09-22 10:52 Nadav Har'El
  2011-09-22 10:52 ` [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT Nadav Har'El
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Nadav Har'El @ 2011-09-22 10:52 UTC
  To: Avi Kivity, kvm; +Cc: Dave Allan, Federico Simoncelli, Abel Gordon

The following two patches solve two injection-related nested VMX issues:

 1. When we must run L2 next (namely on L1's VMLAUNCH/VMRESUME), injection
    into L1 was delayed for an unknown amount of time - until L2 exits.
    We now force (using a self IPI) an exit immediately after entry to L2,
    so that the injection into L1 happens promptly.

 2. "unexpected, valid vectoring info" warnings appeared in L1.
    These are fixed by correcting the emulation of concurrent L0->L1 and
    L1->L2 injections: We cannot inject into L1 until the injection into L2
    has been processed.

Patch statistics:
-----------------

 arch/x86/kvm/vmx.c       |   18 +++++++++++-------
 arch/x86/kvm/x86.c       |    6 ++++++
 include/linux/kvm_host.h |    1 +
 3 files changed, 18 insertions(+), 7 deletions(-)

--
Nadav Har'El
IBM Haifa Research Lab


* [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT
  2011-09-22 10:52 [PATCH 0/2] nVMX injection corrections Nadav Har'El
@ 2011-09-22 10:52 ` Nadav Har'El
  2011-09-23 12:36   ` Marcelo Tosatti
  2011-09-22 10:53 ` [PATCH 2/2] nVMX: Fix warning-causing idt-vectoring-info behavior Nadav Har'El
  2011-09-26 17:01 ` [PATCH 0/2] nVMX injection corrections Avi Kivity
  2 siblings, 1 reply; 7+ messages in thread
From: Nadav Har'El @ 2011-09-22 10:52 UTC
  To: Avi Kivity, kvm; +Cc: Dave Allan, Federico Simoncelli, Abel Gordon

This patch adds a new vcpu->requests bit, KVM_REQ_IMMEDIATE_EXIT.
This bit requests that when next entering the guest, we run it for as short
a time as possible and then exit again.

We use this new option in nested VMX: When L1 launches L2, but L0 wishes L1
to continue running so it can inject an event to it, we unfortunately cannot
just pretend to have run L2 for a little while - We must really launch L2,
otherwise certain one-off vmcs12 parameters (namely, L1 injection into L2)
will be lost. So the existing code runs L2 in this case.
But L2 could potentially run for a long time until it exits, and the
injection into L1 will be delayed. The new KVM_REQ_IMMEDIATE_EXIT allows us
to request that L2 will be entered, as necessary, but will exit as soon as
possible after entry.

Our implementation of this request uses smp_send_reschedule() to send a
self-IPI, with interrupts disabled. The interrupts remain disabled until the
guest is entered, and then, after the entry is complete (often including
processing an injection and jumping to the relevant handler), the physical
interrupt is noticed and causes an exit.

On recent Intel processors, we could have achieved the same goal by using
MTF instead of a self-IPI. Another technique worth considering in the future
is to use VM_EXIT_ACK_INTR_ON_EXIT and a highest-priority vector IPI - to
slightly improve performance by avoiding the useless interrupt handler
which ends up being called when smp_send_reschedule() is used.
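
As a rough illustration of the request-bit flow described above, here is a
userspace model, not code from the patch. Only the bit number
KVM_REQ_IMMEDIATE_EXIT comes from the patch; struct model_vcpu and the
model_* helpers are hypothetical stand-ins for struct kvm_vcpu,
kvm_make_request(), kvm_check_request() and the relevant part of
vcpu_enter_guest(). The real helpers use atomic bit operations, which this
sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit number taken from the patch; everything else is a userspace model. */
#define KVM_REQ_IMMEDIATE_EXIT 14

/* Hypothetical stand-in for struct kvm_vcpu: only the request bitmap. */
struct model_vcpu {
	unsigned long requests;
};

/* Models kvm_make_request(): set one request bit (non-atomically here). */
static void model_make_request(int req, struct model_vcpu *vcpu)
{
	assert(req >= 0 && req < 32);
	vcpu->requests |= 1UL << req;
}

/* Models kvm_check_request(): test the bit and clear it if it was set. */
static bool model_check_request(int req, struct model_vcpu *vcpu)
{
	assert(req >= 0 && req < 32);
	if (!(vcpu->requests & (1UL << req)))
		return false;
	vcpu->requests &= ~(1UL << req);
	return true;
}

/*
 * Models the logic the patch adds to vcpu_enter_guest(): returns true
 * when a self-IPI would be sent just before entering the guest.
 */
static bool model_enter_guest(struct model_vcpu *vcpu)
{
	bool req_immediate_exit = false;

	if (vcpu->requests)
		req_immediate_exit =
			model_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);
	return req_immediate_exit;
}
```

In this model the request is consumed by the first entry that sees it, so a
second entry without a new model_make_request() call sends no IPI.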

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
---
 arch/x86/kvm/vmx.c       |   11 +++++++----
 arch/x86/kvm/x86.c       |    6 ++++++
 include/linux/kvm_host.h |    1 +
 3 files changed, 14 insertions(+), 4 deletions(-)

--- .before/include/linux/kvm_host.h	2011-09-22 13:51:31.000000000 +0300
+++ .after/include/linux/kvm_host.h	2011-09-22 13:51:31.000000000 +0300
@@ -48,6 +48,7 @@
 #define KVM_REQ_EVENT             11
 #define KVM_REQ_APF_HALT          12
 #define KVM_REQ_STEAL_UPDATE      13
+#define KVM_REQ_IMMEDIATE_EXIT    14
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID	0
 
--- .before/arch/x86/kvm/x86.c	2011-09-22 13:51:31.000000000 +0300
+++ .after/arch/x86/kvm/x86.c	2011-09-22 13:51:31.000000000 +0300
@@ -5610,6 +5610,7 @@ static int vcpu_enter_guest(struct kvm_v
 	bool nmi_pending;
 	bool req_int_win = !irqchip_in_kernel(vcpu->kvm) &&
 		vcpu->run->request_interrupt_window;
+	bool req_immediate_exit = 0;
 
 	if (vcpu->requests) {
 		if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
@@ -5647,6 +5648,8 @@ static int vcpu_enter_guest(struct kvm_v
 		}
 		if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
 			record_steal_time(vcpu);
+		req_immediate_exit =
+			kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);
 
 	}
 
@@ -5706,6 +5709,9 @@ static int vcpu_enter_guest(struct kvm_v
 
 	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
 
+	if (req_immediate_exit)
+		smp_send_reschedule(vcpu->cpu);
+
 	kvm_guest_enter();
 
 	if (unlikely(vcpu->arch.switch_db_regs)) {
--- .before/arch/x86/kvm/vmx.c	2011-09-22 13:51:31.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-09-22 13:51:31.000000000 +0300
@@ -3858,12 +3858,15 @@ static bool nested_exit_on_intr(struct k
 static void enable_irq_window(struct kvm_vcpu *vcpu)
 {
 	u32 cpu_based_vm_exec_control;
-	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
-		/* We can get here when nested_run_pending caused
-		 * vmx_interrupt_allowed() to return false. In this case, do
-		 * nothing - the interrupt will be injected later.
+	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
+		/*
+		 * We get here if vmx_interrupt_allowed() said we can't
+		 * inject to L1 now because L2 must run. Ask L2 to exit
+		 * right after entry, so we can inject to L1 more promptly.
 		 */
+		kvm_make_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);
 		return;
+	}
 
 	cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
 	cpu_based_vm_exec_control |= CPU_BASED_VIRTUAL_INTR_PENDING;


* [PATCH 2/2] nVMX: Fix warning-causing idt-vectoring-info behavior
  2011-09-22 10:52 [PATCH 0/2] nVMX injection corrections Nadav Har'El
  2011-09-22 10:52 ` [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT Nadav Har'El
@ 2011-09-22 10:53 ` Nadav Har'El
  2011-09-26 17:01 ` [PATCH 0/2] nVMX injection corrections Avi Kivity
  2 siblings, 0 replies; 7+ messages in thread
From: Nadav Har'El @ 2011-09-22 10:53 UTC
  To: Avi Kivity, kvm; +Cc: Dave Allan, Federico Simoncelli, Abel Gordon

When L0 wishes to inject an interrupt while L2 is running, it emulates an exit
to L1 with EXIT_REASON_EXTERNAL_INTERRUPT. This was explained in the original
nVMX patch 23, titled "Correct handling of interrupt injection".

Unfortunately, it is possible (though rare) that at this point there is valid
idt_vectoring_info in vmcs02. For example, L1 injected some interrupt to L2,
and when L2 tried to run this interrupt's handler, it got a page fault - so
it returns the original interrupt vector in idt_vectoring_info. The problem
is that if this is the case, we cannot exit to L1 with EXTERNAL_INTERRUPT
like we wished to, because the VMX spec guarantees that valid
idt_vectoring_info and EXIT_REASON_EXTERNAL_INTERRUPT never occur together.
This is not
just specified in the spec - a KVM L1 actually prints a kernel warning
"unexpected, valid vectoring info" if we violate this guarantee, and some
users noticed these warnings in L1's logs.

In order to better emulate a processor, which would never return the external
interrupt and the idt-vectoring-info together, we need to separate the two
injection steps: First, complete L1's injection into L2 (i.e., enter L2,
injecting to it the idt-vectoring-info); Second, after entry into L2 succeeds
and it exits back to L0, exit to L1 with the EXIT_REASON_EXTERNAL_INTERRUPT.
Most of this is already in the code - the only change we need is to remain
in L2 (and not exit to L1) in this case.

Note that the previous patch ensures (by using KVM_REQ_IMMEDIATE_EXIT) that
although we do enter L2 first, it will exit immediately after processing its
injection, allowing us to promptly inject to L1.

Note how we test vmcs12->idt_vectoring_info_field. This isn't really the
vmcs12 value (we haven't exited to L1 yet, so vmcs12 hasn't been updated),
but rather the place we save, at the end of vmx_vcpu_run, the vmcs02 value
of this field. This was explained in patch 25 ("Correct handling of idt
vectoring info") of the original nVMX patch series.
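
The decision described above can be sketched as a small userspace model.
This is an illustration only, not the kernel code: struct
model_nested_state, enum model_action and model_interrupt_decision() are
hypothetical names, and the struct reduces the vcpu and vmcs12 state to the
four inputs the check looks at. VECTORING_INFO_VALID_MASK is bit 31, as in
the VMX definition of the IDT-vectoring information field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 31 marks the IDT-vectoring information field as valid. */
#define VECTORING_INFO_VALID_MASK 0x80000000u

/* Hypothetical, reduced view of the state the check looks at. */
struct model_nested_state {
	bool guest_mode;		/* L2 is the current guest */
	bool exit_on_intr;		/* L1 wants exits on external interrupts */
	bool nested_run_pending;	/* L1 did VMLAUNCH/VMRESUME, L2 not run yet */
	uint32_t idt_vectoring_info;	/* value saved from vmcs02 */
};

enum model_action {
	MODEL_STAY_IN_L2,	/* "return 0": cannot inject into L1 yet */
	MODEL_EXIT_TO_L1,	/* emulate EXIT_REASON_EXTERNAL_INTERRUPT */
	MODEL_NOT_NESTED,	/* only the normal, non-nested checks apply */
};

/* Models the nested part of the interrupt-allowed decision. */
static enum model_action
model_interrupt_decision(const struct model_nested_state *s)
{
	assert(s != NULL);
	if (!(s->guest_mode && s->exit_on_intr))
		return MODEL_NOT_NESTED;
	/* L2 must run first: a pending launch, or an unfinished injection. */
	if (s->nested_run_pending ||
	    (s->idt_vectoring_info & VECTORING_INFO_VALID_MASK))
		return MODEL_STAY_IN_L2;
	return MODEL_EXIT_TO_L1;
}
```

Note that a nonzero vector with the valid bit clear does not block the exit
to L1; only bit 31 matters for this check.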

Thanks to Dave Allan and to Federico Simoncelli for reporting this bug,
to Abel Gordon for helping me figure out the solution, and to Avi Kivity
for helping to improve it.

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
---
 arch/x86/kvm/vmx.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- .before/arch/x86/kvm/vmx.c	2011-09-22 13:51:31.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-09-22 13:51:31.000000000 +0300
@@ -3993,11 +3993,12 @@ static void vmx_set_nmi_mask(struct kvm_
 static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
 	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
-		struct vmcs12 *vmcs12;
-		if (to_vmx(vcpu)->nested.nested_run_pending)
+		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+		if (to_vmx(vcpu)->nested.nested_run_pending ||
+		    (vmcs12->idt_vectoring_info_field &
+		     VECTORING_INFO_VALID_MASK))
 			return 0;
 		nested_vmx_vmexit(vcpu);
-		vmcs12 = get_vmcs12(vcpu);
 		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
 		vmcs12->vm_exit_intr_info = 0;
 		/* fall through to normal code, but now in L1, not L2 */


* Re: [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT
  2011-09-22 10:52 ` [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT Nadav Har'El
@ 2011-09-23 12:36   ` Marcelo Tosatti
  2011-09-25  8:13     ` Nadav Har'El
  0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2011-09-23 12:36 UTC
  To: Nadav Har'El
  Cc: Avi Kivity, kvm, Dave Allan, Federico Simoncelli, Abel Gordon

On Thu, Sep 22, 2011 at 01:52:56PM +0300, Nadav Har'El wrote:
> This patch adds a new vcpu->requests bit, KVM_REQ_IMMEDIATE_EXIT.
> This bit requests that when next entering the guest, we should run it only
> for as little as possible, and exit again.
> 
> We use this new option in nested VMX: When L1 launches L2, but L0 wishes L1
> to continue running so it can inject an event to it, we unfortunately cannot
> just pretend to have run L2 for a little while - We must really launch L2,
> otherwise certain one-off vmcs12 parameters (namely, L1 injection into L2)
> will be lost. So the existing code runs L2 in this case.
> But L2 could potentially run for a long time until it exits, and the
> injection into L1 will be delayed. The new KVM_REQ_IMMEDIATE_EXIT allows us
> to request that L2 will be entered, as necessary, but will exit as soon as
> possible after entry.
> 
> Our implementation of this request uses smp_send_reschedule() to send a
> self-IPI, with interrupts disabled. The interrupts remain disabled until the
> guest is entered, and then, after the entry is complete (often including
> processing an injection and jumping to the relevant handler), the physical
> interrupt is noticed and causes an exit.
> 
> On recent Intel processors, we could have achieved the same goal by using
> MTF instead of a self-IPI. Another technique worth considering in the future
> is to use VM_EXIT_ACK_INTR_ON_EXIT and a highest-priority vector IPI - to
> slightly improve performance by avoiding the useless interrupt handler
> which ends up being called when smp_send_reschedule() is used.
> 
> Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
> ---
>  arch/x86/kvm/vmx.c       |   11 +++++++----
>  arch/x86/kvm/x86.c       |    6 ++++++
>  include/linux/kvm_host.h |    1 +
>  3 files changed, 14 insertions(+), 4 deletions(-)
> 
> --- .before/include/linux/kvm_host.h	2011-09-22 13:51:31.000000000 +0300
> +++ .after/include/linux/kvm_host.h	2011-09-22 13:51:31.000000000 +0300
> @@ -48,6 +48,7 @@
>  #define KVM_REQ_EVENT             11
>  #define KVM_REQ_APF_HALT          12
>  #define KVM_REQ_STEAL_UPDATE      13
> +#define KVM_REQ_IMMEDIATE_EXIT    14
>  
>  #define KVM_USERSPACE_IRQ_SOURCE_ID	0
>  
> --- .before/arch/x86/kvm/x86.c	2011-09-22 13:51:31.000000000 +0300
> +++ .after/arch/x86/kvm/x86.c	2011-09-22 13:51:31.000000000 +0300
> @@ -5610,6 +5610,7 @@ static int vcpu_enter_guest(struct kvm_v
>  	bool nmi_pending;
>  	bool req_int_win = !irqchip_in_kernel(vcpu->kvm) &&
>  		vcpu->run->request_interrupt_window;
> +	bool req_immediate_exit = 0;
>  
>  	if (vcpu->requests) {
>  		if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
> @@ -5647,6 +5648,8 @@ static int vcpu_enter_guest(struct kvm_v
>  		}
>  		if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
>  			record_steal_time(vcpu);
> +		req_immediate_exit =
> +			kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);

The immediate exit information can be lost if entry decides to bail out.

You can do

        req_immediate_exit = kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);

after preempt_disable()

and then transfer back the bit in the bail out case in

if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
...



* Re: [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT
  2011-09-23 12:36   ` Marcelo Tosatti
@ 2011-09-25  8:13     ` Nadav Har'El
  2011-09-26 11:02       ` Marcelo Tosatti
  0 siblings, 1 reply; 7+ messages in thread
From: Nadav Har'El @ 2011-09-25  8:13 UTC
  To: Marcelo Tosatti
  Cc: Avi Kivity, kvm, Dave Allan, Federico Simoncelli, Abel Gordon

On Fri, Sep 23, 2011, Marcelo Tosatti wrote about "Re: [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT":
> On Thu, Sep 22, 2011 at 01:52:56PM +0300, Nadav Har'El wrote:
> > This patch adds a new vcpu->requests bit, KVM_REQ_IMMEDIATE_EXIT.
> > This bit requests that when next entering the guest, we should run it only
> > for as little as possible, and exit again.
> > 
> > We use this new option in nested VMX: When L1 launches L2, but L0 wishes L1
>...
> > @@ -5647,6 +5648,8 @@ static int vcpu_enter_guest(struct kvm_v
> >  		}
> >  		if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
> >  			record_steal_time(vcpu);
> > +		req_immediate_exit =
> > +			kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);
>...
> The immediate exit information can be lost if entry decides to bail out.
> You can do 
> 
>         req_immediate_exit = kvm_check_request(KVM_REQ_IMMEDIATE_EXIT)
> after preempt_disable()
> and then transfer back the bit in the bail out case in
> if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests

Thanks.

But thinking about this a bit, it seems to me that in my case *losing* this
bit on a canceled entry is the correct thing to do, as turning on this bit was
decided in the injection phase (in enable_irq_window()), and next time, if
the reason to turn on this bit still exists (i.e., L0 has something to inject
to L1, but L2 needs to run), we will turn it on again.

-- 
Nadav Har'El                        |                    Sunday, Sep 25 2011, 
nyh@math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Guarantee: this email is 100% free of
http://nadav.harel.org.il           |magnetic monopoles, or your money back!


* Re: [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT
  2011-09-25  8:13     ` Nadav Har'El
@ 2011-09-26 11:02       ` Marcelo Tosatti
  0 siblings, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2011-09-26 11:02 UTC
  To: Nadav Har'El
  Cc: Avi Kivity, kvm, Dave Allan, Federico Simoncelli, Abel Gordon

On Sun, Sep 25, 2011 at 11:13:06AM +0300, Nadav Har'El wrote:
> On Fri, Sep 23, 2011, Marcelo Tosatti wrote about "Re: [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT":
> > On Thu, Sep 22, 2011 at 01:52:56PM +0300, Nadav Har'El wrote:
> > > This patch adds a new vcpu->requests bit, KVM_REQ_IMMEDIATE_EXIT.
> > > This bit requests that when next entering the guest, we should run it only
> > > for as little as possible, and exit again.
> > > 
> > > We use this new option in nested VMX: When L1 launches L2, but L0 wishes L1
> >...
> > > @@ -5647,6 +5648,8 @@ static int vcpu_enter_guest(struct kvm_v
> > >  		}
> > >  		if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
> > >  			record_steal_time(vcpu);
> > > +		req_immediate_exit =
> > > +			kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);
> >...
> > The immediate exit information can be lost if entry decides to bail out.
> > You can do 
> > 
> >         req_immediate_exit = kvm_check_request(KVM_REQ_IMMEDIATE_EXIT)
> > after preempt_disable()
> > and then transfer back the bit in the bail out case in
> > if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
> 
> Thanks.
> 
> But thinking about this a bit, it seems to me that in my case *losing* this
> bit on a canceled entry is the correct thing to do, as turning on this bit was
> decided in the injection phase (in enable_irq_window()), and next time, if
> the reason to turn on this bit still exists (i.e., L0 has something to inject
> to L1, but L2 needs to run), we will turn it on again.

Correct, the loss is irrelevant.



* Re: [PATCH 0/2] nVMX injection corrections
  2011-09-22 10:52 [PATCH 0/2] nVMX injection corrections Nadav Har'El
  2011-09-22 10:52 ` [PATCH 1/2] nVMX: Add KVM_REQ_IMMEDIATE_EXIT Nadav Har'El
  2011-09-22 10:53 ` [PATCH 2/2] nVMX: Fix warning-causing idt-vectoring-info behavior Nadav Har'El
@ 2011-09-26 17:01 ` Avi Kivity
  2 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2011-09-26 17:01 UTC
  To: Nadav Har'El; +Cc: kvm, Dave Allan, Federico Simoncelli, Abel Gordon

On 09/22/2011 01:52 PM, Nadav Har'El wrote:
> The following two patches solve two injection-related nested VMX issues:
>
>   1. When we must run L2 next (namely on L1's VMLAUNCH/VMRESUME), injection
>      into L1 was delayed for an unknown amount of time - until L2 exits.
>      We now force (using a self IPI) an exit immediately after entry to L2,
>      so that the injection into L1 happens promptly.
>
>   2. "unexpected, valid vectoring info" warnings appeared in L1.
>      These are fixed by correcting the emulation of concurrent L0->L1 and
>      L1->L2 injections: We cannot inject into L1 until the injection into L2
>      has been processed.
>
>

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function

