All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Nadav Har'El <nyh@il.ibm.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Cc: "gleb@redhat.com" <gleb@redhat.com>, "avi@redhat.com" <avi@redhat.com>
Subject: RE: [PATCH 23/31] nVMX: Correct handling of interrupt injection
Date: Wed, 25 May 2011 16:39:49 +0800	[thread overview]
Message-ID: <625BA99ED14B2D499DC4E29D8138F1505C9BFA3A70@shsmsx502.ccr.corp.intel.com> (raw)
In-Reply-To: <201105161955.p4GJtgKc001996@rice.haifa.ibm.com>

> From: Nadav Har'El
> Sent: Tuesday, May 17, 2011 3:56 AM
> 
> The code in this patch correctly emulates external-interrupt injection
> while a nested guest L2 is running.
> 
> Because of this code's relative un-obviousness, I include here a longer-than-
> usual justification for what it does - much longer than the code itself ;-)
> 
> To understand how to correctly emulate interrupt injection while L2 is
> running, let's look first at what we need to emulate: How would things look
> like if the extra L0 hypervisor layer is removed, and instead of L0 injecting
> an interrupt, we had hardware delivering an interrupt?
> 
> Now we have L1 running on bare metal with a guest L2, and the hardware
> generates an interrupt. Assuming that L1 set PIN_BASED_EXT_INTR_MASK to
> 1, and
> VM_EXIT_ACK_INTR_ON_EXIT to 0 (we'll revisit these assumptions below),
> what
> happens now is this: The processor exits from L2 to L1, with an external-
> interrupt exit reason but without an interrupt vector. L1 runs, with
> interrupts disabled, and it doesn't yet know what the interrupt was. Soon
> after, it enables interrupts and only at that moment, it gets the interrupt
> from the processor. when L1 is KVM, Linux handles this interrupt.
> 
> Now we need exactly the same thing to happen when that L1->L2 system runs
> on top of L0, instead of real hardware. This is how we do this:
> 
> When L0 wants to inject an interrupt, it needs to exit from L2 to L1, with
> external-interrupt exit reason (with an invalid interrupt vector), and run L1.
> Just like in the bare metal case, it likely can't deliver the interrupt to
> L1 now because L1 is running with interrupts disabled, in which case it turns
> on the interrupt window when running L1 after the exit. L1 will soon enable
> interrupts, and at that point L0 will gain control again and inject the
> interrupt to L1.
> 
> Finally, there is an extra complication in the code: when nested_run_pending,
> we cannot return to L1 now, and must launch L2. We need to remember the
> interrupt we wanted to inject (and not clear it now), and do it on the
> next exit.
> 
> The above explanation shows that the relative strangeness of the nested
> interrupt injection code in this patch, and the extra interrupt-window
> exit incurred, are in fact necessary for accurate emulation, and are not
> just an unoptimized implementation.
> 
> Let's revisit now the two assumptions made above:
> 
> If L1 turns off PIN_BASED_EXT_INTR_MASK (no hypervisor that I know
> does, by the way), things are simple: L0 may inject the interrupt directly
> to the L2 guest - using the normal code path that injects to any guest.
> We support this case in the code below.
> 
> If L1 turns on VM_EXIT_ACK_INTR_ON_EXIT (again, no hypervisor that I know
> does), things look very different from the description above: L1 expects

Type-1 bare metal hypervisor may enable this bit, such as Xen. This bit is
really prepared for L2 hypervisor since normally L2 hypervisor is tricky to
touch generic interrupt logic, and thus better to not ack it until interrupt
is enabled and then hardware will gear to the kernel interrupt handler
automatically.

> to see an exit from L2 with the interrupt vector already filled in the exit
> information, and does not expect to be interrupted again with this interrupt.
> The current code does not (yet) support this case, so we do not allow the
> VM_EXIT_ACK_INTR_ON_EXIT exit-control to be turned on by L1.

Then just fill the interrupt vector field with the highest unmasked bit
from pending vIRR.

> 
> Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
> ---
>  arch/x86/kvm/vmx.c |   36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> --- .before/arch/x86/kvm/vmx.c	2011-05-16 22:36:49.000000000 +0300
> +++ .after/arch/x86/kvm/vmx.c	2011-05-16 22:36:49.000000000 +0300
> @@ -1788,6 +1788,7 @@ static __init void nested_vmx_setup_ctls
> 
>  	/* exit controls */
>  	nested_vmx_exit_ctls_low = 0;
> +	/* Note that guest use of VM_EXIT_ACK_INTR_ON_EXIT is not supported.
> */
>  #ifdef CONFIG_X86_64
>  	nested_vmx_exit_ctls_high = VM_EXIT_HOST_ADDR_SPACE_SIZE;
>  #else
> @@ -3733,9 +3734,25 @@ out:
>  	return ret;
>  }
> 
> +/*
> + * In nested virtualization, check if L1 asked to exit on external interrupts.
> + * For most existing hypervisors, this will always return true.
> + */
> +static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
> +{
> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> +		PIN_BASED_EXT_INTR_MASK;
> +}
> +

could be a similar common wrapper like nested_cpu_has...

Thanks,
Kevin

  reply	other threads:[~2011-05-25  8:40 UTC|newest]

Thread overview: 119+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-16 19:43 [PATCH 0/31] nVMX: Nested VMX, v10 Nadav Har'El
2011-05-16 19:44 ` [PATCH 01/31] nVMX: Add "nested" module option to kvm_intel Nadav Har'El
2011-05-16 19:44 ` [PATCH 02/31] nVMX: Implement VMXON and VMXOFF Nadav Har'El
2011-05-20  7:58   ` Tian, Kevin
2011-05-16 19:45 ` [PATCH 03/31] nVMX: Allow setting the VMXE bit in CR4 Nadav Har'El
2011-05-16 19:45 ` [PATCH 04/31] nVMX: Introduce vmcs12: a VMCS structure for L1 Nadav Har'El
2011-05-16 19:46 ` [PATCH 05/31] nVMX: Implement reading and writing of VMX MSRs Nadav Har'El
2011-05-16 19:46 ` [PATCH 06/31] nVMX: Decoding memory operands of VMX instructions Nadav Har'El
2011-05-16 19:47 ` [PATCH 07/31] nVMX: Introduce vmcs02: VMCS used to run L2 Nadav Har'El
2011-05-20  8:04   ` Tian, Kevin
2011-05-20  8:48     ` Tian, Kevin
2011-05-20 20:32       ` Nadav Har'El
2011-05-22  2:00         ` Tian, Kevin
2011-05-22  7:22           ` Nadav Har'El
2011-05-24  0:54             ` Tian, Kevin
2011-05-22  8:29     ` Nadav Har'El
2011-05-24  1:03       ` Tian, Kevin
2011-05-16 19:48 ` [PATCH 08/31] nVMX: Fix local_vcpus_link handling Nadav Har'El
2011-05-17 13:19   ` Marcelo Tosatti
2011-05-17 13:35     ` Avi Kivity
2011-05-17 14:35       ` Nadav Har'El
2011-05-17 14:42         ` Marcelo Tosatti
2011-05-17 17:57           ` Nadav Har'El
2011-05-17 15:11         ` Avi Kivity
2011-05-17 18:11           ` Nadav Har'El
2011-05-17 18:43             ` Marcelo Tosatti
2011-05-17 19:30               ` Nadav Har'El
2011-05-17 19:52                 ` Marcelo Tosatti
2011-05-18  5:52                   ` Nadav Har'El
2011-05-18  8:31                     ` Avi Kivity
2011-05-18  9:02                       ` Nadav Har'El
2011-05-18  9:16                         ` Avi Kivity
2011-05-18 12:08                     ` Marcelo Tosatti
2011-05-18 12:19                       ` Nadav Har'El
2011-05-22  8:57                       ` Nadav Har'El
2011-05-23 15:49                         ` Avi Kivity
2011-05-23 16:17                           ` Gleb Natapov
2011-05-23 18:59                             ` Nadav Har'El
2011-05-23 19:03                               ` Gleb Natapov
2011-05-23 16:43                           ` Roedel, Joerg
2011-05-23 16:51                             ` Avi Kivity
2011-05-24  9:22                               ` Roedel, Joerg
2011-05-24  9:28                                 ` Nadav Har'El
2011-05-24  9:57                                   ` Roedel, Joerg
2011-05-24 10:08                                     ` Avi Kivity
2011-05-24 10:12                                     ` Nadav Har'El
2011-05-23 18:51                           ` Nadav Har'El
2011-05-24  2:22                             ` Tian, Kevin
2011-05-24  7:56                               ` Nadav Har'El
2011-05-24  8:20                                 ` Tian, Kevin
2011-05-24 11:05                                   ` Avi Kivity
2011-05-24 11:20                                     ` Tian, Kevin
2011-05-24 11:27                                       ` Avi Kivity
2011-05-24 11:30                                         ` Tian, Kevin
2011-05-24 11:36                                           ` Avi Kivity
2011-05-24 11:40                                             ` Tian, Kevin
2011-05-24 11:59                                               ` Nadav Har'El
2011-05-24  0:57                           ` Tian, Kevin
2011-05-18  8:29                   ` Avi Kivity
2011-05-16 19:48 ` [PATCH 09/31] nVMX: Add VMCS fields to the vmcs12 Nadav Har'El
2011-05-20  8:22   ` Tian, Kevin
2011-05-16 19:49 ` [PATCH 10/31] nVMX: Success/failure of VMX instructions Nadav Har'El
2011-05-16 19:49 ` [PATCH 11/31] nVMX: Implement VMCLEAR Nadav Har'El
2011-05-16 19:50 ` [PATCH 12/31] nVMX: Implement VMPTRLD Nadav Har'El
2011-05-16 19:50 ` [PATCH 13/31] nVMX: Implement VMPTRST Nadav Har'El
2011-05-16 19:51 ` [PATCH 14/31] nVMX: Implement VMREAD and VMWRITE Nadav Har'El
2011-05-16 19:51 ` [PATCH 15/31] nVMX: Move host-state field setup to a function Nadav Har'El
2011-05-16 19:52 ` [PATCH 16/31] nVMX: Move control field setup to functions Nadav Har'El
2011-05-16 19:52 ` [PATCH 17/31] nVMX: Prepare vmcs02 from vmcs01 and vmcs12 Nadav Har'El
2011-05-24  8:02   ` Tian, Kevin
2011-05-24  9:19     ` Nadav Har'El
2011-05-24 10:52       ` Tian, Kevin
2011-05-16 19:53 ` [PATCH 18/31] nVMX: Implement VMLAUNCH and VMRESUME Nadav Har'El
2011-05-24  8:45   ` Tian, Kevin
2011-05-24  9:45     ` Nadav Har'El
2011-05-24 10:54       ` Tian, Kevin
2011-05-25  8:00   ` Tian, Kevin
2011-05-25 13:26     ` Nadav Har'El
2011-05-26  0:42       ` Tian, Kevin
2011-05-16 19:53 ` [PATCH 19/31] nVMX: No need for handle_vmx_insn function any more Nadav Har'El
2011-05-16 19:54 ` [PATCH 20/31] nVMX: Exiting from L2 to L1 Nadav Har'El
2011-05-24 12:58   ` Tian, Kevin
2011-05-24 13:43     ` Nadav Har'El
2011-05-25  0:55       ` Tian, Kevin
2011-05-25  8:06         ` Nadav Har'El
2011-05-25  8:23           ` Tian, Kevin
2011-05-25  2:43   ` Tian, Kevin
2011-05-25 13:21     ` Nadav Har'El
2011-05-26  0:41       ` Tian, Kevin
2011-05-16 19:54 ` [PATCH 21/31] nVMX: vmcs12 checks on nested entry Nadav Har'El
2011-05-25  3:01   ` Tian, Kevin
2011-05-25  5:38     ` Nadav Har'El
2011-05-25  7:33       ` Tian, Kevin
2011-05-16 19:55 ` [PATCH 22/31] nVMX: Deciding if L0 or L1 should handle an L2 exit Nadav Har'El
2011-05-25  7:56   ` Tian, Kevin
2011-05-25 13:45     ` Nadav Har'El
2011-05-16 19:55 ` [PATCH 23/31] nVMX: Correct handling of interrupt injection Nadav Har'El
2011-05-25  8:39   ` Tian, Kevin [this message]
2011-05-25  8:45     ` Tian, Kevin
2011-05-25 10:56     ` Nadav Har'El
2011-05-25  9:18   ` Tian, Kevin
2011-05-25 12:33     ` Nadav Har'El
2011-05-25 12:55       ` Tian, Kevin
2011-05-16 19:56 ` [PATCH 24/31] nVMX: Correct handling of exception injection Nadav Har'El
2011-05-16 19:56 ` [PATCH 25/31] nVMX: Correct handling of idt vectoring info Nadav Har'El
2011-05-25 10:02   ` Tian, Kevin
2011-05-25 10:13     ` Nadav Har'El
2011-05-25 10:17       ` Tian, Kevin
2011-05-16 19:57 ` [PATCH 26/31] nVMX: Handling of CR0 and CR4 modifying instructions Nadav Har'El
2011-05-16 19:57 ` [PATCH 27/31] nVMX: Further fixes for lazy FPU loading Nadav Har'El
2011-05-16 19:58 ` [PATCH 28/31] nVMX: Additional TSC-offset handling Nadav Har'El
2011-05-16 19:58 ` [PATCH 29/31] nVMX: Add VMX to list of supported cpuid features Nadav Har'El
2011-05-16 19:59 ` [PATCH 30/31] nVMX: Miscellenous small corrections Nadav Har'El
2011-05-16 19:59 ` [PATCH 31/31] nVMX: Documentation Nadav Har'El
2011-05-25 10:33   ` Tian, Kevin
2011-05-25 11:54     ` Nadav Har'El
2011-05-25 12:11       ` Tian, Kevin
2011-05-25 12:13     ` Muli Ben-Yehuda
2011-05-25 20:01 [PATCH 0/31] nVMX: Nested VMX, v11 Nadav Har'El
2011-05-25 20:13 ` [PATCH 23/31] nVMX: Correct handling of interrupt injection Nadav Har'El

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=625BA99ED14B2D499DC4E29D8138F1505C9BFA3A70@shsmsx502.ccr.corp.intel.com \
    --to=kevin.tian@intel.com \
    --cc=avi@redhat.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=nyh@il.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.