All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: "Nadav Har'El" <nyh@il.ibm.com>
Cc: kvm@vger.kernel.org, gleb@redhat.com, avi@redhat.com
Subject: Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling
Date: Tue, 17 May 2011 10:19:18 -0300	[thread overview]
Message-ID: <20110517131918.GA3809@amt.cnet> (raw)
In-Reply-To: <201105161948.p4GJm1as001742@rice.haifa.ibm.com>

On Mon, May 16, 2011 at 10:48:01PM +0300, Nadav Har'El wrote:
> In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it
> because (at least in theory) the processor might not have written all of its
> content back to memory. Since a patch from June 26, 2008, this is done using
> a per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.
> 
> The problem is that with nested VMX, we no longer have the concept of a
> vcpu being loaded on a cpu: A vcpu has multiple VMCSs (one for L1, a pool for
> L2s), and each of those may be have been last loaded on a different cpu.
> 
> Our solution is to hold, in addition to vcpus_on_cpu, a second linked list
> saved_vmcss_on_cpu, which holds the current list of "saved" VMCSs, i.e.,
> VMCSs which are loaded on this CPU but are not the vmx->vmcs of any of
> the vcpus. These saved VMCSs include L1's VMCS while L2 is running
> (saved_vmcs01), and L2 VMCSs not currently used - because L1 is running or
> because the vmcs02_pool contains more than one entry.
> 
> When we will switch between L1's and L2's VMCSs, they need to be moved
> between vcpus_on_cpu and saved_vmcs_on_cpu lists and vice versa. A new
> function, nested_maintain_per_cpu_lists(), takes care of that.
> 
> Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
> ---
>  arch/x86/kvm/vmx.c |   67 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 67 insertions(+)
> 
> --- .before/arch/x86/kvm/vmx.c	2011-05-16 22:36:47.000000000 +0300
> +++ .after/arch/x86/kvm/vmx.c	2011-05-16 22:36:47.000000000 +0300
> @@ -181,6 +181,7 @@ struct saved_vmcs {
>  	struct vmcs *vmcs;
>  	int cpu;
>  	int launched;
> +	struct list_head local_saved_vmcss_link; /* see saved_vmcss_on_cpu */
>  };
>  
>  /* Used to remember the last vmcs02 used for some recently used vmcs12s */
> @@ -315,7 +316,20 @@ static int vmx_set_tss_addr(struct kvm *
>  
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> +/*
> + * We maintain a per-CPU linked-list vcpus_on_cpu, holding for each CPU a list
> + * of vcpus whose VMCS are loaded on that CPU. This is needed when a CPU is
> + * brought down, and we need to VMCLEAR all VMCSs loaded on it.
> + *
> + * With nested VMX, we have additional VMCSs which are not the current
> + * vmx->vmcs of any vcpu, but may also be loaded on some CPU: While L2 is
> + * running, L1's VMCS is loaded but not the VMCS of any vcpu; While L1 is
> + * running, a previously used L2 VMCS might still be around and loaded on some
> + * CPU, somes even more than one such L2 VMCSs is kept (see VMCS02_POOL_SIZE).
> + * The list of these additional VMCSs is kept on cpu saved_vmcss_on_cpu.
> + */
>  static DEFINE_PER_CPU(struct list_head, vcpus_on_cpu);
> +static DEFINE_PER_CPU(struct list_head, saved_vmcss_on_cpu);
>  static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
>  
>  static unsigned long *vmx_io_bitmap_a;
> @@ -1818,6 +1832,7 @@ static int hardware_enable(void *garbage
>  		return -EBUSY;
>  
>  	INIT_LIST_HEAD(&per_cpu(vcpus_on_cpu, cpu));
> +	INIT_LIST_HEAD(&per_cpu(saved_vmcss_on_cpu, cpu));
>  	rdmsrl(MSR_IA32_FEATURE_CONTROL, old);
>  
>  	test_bits = FEATURE_CONTROL_LOCKED;
> @@ -1860,10 +1875,13 @@ static void kvm_cpu_vmxoff(void)
>  	asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc");
>  }
>  
> +static void vmclear_local_saved_vmcss(void);
> +
>  static void hardware_disable(void *garbage)
>  {
>  	if (vmm_exclusive) {
>  		vmclear_local_vcpus();
> +		vmclear_local_saved_vmcss();
>  		kvm_cpu_vmxoff();
>  	}
>  	write_cr4(read_cr4() & ~X86_CR4_VMXE);
> @@ -4248,6 +4266,8 @@ static void __nested_free_saved_vmcs(voi
>  	vmcs_clear(saved_vmcs->vmcs);
>  	if (per_cpu(current_vmcs, saved_vmcs->cpu) == saved_vmcs->vmcs)
>  		per_cpu(current_vmcs, saved_vmcs->cpu) = NULL;
> +	list_del(&saved_vmcs->local_saved_vmcss_link);
> +	saved_vmcs->cpu = -1;
>  }
>  
>  /*
> @@ -4265,6 +4285,21 @@ static void nested_free_saved_vmcs(struc
>  	free_vmcs(saved_vmcs->vmcs);
>  }
>  
> +/*
> + * VMCLEAR all the currently unused (not vmx->vmcs on any vcpu) saved_vmcss
> + * which were loaded on the current CPU. See also vmclear_load_vcpus(), which
> + * does the same for VMCS currently used in vcpus.
> + */
> +static void vmclear_local_saved_vmcss(void)
> +{
> +	int cpu = raw_smp_processor_id();
> +	struct saved_vmcs *v, *n;
> +
> +	list_for_each_entry_safe(v, n, &per_cpu(saved_vmcss_on_cpu, cpu),
> +				 local_saved_vmcss_link)
> +		__nested_free_saved_vmcs(v);
> +}
> +
>  /* Free and remove from pool a vmcs02 saved for a vmcs12 (if there is one) */
>  static void nested_free_vmcs02(struct vcpu_vmx *vmx, gpa_t vmptr)
>  {
> @@ -5143,6 +5178,38 @@ static void vmx_set_supported_cpuid(u32 
>  {
>  }
>  
> +/*
> + * Maintain the vcpus_on_cpu and saved_vmcss_on_cpu lists of vcpus and
> + * inactive saved_vmcss on nested entry (L1->L2) or nested exit (L2->L1).
> + *
> + * nested_maintain_per_cpu_lists should be called after the VMCS was switched
> + * to the new one, with parameters giving both the new on (after the entry
> + * or exit) and the old one, in that order.
> + */
> +static void nested_maintain_per_cpu_lists(struct vcpu_vmx *vmx,
> +		struct saved_vmcs *new_vmcs,
> +		struct saved_vmcs *old_vmcs)
> +{
> +	/*
> +	 * When a vcpus's old vmcs is saved, we need to drop it from
> +	 * vcpus_on_cpu and put it on saved_vmcss_on_cpu.
> +	 */
> +	if (old_vmcs->cpu != -1) {
> +		list_del(&vmx->local_vcpus_link);
> +		list_add(&old_vmcs->local_saved_vmcss_link,
> +			 &per_cpu(saved_vmcss_on_cpu, old_vmcs->cpu));
> +	}

This new handling of vmcs could be simplified (local_vcpus_link must be
manipulated with interrupts disabled, BTW).

What about having a per-CPU VMCS list instead of per-CPU vcpu list?
"local_vmcs_link" list node could be in "struct saved_vmcs" (and 
a current_saved_vmcs pointer in "struct vcpu_vmx").

vmx_vcpu_load would then add to this list at

        if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
                per_cpu(current_vmcs, cpu) = vmx->vmcs;
                vmcs_load(vmx->vmcs);
        }

  reply	other threads:[~2011-05-17 13:19 UTC|newest]

Thread overview: 118+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-16 19:43 [PATCH 0/31] nVMX: Nested VMX, v10 Nadav Har'El
2011-05-16 19:44 ` [PATCH 01/31] nVMX: Add "nested" module option to kvm_intel Nadav Har'El
2011-05-16 19:44 ` [PATCH 02/31] nVMX: Implement VMXON and VMXOFF Nadav Har'El
2011-05-20  7:58   ` Tian, Kevin
2011-05-16 19:45 ` [PATCH 03/31] nVMX: Allow setting the VMXE bit in CR4 Nadav Har'El
2011-05-16 19:45 ` [PATCH 04/31] nVMX: Introduce vmcs12: a VMCS structure for L1 Nadav Har'El
2011-05-16 19:46 ` [PATCH 05/31] nVMX: Implement reading and writing of VMX MSRs Nadav Har'El
2011-05-16 19:46 ` [PATCH 06/31] nVMX: Decoding memory operands of VMX instructions Nadav Har'El
2011-05-16 19:47 ` [PATCH 07/31] nVMX: Introduce vmcs02: VMCS used to run L2 Nadav Har'El
2011-05-20  8:04   ` Tian, Kevin
2011-05-20  8:48     ` Tian, Kevin
2011-05-20 20:32       ` Nadav Har'El
2011-05-22  2:00         ` Tian, Kevin
2011-05-22  7:22           ` Nadav Har'El
2011-05-24  0:54             ` Tian, Kevin
2011-05-22  8:29     ` Nadav Har'El
2011-05-24  1:03       ` Tian, Kevin
2011-05-16 19:48 ` [PATCH 08/31] nVMX: Fix local_vcpus_link handling Nadav Har'El
2011-05-17 13:19   ` Marcelo Tosatti [this message]
2011-05-17 13:35     ` Avi Kivity
2011-05-17 14:35       ` Nadav Har'El
2011-05-17 14:42         ` Marcelo Tosatti
2011-05-17 17:57           ` Nadav Har'El
2011-05-17 15:11         ` Avi Kivity
2011-05-17 18:11           ` Nadav Har'El
2011-05-17 18:43             ` Marcelo Tosatti
2011-05-17 19:30               ` Nadav Har'El
2011-05-17 19:52                 ` Marcelo Tosatti
2011-05-18  5:52                   ` Nadav Har'El
2011-05-18  8:31                     ` Avi Kivity
2011-05-18  9:02                       ` Nadav Har'El
2011-05-18  9:16                         ` Avi Kivity
2011-05-18 12:08                     ` Marcelo Tosatti
2011-05-18 12:19                       ` Nadav Har'El
2011-05-22  8:57                       ` Nadav Har'El
2011-05-23 15:49                         ` Avi Kivity
2011-05-23 16:17                           ` Gleb Natapov
2011-05-23 18:59                             ` Nadav Har'El
2011-05-23 19:03                               ` Gleb Natapov
2011-05-23 16:43                           ` Roedel, Joerg
2011-05-23 16:51                             ` Avi Kivity
2011-05-24  9:22                               ` Roedel, Joerg
2011-05-24  9:28                                 ` Nadav Har'El
2011-05-24  9:57                                   ` Roedel, Joerg
2011-05-24 10:08                                     ` Avi Kivity
2011-05-24 10:12                                     ` Nadav Har'El
2011-05-23 18:51                           ` Nadav Har'El
2011-05-24  2:22                             ` Tian, Kevin
2011-05-24  7:56                               ` Nadav Har'El
2011-05-24  8:20                                 ` Tian, Kevin
2011-05-24 11:05                                   ` Avi Kivity
2011-05-24 11:20                                     ` Tian, Kevin
2011-05-24 11:27                                       ` Avi Kivity
2011-05-24 11:30                                         ` Tian, Kevin
2011-05-24 11:36                                           ` Avi Kivity
2011-05-24 11:40                                             ` Tian, Kevin
2011-05-24 11:59                                               ` Nadav Har'El
2011-05-24  0:57                           ` Tian, Kevin
2011-05-18  8:29                   ` Avi Kivity
2011-05-16 19:48 ` [PATCH 09/31] nVMX: Add VMCS fields to the vmcs12 Nadav Har'El
2011-05-20  8:22   ` Tian, Kevin
2011-05-16 19:49 ` [PATCH 10/31] nVMX: Success/failure of VMX instructions Nadav Har'El
2011-05-16 19:49 ` [PATCH 11/31] nVMX: Implement VMCLEAR Nadav Har'El
2011-05-16 19:50 ` [PATCH 12/31] nVMX: Implement VMPTRLD Nadav Har'El
2011-05-16 19:50 ` [PATCH 13/31] nVMX: Implement VMPTRST Nadav Har'El
2011-05-16 19:51 ` [PATCH 14/31] nVMX: Implement VMREAD and VMWRITE Nadav Har'El
2011-05-16 19:51 ` [PATCH 15/31] nVMX: Move host-state field setup to a function Nadav Har'El
2011-05-16 19:52 ` [PATCH 16/31] nVMX: Move control field setup to functions Nadav Har'El
2011-05-16 19:52 ` [PATCH 17/31] nVMX: Prepare vmcs02 from vmcs01 and vmcs12 Nadav Har'El
2011-05-24  8:02   ` Tian, Kevin
2011-05-24  9:19     ` Nadav Har'El
2011-05-24 10:52       ` Tian, Kevin
2011-05-16 19:53 ` [PATCH 18/31] nVMX: Implement VMLAUNCH and VMRESUME Nadav Har'El
2011-05-24  8:45   ` Tian, Kevin
2011-05-24  9:45     ` Nadav Har'El
2011-05-24 10:54       ` Tian, Kevin
2011-05-25  8:00   ` Tian, Kevin
2011-05-25 13:26     ` Nadav Har'El
2011-05-26  0:42       ` Tian, Kevin
2011-05-16 19:53 ` [PATCH 19/31] nVMX: No need for handle_vmx_insn function any more Nadav Har'El
2011-05-16 19:54 ` [PATCH 20/31] nVMX: Exiting from L2 to L1 Nadav Har'El
2011-05-24 12:58   ` Tian, Kevin
2011-05-24 13:43     ` Nadav Har'El
2011-05-25  0:55       ` Tian, Kevin
2011-05-25  8:06         ` Nadav Har'El
2011-05-25  8:23           ` Tian, Kevin
2011-05-25  2:43   ` Tian, Kevin
2011-05-25 13:21     ` Nadav Har'El
2011-05-26  0:41       ` Tian, Kevin
2011-05-16 19:54 ` [PATCH 21/31] nVMX: vmcs12 checks on nested entry Nadav Har'El
2011-05-25  3:01   ` Tian, Kevin
2011-05-25  5:38     ` Nadav Har'El
2011-05-25  7:33       ` Tian, Kevin
2011-05-16 19:55 ` [PATCH 22/31] nVMX: Deciding if L0 or L1 should handle an L2 exit Nadav Har'El
2011-05-25  7:56   ` Tian, Kevin
2011-05-25 13:45     ` Nadav Har'El
2011-05-16 19:55 ` [PATCH 23/31] nVMX: Correct handling of interrupt injection Nadav Har'El
2011-05-25  8:39   ` Tian, Kevin
2011-05-25  8:45     ` Tian, Kevin
2011-05-25 10:56     ` Nadav Har'El
2011-05-25  9:18   ` Tian, Kevin
2011-05-25 12:33     ` Nadav Har'El
2011-05-25 12:55       ` Tian, Kevin
2011-05-16 19:56 ` [PATCH 24/31] nVMX: Correct handling of exception injection Nadav Har'El
2011-05-16 19:56 ` [PATCH 25/31] nVMX: Correct handling of idt vectoring info Nadav Har'El
2011-05-25 10:02   ` Tian, Kevin
2011-05-25 10:13     ` Nadav Har'El
2011-05-25 10:17       ` Tian, Kevin
2011-05-16 19:57 ` [PATCH 26/31] nVMX: Handling of CR0 and CR4 modifying instructions Nadav Har'El
2011-05-16 19:57 ` [PATCH 27/31] nVMX: Further fixes for lazy FPU loading Nadav Har'El
2011-05-16 19:58 ` [PATCH 28/31] nVMX: Additional TSC-offset handling Nadav Har'El
2011-05-16 19:58 ` [PATCH 29/31] nVMX: Add VMX to list of supported cpuid features Nadav Har'El
2011-05-16 19:59 ` [PATCH 30/31] nVMX: Miscellenous small corrections Nadav Har'El
2011-05-16 19:59 ` [PATCH 31/31] nVMX: Documentation Nadav Har'El
2011-05-25 10:33   ` Tian, Kevin
2011-05-25 11:54     ` Nadav Har'El
2011-05-25 12:11       ` Tian, Kevin
2011-05-25 12:13     ` Muli Ben-Yehuda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110517131918.GA3809@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=nyh@il.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.