* [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
@ 2009-09-21 18:04 George Dunlap
  2009-09-21 18:04 ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2009-09-21 18:04 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Xiaohui Xin, xen-devel, Tim Deegan, Paul Durrant, Xin Li, Jun Nakajima

The attached patch modifies ept_sync_domain() to make it more
efficient wrt flushing ept translations.

Specifically:
* It synchronously flushes only cpus on which the domain is currently
running (d->domain_dirty_cpumask)
* It introduces a new vmx-specific mask, "ept_needs_flush", set to the
complement of d->domain_dirty_cpumask
* In vmx_ctxt_switch_to(), if the cpu is set in ept_needs_flush, it
flushes ept before running.

Main change I'd like reviewed: in order to avoid a potential race
condition described below, I had to re-order the setting of
domain_dirty_cpumask and the calling of arch.ctxt_switch_to() in
__context_switch().

Potential race without the re-ordering (p1 and p2 are physical cpus; p2
is scheduling in one of the domain's vcpus):
p1: change ept
p2: check ept_needs_flush, finds it false, doesn't flush
p1: sets ept_needs_flush
p1: checks domain_dirty, finds p2 not in it
p2: sets domain_dirty

Thus the vcpu on p2 is scheduled without its ept translations being flushed.
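
To illustrate the fix (a simplified sketch of the re-ordered code, not
the literal patch text): the switch-in path now publishes the cpu in
domain_dirty_cpumask before arch.ctxt_switch_to() tests ept_needs_flush,
so a flushing cpu must observe p2 in at least one of the two masks:

    /* __context_switch(), after the re-ordering (sketch) */
    cpu_set(cpu, n->domain->domain_dirty_cpumask); /* publish first... */
    n->arch.ctxt_switch_to(n); /* ...then test-and-clear ept_needs_flush
                                * and __invept() if the bit was set */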

I've tested this with a 2-vcpu VM doing a parallel compile and
ballooning, no problems.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-21 18:04 [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in George Dunlap
@ 2009-09-21 18:04 ` George Dunlap
  2009-09-22  7:07   ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2009-09-21 18:04 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Xiaohui Xin, xen-devel, Tim Deegan, Paul Durrant, Xin Li, Jun Nakajima

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

Oops, patch...
 -George

On Mon, Sep 21, 2009 at 7:04 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> The attached patch modifies ept_sync_domain() to make it more
> efficient wrt flushing ept translations.
>
> Specifically:
> * It synchronously flushes only cpus on which the domain is currently
> running (d->domain_dirty_cpumask)
> * It introduces a new vmx-specific mask, "ept_needs_flush", set to the
> complement of d->domain_dirty_cpumask
> * In vmx_ctxt_switch_to(), if the cpu is set in ept_needs_flush, it
> flushes ept before running.
>
> Main change I'd like reviewed: in order to avoid a potential race
> condition described below, I had to re-order the setting of
> domain_dirty_cpumask and the calling of arch.ctxt_switch_to() in
> __context_switch().
>
> Potential race without the re-ordering (p1 and p2 are physical cpus; p2
> is scheduling in one of the domain's vcpus):
> p1: change ept
> p2: check ept_needs_flush, finds it false, doesn't flush
> p1: sets ept_needs_flush
> p1: checks domain_dirty, finds p2 not in it
> p2: sets domain_dirty
>
> Thus the vcpu on p2 is scheduled without its ept translations being flushed.
>
> I've tested this with a 2-vcpu VM doing a parallel compile and
> ballooning, no problems.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>

[-- Attachment #2: 20090921-ept-careful-flushing.diff --]
[-- Type: text/x-diff, Size: 2569 bytes --]

diff -r 58ef1439ca98 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Tue Sep 15 10:08:12 2009 +0100
+++ b/xen/arch/x86/domain.c	Mon Sep 21 18:52:26 2009 +0100
@@ -1324,6 +1324,13 @@
         p->arch.ctxt_switch_from(p);
     }
 
+    if ( p->domain != n->domain )
+        cpu_set(cpu, n->domain->domain_dirty_cpumask);
+    cpu_set(cpu, n->vcpu_dirty_cpumask);
+
+    /* Run arch.ctxt_switch_to() after setting domain_dirty_cpumask
+     * to make sure arch-dependent things (like ept flushing)
+     * happen at least once without locks */
     if ( !is_idle_vcpu(n) )
     {
         memcpy(stack_regs,
@@ -1332,10 +1339,6 @@
         n->arch.ctxt_switch_to(n);
     }
 
-    if ( p->domain != n->domain )
-        cpu_set(cpu, n->domain->domain_dirty_cpumask);
-    cpu_set(cpu, n->vcpu_dirty_cpumask);
-
     gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
                                   per_cpu(compat_gdt_table, cpu);
     if ( need_full_gdt(n) )
diff -r 58ef1439ca98 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Tue Sep 15 10:08:12 2009 +0100
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Mon Sep 21 18:52:26 2009 +0100
@@ -666,10 +666,17 @@
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
 {
+    struct domain *d = v->domain;
+
     /* HOST_CR4 in VMCS is always mmu_cr4_features. Sync CR4 now. */
     if ( unlikely(read_cr4() != mmu_cr4_features) )
         write_cr4(mmu_cr4_features);
 
+    if ( d->arch.hvm_domain.hap_enabled
+         && cpu_test_and_clear(smp_processor_id(),
+                               d->arch.hvm_domain.vmx.ept_needs_flush) )
+        __invept(1, d->arch.hvm_domain.vmx.ept_control.eptp, 0);
+
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
     vpmu_load(v);
@@ -1219,7 +1226,11 @@
     if ( d->arch.hvm_domain.hap_enabled && d->vcpu && d->vcpu[0] )
     {
         ASSERT(local_irq_is_enabled());
-        on_each_cpu(__ept_sync_domain, d, 1);
+        /* Mark cpus that may need flushing on next schedule */
+        cpus_complement(d->arch.hvm_domain.vmx.ept_needs_flush,
+                        d->domain_dirty_cpumask);
+        /* And flush on actively-running processors */
+        on_selected_cpus(&d->domain_dirty_cpumask, __ept_sync_domain, d, 1);
     }
 }
 
diff -r 58ef1439ca98 xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h	Tue Sep 15 10:08:12 2009 +0100
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h	Mon Sep 21 18:52:26 2009 +0100
@@ -67,6 +67,7 @@
         };
         u64 eptp;
     } ept_control;
+    cpumask_t ept_needs_flush;
 };
 
 struct arch_vmx_struct {

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-21 18:04 ` George Dunlap
@ 2009-09-22  7:07   ` Jan Beulich
  2009-09-22  8:07     ` Keir Fraser
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2009-09-22  7:07 UTC (permalink / raw)
  To: George Dunlap
  Cc: Xiaohui Xin, Xin Li, Tim Deegan, Paul Durrant, Keir Fraser,
	Jun Nakajima, xen-devel

>--- a/xen/arch/x86/hvm/vmx/vmx.c	Tue Sep 15 10:08:12 2009 +0100
>+++ b/xen/arch/x86/hvm/vmx/vmx.c	Mon Sep 21 18:52:26 2009 +0100
>...
>@@ -1219,7 +1226,11 @@
>     if ( d->arch.hvm_domain.hap_enabled && d->vcpu && d->vcpu[0] )
>     {
>         ASSERT(local_irq_is_enabled());
>-        on_each_cpu(__ept_sync_domain, d, 1);
>+        /* Mark cpus that may need flushing on next schedule */
>+        cpus_complement(d->arch.hvm_domain.vmx.ept_needs_flush,
>+                        d->domain_dirty_cpumask);
>+        /* And flush on actively-running processors */
>+        on_selected_cpus(&d->domain_dirty_cpumask, __ept_sync_domain, d, 1);
>     }
> }
> 

Passing a pointer to the global cpu mask looks racy here: What if a CPU
disappears from domain_dirty_cpumask under your feet?
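
One way around that would be to flush on a private snapshot of the mask,
e.g. (sketch only):

    cpumask_t flush_mask = d->domain_dirty_cpumask; /* local copy */
    on_selected_cpus(&flush_mask, __ept_sync_domain, d, 1);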

Also, merely using cpus_complement() here seems inefficient: It should be
possible to accumulate the flush activity, and avoid re-flushing on a CPU
which e.g. got flushed on the second-to-last run through this code
(and was not dirtied afterwards). I.e. I'd think there should be cpus_andnot()
here, and setting of the bits as CPUs get dirtied.
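
Roughly like this (sketch; "ept_dirty" is a hypothetical mask whose bits
would be set as the domain's vcpus get scheduled onto cpus):

    /* only cpus the guest has touched since the last flush need it */
    cpus_andnot(d->arch.hvm_domain.vmx.ept_needs_flush,
                d->arch.hvm_domain.vmx.ept_dirty,
                d->domain_dirty_cpumask);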

Jan

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-22  7:07   ` Jan Beulich
@ 2009-09-22  8:07     ` Keir Fraser
  2009-09-22  8:20       ` Keir Fraser
  2009-09-22  8:23       ` Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Keir Fraser @ 2009-09-22  8:07 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap
  Cc: Xiaohui Xin, Xin Li, Tim Deegan, Paul Durrant, Jun Nakajima, xen-devel

On 22/09/2009 08:07, "Jan Beulich" <JBeulich@novell.com> wrote:

> Passing a pointer to the global cpu mask looks racy here: What if a CPU
> disappears from domain_dirty_cpumask under your feet?

I'm fixing this race before I apply the patch.

> Also, merely using cpus_complement() here seems inefficient: It should be
> possible to accumulate the flush activity, and avoid re-flushing on a CPU
> which e.g. got flushed on the second-to-last run through this code
> (and was not dirtied afterwards). I.e. I'd think there should be cpus_andnot()
> here, and setting of the bits as CPUs get dirtied.

I don't see how that is possible, as domain_dirty_cpumask can have changed
arbitrarily since the previous invocation of ept_sync_domain(), as can the
EPT tables. We have to assume every CPU has potentially stale cached
mappings.

 -- Keir

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-22  8:07     ` Keir Fraser
@ 2009-09-22  8:20       ` Keir Fraser
  2009-09-22  9:02         ` Jan Beulich
  2009-09-22  8:23       ` Jan Beulich
  1 sibling, 1 reply; 9+ messages in thread
From: Keir Fraser @ 2009-09-22  8:20 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap
  Cc: Xiaohui Xin, Xin Li, Tim Deegan, Paul Durrant, Jun Nakajima, xen-devel

On 22/09/2009 09:07, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> On 22/09/2009 08:07, "Jan Beulich" <JBeulich@novell.com> wrote:
> 
>> Passing a pointer to the global cpu mask looks racy here: What if a CPU
>> disappears from domain_dirty_cpumask under your feet?
> 
> I'm fixing this race before I apply the patch.

George, Jan,

Please see what you think of xen-unstable:20244.

 -- Keir

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-22  8:07     ` Keir Fraser
  2009-09-22  8:20       ` Keir Fraser
@ 2009-09-22  8:23       ` Jan Beulich
  2009-09-22  8:33         ` Keir Fraser
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2009-09-22  8:23 UTC (permalink / raw)
  To: George Dunlap, Keir Fraser
  Cc: Xiaohui Xin, Xin Li, Tim Deegan, Paul Durrant, Jun Nakajima, xen-devel

>>> Keir Fraser <keir.fraser@eu.citrix.com> 22.09.09 10:07 >>>
>On 22/09/2009 08:07, "Jan Beulich" <JBeulich@novell.com> wrote:
>> Also, merely using cpus_complement() here seems inefficient: It should be
>> possible to accumulate the flush activity, and avoid re-flushing on a CPU
>> which e.g. got flushed on the second-to-last run through this code
>> (and was not dirtied afterwards). I.e. I'd think there should be cpus_andnot()
>> here, and setting of the bits as CPUs get dirtied.
>
>I don't see how that is possible, as domain_dirty_cpumask can have changed
>arbitrarily since the previous invocation of ept_sync_domain(), as can the
>EPT tables. We have to assume every CPU has potentially stale cached
>mappings.

Why? It ought to be possible to know which CPUs the guest has run on
for any period of time. Any CPU it hasn't run on wouldn't need flushing,
would it?
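
The schedule-in side would then only need something like (sketch, with
the same hypothetical "ept_dirty" mask as above):

    /* vmx_ctxt_switch_to(): this cpu may now cache this domain's EPTP */
    cpu_set(smp_processor_id(), d->arch.hvm_domain.vmx.ept_dirty);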

Jan

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-22  8:23       ` Jan Beulich
@ 2009-09-22  8:33         ` Keir Fraser
  0 siblings, 0 replies; 9+ messages in thread
From: Keir Fraser @ 2009-09-22  8:33 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap
  Cc: Xiaohui Xin, Xin Li, Tim Deegan, Paul Durrant, Jun Nakajima, xen-devel

On 22/09/2009 09:23, "Jan Beulich" <JBeulich@novell.com> wrote:

>> I don't see how that is possible, as domain_dirty_cpumask can have changed
>> arbitrarily since the previous invocation of ept_sync_domain(), as can the
>> EPT tables. We have to assume every CPU has potentially stale cached
>> mappings.
> 
> Why? It ought to be possible to know which CPUs the guest has run on
> for any period of time. Any CPU it hasn't run on wouldn't need flushing,
> would it?

Probably not, as long as VMCLEAR is sufficient to stop a particular EPTP's
entries being cached? We'd need to maintain an extra cpumask to track this
however, and it's not clear it's worth it. This is really just about
optimising large numbers of almost back-to-back invocations of
ept_sync_domain(), where the # flushes of not-currently-active cpus will
probably collapse to 1 with the current logic in the patch. The time between
batches of calls should be large enough that the extra flushes caused by not
tracking VCPU movements between the batches really shouldn't matter very
much.

 -- Keir

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-22  8:20       ` Keir Fraser
@ 2009-09-22  9:02         ` Jan Beulich
  2009-09-22  9:17           ` Keir Fraser
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2009-09-22  9:02 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Xiaohui Xin, Xin Li, George Dunlap, Tim Deegan, Paul Durrant,
	Jun Nakajima, xen-devel

>>> Keir Fraser <keir.fraser@eu.citrix.com> 22.09.09 10:20 >>>
>On 22/09/2009 09:07, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>
>> On 22/09/2009 08:07, "Jan Beulich" <JBeulich@novell.com> wrote:
>> 
>>> Passing a pointer to the global cpu mask looks racy here: What if a CPU
>>> disappears from domain_dirty_cpumask under your feet?
>> 
>> I'm fixing this race before I apply the patch.
>
>George, Jan,
>
>Please see what you think of xen-unstable:20244.

With no assertion in ept_sync_domain() on any locks held, is it guaranteed
that the function cannot be entered twice at the same time for a given
guest? If not, passing a pointer to the new ept_synced member isn't any
better than passing the one to domain_dirty_cpumask.

Jan

* Re: [PATCH] EPT: Flush running cpus, add mask to flush when scheduled in
  2009-09-22  9:02         ` Jan Beulich
@ 2009-09-22  9:17           ` Keir Fraser
  0 siblings, 0 replies; 9+ messages in thread
From: Keir Fraser @ 2009-09-22  9:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Xiaohui Xin, Xin Li, George Dunlap, Tim Deegan, Paul Durrant,
	Jun Nakajima, xen-devel

On 22/09/2009 10:02, "Jan Beulich" <JBeulich@novell.com> wrote:

>> Please see what you think of xen-unstable:20244.
> 
> With no assertion in ept_sync_domain() on any locks held, is it guaranteed
> that the function cannot be entered twice at the same time for a given
> guest? If not, passing a pointer to the new ept_synced member isn't any
> better than passing the one to domain_dirty_cpumask.

I assume George is knowledgeable on that area. If calls to ept_sync_domain()
are not serialised then I think synchronisation around the
ept_needs_flush/ept_synced cpumask is indeed pretty suspect. If there isn't
such a serialising lock, we could add one to ept_sync_domain() quite safely.
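
Something along these lines should do (sketch; the lock is hypothetical):

    /* ept_sync_domain(), serialised (sketch) */
    cpumask_t flush_mask;

    spin_lock(&d->arch.hvm_domain.vmx.ept_sync_lock); /* hypothetical */
    flush_mask = d->domain_dirty_cpumask; /* stable private snapshot */
    cpus_complement(d->arch.hvm_domain.vmx.ept_needs_flush, flush_mask);
    on_selected_cpus(&flush_mask, __ept_sync_domain, d, 1);
    spin_unlock(&d->arch.hvm_domain.vmx.ept_sync_lock);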

 -- Keir
