* [RFC PATCHv1 0/3]: x86/ept: reduce translation invalidation impact
@ 2015-11-06 17:37 David Vrabel
  2015-11-06 17:37 ` [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries David Vrabel
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: David Vrabel @ 2015-11-06 17:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Kevin Tian, Jan Beulich, David Vrabel, Jun Nakajima

This RFC series improves the performance of EPT by reducing the impact
of the translation invalidations (ept_sync_domain()).  Two approaches
are used:

a) Removing unnecessary invalidations after fixing misconfigured
   entries (after a type change).

b) Deferring invalidations until the p2m write lock is released.

Prior to this change, a 16 VCPU guest could not be successfully
migrated on an (admittedly slow) 160 PCPU box because the p2m write
lock was held for such extended periods of time that it starved the
read lock needed (by the toolstack) to map the domain's memory,
triggering the watchdog.

After this change, a 64 VCPU guest could be successfully migrated.

ept_sync_domain() is very expensive because:

a) it uses on_selected_cpus() and the IPI cost can be particularly
   high for a multi-socket machine.

b) on_selected_cpus() is serialized by its own spin lock.

On this particular box, ept_sync_domain() could take ~3-5 ms.

Simply using a fair rw lock was not sufficient to resolve this (though
it was an improvement), as the cost of the ept_sync_domain() calls was
still delaying the read locks enough for the watchdog to trigger (the
toolstack maps a batch of 1024 GFNs at a time, which means trying to
acquire the p2m read lock 1024 times).

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries
  2015-11-06 17:37 [RFC PATCHv1 0/3]: x86/ept: reduce translation invalidation impact David Vrabel
@ 2015-11-06 17:37 ` David Vrabel
  2015-11-06 18:29   ` Andrew Cooper
  2015-11-10 12:22   ` Jan Beulich
  2015-11-06 17:37 ` [PATCHv1 2/3] mm: don't free pages until mm locks are released David Vrabel
  2015-11-06 17:37 ` [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released David Vrabel
  2 siblings, 2 replies; 11+ messages in thread
From: David Vrabel @ 2015-11-06 17:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Kevin Tian, Jan Beulich, David Vrabel, Jun Nakajima

When using EPT, type changes are done with the following steps:

1. Set the entry as invalid (misconfigured) by setting a reserved memory
type.

2. Flush all EPT and combined translations (ept_sync_domain()).

3. Fix up misconfigured entries as required (on EPT_MISCONFIG vmexits or
when explicitly setting an entry).

Since resolve_misconfig() only updates entries that were misconfigured,
there is no need to invalidate any translations: the hardware does not
cache misconfigured translations (Intel SDM vol 3, section 28.3.2).

Remove the unnecessary (and very expensive) ept_sync_domain() calls.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/arch/x86/mm/p2m-ept.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 86440fc..a41d7d2 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -644,7 +644,6 @@ bool_t ept_handle_misconfig(uint64_t gpa)
     spurious = curr->arch.hvm_vmx.ept_spurious_misconfig;
     rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
     curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
-    ept_sync_domain(p2m);
 
     p2m_unlock(p2m);
 
@@ -692,12 +691,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     /* Carry out any eventually pending earlier changes first. */
     ret = resolve_misconfig(p2m, gfn);
     if ( ret < 0 )
-    {
-        ept_sync_domain(p2m);
         return ret;
-    }
-    if ( ret > 0 )
-        needs_sync = sync_on;
 
     ASSERT((target == 2 && hap_has_1gb) ||
            (target == 1 && hap_has_2mb) ||
-- 
2.1.4


* [PATCHv1 2/3] mm: don't free pages until mm locks are released
  2015-11-06 17:37 [RFC PATCHv1 0/3]: x86/ept: reduce translation invalidation impact David Vrabel
  2015-11-06 17:37 ` [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries David Vrabel
@ 2015-11-06 17:37 ` David Vrabel
  2015-11-06 17:37 ` [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released David Vrabel
  2 siblings, 0 replies; 11+ messages in thread
From: David Vrabel @ 2015-11-06 17:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Kevin Tian, Jan Beulich, David Vrabel, Jun Nakajima

If a page is freed without its translations being invalidated, and the
page is subsequently allocated to another domain, a guest with a cached
translation will still be able to access the page.

Currently, translations are invalidated before releasing the page ref,
but while still holding the mm locks.  To allow translations to be
invalidated without holding the mm locks, we need to keep a reference to
the page for a bit longer in some cases.

[ This seems difficult to a) verify as correct, and b) get correct in
the future.  A better suggestion would be welcome.  Perhaps something
like the pg->tlbflush_needed mechanism that already exists for pages
from PV guests? ]

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/arch/x86/mm/p2m.c | 4 ++++
 xen/common/memory.c   | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ed0bbd7..e13672d 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2758,6 +2758,7 @@ int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
     p2m_type_t p2mt, p2mt_prev;
     unsigned long prev_mfn, mfn;
     struct page_info *page;
+    struct page_info *prev_page = NULL;
     int rc;
     struct domain *fdom;
 
@@ -2805,6 +2806,9 @@ int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
     prev_mfn = mfn_x(get_gfn(tdom, gpfn, &p2mt_prev));
     if ( mfn_valid(_mfn(prev_mfn)) )
     {
+        prev_page = mfn_to_page(_mfn(prev_mfn));
+        get_page(prev_page, tdom);
+
         if ( is_xen_heap_mfn(prev_mfn) )
             /* Xen heap frames are simply unhooked from this phys slot */
             guest_physmap_remove_page(tdom, gpfn, prev_mfn, 0);
diff --git a/xen/common/memory.c b/xen/common/memory.c
index a3bffb7..571c754 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -272,8 +272,8 @@ int guest_remove_page(struct domain *d, unsigned long gmfn)
 
     guest_physmap_remove_page(d, gmfn, mfn, 0);
 
-    put_page(page);
     put_gfn(d, gmfn);
+    put_page(page);
 
     return 1;
 }
-- 
2.1.4


* [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released
  2015-11-06 17:37 [RFC PATCHv1 0/3]: x86/ept: reduce translation invalidation impact David Vrabel
  2015-11-06 17:37 ` [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries David Vrabel
  2015-11-06 17:37 ` [PATCHv1 2/3] mm: don't free pages until mm locks are released David Vrabel
@ 2015-11-06 17:37 ` David Vrabel
  2015-11-06 18:39   ` Andrew Cooper
                     ` (2 more replies)
  2 siblings, 3 replies; 11+ messages in thread
From: David Vrabel @ 2015-11-06 17:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Kevin Tian, Jan Beulich, David Vrabel, Jun Nakajima

Holding the p2m lock while calling ept_sync_domain() is very expensive
since it does an on_selected_cpus() call.  IPIs on multi-socket machines
can be very slow and on_selected_cpus() is serialized.

Defer the invalidation until the p2m lock is released.  Since the
processor may cache partial translations, we also need to make sure any
page table pages to be freed are not freed until the invalidation is
complete.  Such pages are temporarily stored on a per-PCPU list.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/arch/x86/mm/mm-locks.h | 23 +++++++++++++++--------
 xen/arch/x86/mm/p2m-ept.c  | 17 ++++++++++++++++-
 xen/arch/x86/mm/p2m.c      | 37 +++++++++++++++++++++++++++++++++++--
 xen/include/asm-x86/p2m.h  |  6 ++++++
 4 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 76c7217..473aaab 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -263,14 +263,21 @@ declare_mm_lock(altp2mlist)
  */
 
 declare_mm_rwlock(altp2m);
-#define p2m_lock(p)                         \
-{                                           \
-    if ( p2m_is_altp2m(p) )                 \
-        mm_write_lock(altp2m, &(p)->lock);  \
-    else                                    \
-        mm_write_lock(p2m, &(p)->lock);     \
-}
-#define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
+#define p2m_lock(p)                             \
+    do {                                        \
+        if ( p2m_is_altp2m(p) )                 \
+            mm_write_lock(altp2m, &(p)->lock);  \
+        else                                    \
+            mm_write_lock(p2m, &(p)->lock);     \
+        (p)->defer_flush++;                     \
+    } while (0)
+#define p2m_unlock(p)                                                   \
+    do {                                                                \
+        bool_t need_flush = --(p)->defer_flush == 0 && (p)->need_flush; \
+        mm_write_unlock(&(p)->lock);                                    \
+        if (need_flush && (p)->flush)                                   \
+            (p)->flush(p);                                              \
+    } while (0)
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
 #define p2m_read_lock(p)      mm_read_lock(p2m, &(p)->lock)
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index a41d7d2..a573c14 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -263,7 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int l
         unmap_domain_page(epte);
     }
     
-    p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn));
+    p2m_free_ptp_defer(p2m, mfn_to_page(ept_entry->mfn));
 }
 
 static bool_t ept_split_super_page(struct p2m_domain *p2m,
@@ -1103,6 +1103,14 @@ void ept_sync_domain(struct p2m_domain *p2m)
     if ( !paging_mode_hap(d) || !d->vcpu || !d->vcpu[0] )
         return;
 
+    if ( p2m->defer_flush )
+    {
+        p2m->need_flush = 1;
+        return;
+    }
+    else
+        p2m->need_flush = 0;
+
     ASSERT(local_irq_is_enabled());
 
     if ( nestedhvm_enabled(d) && !p2m_is_nestedp2m(p2m) )
@@ -1121,6 +1129,12 @@ void ept_sync_domain(struct p2m_domain *p2m)
                      __ept_sync_domain, p2m, 1);
 }
 
+static void ept_flush(struct p2m_domain *p2m)
+{
+    ept_sync_domain(p2m);
+    p2m_free_deferred_ptp(p2m);
+}
+
 static void ept_enable_pml(struct p2m_domain *p2m)
 {
     /* Domain must have been paused */
@@ -1169,6 +1183,7 @@ int ept_p2m_init(struct p2m_domain *p2m)
     p2m->change_entry_type_range = ept_change_entry_type_range;
     p2m->memory_type_changed = ept_memory_type_changed;
     p2m->audit_p2m = NULL;
+    p2m->flush = ept_flush;
 
     /* Set the memory type used when accessing EPT paging structures. */
     ept->ept_mt = EPT_DEFAULT_MT;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index e13672d..2ad1de4 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -504,6 +504,26 @@ void p2m_free_ptp(struct p2m_domain *p2m, struct page_info *pg)
     return;
 }
 
+DEFINE_PER_CPU(struct page_list_head, p2m_deferred_free_pages);
+
+void p2m_free_ptp_defer(struct p2m_domain *p2m, struct page_info *pg)
+{
+    page_list_del(pg, &p2m->pages);
+    page_list_add(pg, &this_cpu(p2m_deferred_free_pages));
+}
+
+void p2m_free_deferred_ptp(struct p2m_domain *p2m)
+{
+    struct page_list_head *list = &this_cpu(p2m_deferred_free_pages);
+    struct page_info *pg, *tmp;
+
+    page_list_for_each_safe(pg, tmp, list)
+    {
+        page_list_del(pg, list);
+        p2m->domain->arch.paging.free_page(p2m->domain, pg);
+    }
+}
+
 /*
  * Allocate a new p2m table for a domain.
  *
@@ -2827,20 +2847,33 @@ int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
                  "gpfn:%lx mfn:%lx fgfn:%lx td:%d fd:%d\n",
                  gpfn, mfn, fgfn, tdom->domain_id, fdom->domain_id);
 
-    put_page(page);
-
     /*
      * This put_gfn for the above get_gfn for prev_mfn.  We must do this
      * after set_foreign_p2m_entry so another cpu doesn't populate the gpfn
      * before us.
      */
     put_gfn(tdom, gpfn);
+    if ( prev_page )
+        put_page(prev_page);
+    put_page(page);
 
 out:
     if ( fdom )
         rcu_unlock_domain(fdom);
     return rc;
 }
+
+int p2m_setup(void)
+{
+    unsigned int cpu;
+
+    for_each_present_cpu(cpu)
+        INIT_PAGE_LIST_HEAD(&per_cpu(p2m_deferred_free_pages, cpu));
+
+    return 0;
+}
+__initcall(p2m_setup);
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index d748557..f572a1b 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -254,6 +254,10 @@ struct p2m_domain {
                                           unsigned long gfn, l1_pgentry_t *p,
                                           l1_pgentry_t new, unsigned int level);
     long               (*audit_p2m)(struct p2m_domain *p2m);
+    void               (*flush)(struct p2m_domain *p2m);
+
+    unsigned int defer_flush;
+    bool_t need_flush;
 
     /* Default P2M access type for each page in the the domain: new pages,
      * swapped in pages, cleared pages, and pages that are ambiguously
@@ -681,6 +685,8 @@ static inline bool_t p2m_mem_access_sanity_check(struct domain *d)
 
 struct page_info *p2m_alloc_ptp(struct p2m_domain *p2m, unsigned long type);
 void p2m_free_ptp(struct p2m_domain *p2m, struct page_info *pg);
+void p2m_free_ptp_defer(struct p2m_domain *p2m, struct page_info *pg);
+void p2m_free_deferred_ptp(struct p2m_domain *p2m);
 
 /* Directly set a p2m entry: only for use by p2m code. Does not need
  * a call to put_gfn afterwards/ */
-- 
2.1.4


* Re: [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries
  2015-11-06 17:37 ` [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries David Vrabel
@ 2015-11-06 18:29   ` Andrew Cooper
  2015-11-10 12:22   ` Jan Beulich
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2015-11-06 18:29 UTC (permalink / raw)
  To: David Vrabel, xen-devel; +Cc: Kevin Tian, Jan Beulich, Jun Nakajima

On 06/11/15 17:37, David Vrabel wrote:
> When using EPT, type changes are done with the following steps:
>
> 1. Set the entry as invalid (misconfigured) by setting a reserved memory
> type.
>
> 2. Flush all EPT and combined translations (ept_sync_domain()).
>
> 3. Fix up misconfigured entries as required (on EPT_MISCONFIG vmexits or
> when explicitly setting an entry).
>
> Since resolve_misconfig() only updates entries that were misconfigured,
> there is no need to invalidate any translations: the hardware does not
> cache misconfigured translations (Intel SDM vol 3, section 28.3.2).
>
> Remove the unnecessary (and very expensive) ept_sync_domain() calls.
>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released
  2015-11-06 17:37 ` [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released David Vrabel
@ 2015-11-06 18:39   ` Andrew Cooper
  2015-11-09 14:13   ` Jan Beulich
  2015-11-10 13:35   ` Tim Deegan
  2 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2015-11-06 18:39 UTC (permalink / raw)
  To: David Vrabel, xen-devel; +Cc: Kevin Tian, Jan Beulich, Jun Nakajima

On 06/11/15 17:37, David Vrabel wrote:
>  out:
>      if ( fdom )
>          rcu_unlock_domain(fdom);
>      return rc;
>  }
> +
> +int p2m_setup(void)
> +{
> +    unsigned int cpu;
> +
> +    for_each_present_cpu(cpu)
> +        INIT_PAGE_LIST_HEAD(&per_cpu(p2m_deferred_free_pages, cpu));
> +
> +    return 0;
> +}
> +__initcall(p2m_setup);
> +

This is going to create problems for cpu hotplug, and some power
governors.  The INIT_PAGE_LIST_HEAD() needs to be in a CPU_PREPARE
notifier.

It might also be wise to have the DYING side confirm that the list is
empty.

~Andrew


* Re: [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released
  2015-11-06 17:37 ` [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released David Vrabel
  2015-11-06 18:39   ` Andrew Cooper
@ 2015-11-09 14:13   ` Jan Beulich
  2015-11-10 13:35   ` Tim Deegan
  2 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2015-11-09 14:13 UTC (permalink / raw)
  To: David Vrabel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 06.11.15 at 18:37, <david.vrabel@citrix.com> wrote:
> --- a/xen/arch/x86/mm/mm-locks.h
> +++ b/xen/arch/x86/mm/mm-locks.h
> @@ -263,14 +263,21 @@ declare_mm_lock(altp2mlist)
>   */
>  
>  declare_mm_rwlock(altp2m);
> -#define p2m_lock(p)                         \
> -{                                           \
> -    if ( p2m_is_altp2m(p) )                 \
> -        mm_write_lock(altp2m, &(p)->lock);  \
> -    else                                    \
> -        mm_write_lock(p2m, &(p)->lock);     \
> -}
> -#define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
> +#define p2m_lock(p)                             \
> +    do {                                        \
> +        if ( p2m_is_altp2m(p) )                 \
> +            mm_write_lock(altp2m, &(p)->lock);  \
> +        else                                    \
> +            mm_write_lock(p2m, &(p)->lock);     \
> +        (p)->defer_flush++;                     \
> +    } while (0)
> +#define p2m_unlock(p)                                                   \
> +    do {                                                                \
> +        bool_t need_flush = --(p)->defer_flush == 0 && (p)->need_flush; \
> +        mm_write_unlock(&(p)->lock);                                    \
> +        if (need_flush && (p)->flush)                                   \

Coding style. Also couldn't you imply (or ASSERT()) ->flush to be
non-NULL when need_flush is true?

> +            (p)->flush(p);                                              \

The p2m lock guards EPT's synced_mask afaict, and hence dropping
the lock before calling ->flush() breaks things.

> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -504,6 +504,26 @@ void p2m_free_ptp(struct p2m_domain *p2m, struct page_info *pg)
>      return;
>  }
>  
> +DEFINE_PER_CPU(struct page_list_head, p2m_deferred_free_pages);

static

Jan


* Re: [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries
  2015-11-06 17:37 ` [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries David Vrabel
  2015-11-06 18:29   ` Andrew Cooper
@ 2015-11-10 12:22   ` Jan Beulich
  2015-11-12 16:18     ` David Vrabel
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2015-11-10 12:22 UTC (permalink / raw)
  To: David Vrabel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 06.11.15 at 18:37, <david.vrabel@citrix.com> wrote:
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -644,7 +644,6 @@ bool_t ept_handle_misconfig(uint64_t gpa)
>      spurious = curr->arch.hvm_vmx.ept_spurious_misconfig;
>      rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
>      curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
> -    ept_sync_domain(p2m);
>  
>      p2m_unlock(p2m);
>  
> @@ -692,12 +691,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>      /* Carry out any eventually pending earlier changes first. */
>      ret = resolve_misconfig(p2m, gfn);
>      if ( ret < 0 )
> -    {
> -        ept_sync_domain(p2m);

Is avoiding this sync really a win? It shouldn't be needed according
to your analysis, I agree, but if it doesn't do any harm I'd prefer it
to be kept (and the deletion above to be converted to a similar,
conditional sync too). After all there also shouldn't be any error
here, yet if there was one, wanting to be on the safe side calls for
doing a sync imo.

>          return ret;
> -    }
> -    if ( ret > 0 )
> -        needs_sync = sync_on;

This deletes the only use of sync_on, i.e. the patch should change
needs_sync back to bool_t at once.

Jan


* Re: [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released
  2015-11-06 17:37 ` [PATCHv1 3/3] x86/ept: defer the invalidation until the p2m lock is released David Vrabel
  2015-11-06 18:39   ` Andrew Cooper
  2015-11-09 14:13   ` Jan Beulich
@ 2015-11-10 13:35   ` Tim Deegan
  2 siblings, 0 replies; 11+ messages in thread
From: Tim Deegan @ 2015-11-10 13:35 UTC (permalink / raw)
  To: David Vrabel
  Cc: xen-devel, Kevin Tian, Jun Nakajima, Jan Beulich, Andrew Cooper

At 17:37 +0000 on 06 Nov (1446831437), David Vrabel wrote:
> Holding the p2m lock while calling ept_sync_domain() is very expensive
> since it does an on_selected_cpus() call.  IPIs on multi-socket machines
> can be very slow and on_selected_cpus() is serialized.
> 
> Defer the invalidation until the p2m lock is released.  Since the
> processor may cache partial translations, we also need to make sure any
> page table pages to be freed are not freed until the invalidation is
> complete.  Such pages are temporarily stored on a per-PCPU list.

That seems like it could get messy if a PCPU might have multiple p2ms
locked at once (e.g. nested HAP, altp2m).  I think a per-p2m list
would be more obviously correct, and not too expensive.

Cheers,

Tim.


* Re: [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries
  2015-11-10 12:22   ` Jan Beulich
@ 2015-11-12 16:18     ` David Vrabel
  2015-11-12 16:30       ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: David Vrabel @ 2015-11-12 16:18 UTC (permalink / raw)
  To: Jan Beulich, David Vrabel
  Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel



On 10/11/2015 12:22, Jan Beulich wrote:
>>>> On 06.11.15 at 18:37, <david.vrabel@citrix.com> wrote:
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -644,7 +644,6 @@ bool_t ept_handle_misconfig(uint64_t gpa)
>>      spurious = curr->arch.hvm_vmx.ept_spurious_misconfig;
>>      rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
>>      curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
>> -    ept_sync_domain(p2m);
>>  
>>      p2m_unlock(p2m);
>>  
>> @@ -692,12 +691,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>      /* Carry out any eventually pending earlier changes first. */
>>      ret = resolve_misconfig(p2m, gfn);
>>      if ( ret < 0 )
>> -    {
>> -        ept_sync_domain(p2m);
> 
> Is avoiding this sync really a win? It shouldn't be needed according
> to your analysis, I agree, but if it doesn't do any harm I'd prefer it
> to be kept (and the deletion above to be converted to a similar,
> conditional sync too). After all there also shouldn't be any error
> here, yet if there was one, wanting to be on the safe side calls for
> doing a sync imo.

Unnecessary calls to ept_sync_domain() are confusing.

David


* Re: [PATCHv1 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries
  2015-11-12 16:18     ` David Vrabel
@ 2015-11-12 16:30       ` Jan Beulich
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2015-11-12 16:30 UTC (permalink / raw)
  To: David Vrabel, David Vrabel
  Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 12.11.15 at 17:18, <dvrabel@cantab.net> wrote:

> 
> On 10/11/2015 12:22, Jan Beulich wrote:
>>>>> On 06.11.15 at 18:37, <david.vrabel@citrix.com> wrote:
>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -644,7 +644,6 @@ bool_t ept_handle_misconfig(uint64_t gpa)
>>>      spurious = curr->arch.hvm_vmx.ept_spurious_misconfig;
>>>      rc = resolve_misconfig(p2m, PFN_DOWN(gpa));
>>>      curr->arch.hvm_vmx.ept_spurious_misconfig = 0;
>>> -    ept_sync_domain(p2m);
>>>  
>>>      p2m_unlock(p2m);
>>>  
>>> @@ -692,12 +691,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>>      /* Carry out any eventually pending earlier changes first. */
>>>      ret = resolve_misconfig(p2m, gfn);
>>>      if ( ret < 0 )
>>> -    {
>>> -        ept_sync_domain(p2m);
>> 
>> Is avoiding this sync really a win? It shouldn't be needed according
>> to your analysis, I agree, but if it doesn't do any harm I'd prefer it
>> to be kept (and the deletion above to be converted to a similar,
>> conditional sync too). After all there also shouldn't be any error
>> here, yet if there was one, wanting to be on the safe side calls for
>> doing a sync imo.
> 
> Unnecessary calls to ept_sync_domain() are confusing.

Again - if there was an error, how do you know it's unnecessary?
But anyway - I'll let the VM maintainers decide.

Jan


