All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0 of 3] Miscellaneous populate-on-demand bugs
@ 2011-01-17 14:36 George Dunlap
  2011-01-17 14:36 ` [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages George Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: George Dunlap @ 2011-01-17 14:36 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap

This patch series includes a series of bugs related to p2m, ept, and
PoD code which were found as part of our XenServer product testing.

Each of these fixes actual bugs, and the 3.4-based version of the patch
has been tested thoroughly.  (There may be bugs in porting the patches,
but most of them are simple enough as to make it unlikely.)

Each patch is conceptually independent, so they can each be considered
for inclusion individually.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
  2011-01-17 14:36 [PATCH 0 of 3] Miscellaneous populate-on-demand bugs George Dunlap
@ 2011-01-17 14:36 ` George Dunlap
  2011-01-19 12:22   ` Tim Deegan
  2011-01-17 14:36 ` [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted George Dunlap
  2011-01-17 14:36 ` [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging George Dunlap
  2 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2011-01-17 14:36 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap

# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1295274250 0
# Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944
# Parent  75b6287626ee0b852d725543568001e99b13be5b
p2m: Allow l2 pages to be replaced by superpages

Allow second-level p2m pages to be replaced with superpage entries,
freeing the p2m table page properly.  This allows, for example, a
sequence of 512 singleton zero pages to be replaced with a superpage
populate-on-demand entry.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c
--- a/xen/arch/x86/mm/hap/hap.c	Fri Jan 14 16:38:51 2011 +0000
+++ b/xen/arch/x86/mm/hap/hap.c	Mon Jan 17 14:24:10 2011 +0000
@@ -333,9 +333,11 @@
 
     ASSERT(page_get_owner(pg) == d);
     /* Should have just the one ref we gave it in alloc_p2m_page() */
-    if ( (pg->count_info & PGC_count_mask) != 1 )
-        HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n",
-                  pg->count_info, pg->u.inuse.type_info);
+    if ( (pg->count_info & PGC_count_mask) != 1 ) {
+        HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n",
+                     pg, pg->count_info, pg->u.inuse.type_info);
+        WARN();
+    }
     pg->count_info &= ~PGC_count_mask;
     /* Free should not decrement domain's total allocation, since
      * these pages were allocated without an owner. */
diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c
--- a/xen/arch/x86/mm/hap/p2m-ept.c	Fri Jan 14 16:38:51 2011 +0000
+++ b/xen/arch/x86/mm/hap/p2m-ept.c	Mon Jan 17 14:24:10 2011 +0000
@@ -317,6 +317,7 @@
     int vtd_pte_present = 0;
     int needs_sync = 1;
     struct domain *d = p2m->domain;
+    struct page_info *intermediate_table = NULL;
 
     /*
      * the caller must make sure:
@@ -369,6 +370,12 @@
         /* No need to flush if the old entry wasn't valid */
         if ( !is_epte_present(ept_entry) )
             needs_sync = 0;
+        else if ( order == 9 && ! ept_entry->sp )
+        {
+            /* We're replacing a non-SP page with a superpage.  Make sure to
+             * handle freeing the table properly. */
+            intermediate_table = mfn_to_page(ept_entry->mfn);
+        }
 
         if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) ||
              (p2mt == p2m_ram_paging_in_start) )
@@ -487,6 +494,17 @@
         }
     }
 
+    if ( intermediate_table )
+    {
+        /* Release the old intermediate table.  This has to be the
+           last thing we do, after the ept_sync_domain() and removal
+           from the iommu tables, so as to avoid a potential
+           use-after-free. */
+
+        page_list_del(intermediate_table, &d->arch.p2m->pages);
+        d->arch.paging.free_page(d, intermediate_table);
+    }
+
     return rv;
 }
 
diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c	Fri Jan 14 16:38:51 2011 +0000
+++ b/xen/arch/x86/mm/p2m.c	Mon Jan 17 14:24:10 2011 +0000
@@ -1361,9 +1361,15 @@
         if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) &&
              !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
         {
-            P2M_ERROR("configure P2M table 4KB L2 entry with large page\n");
-            domain_crash(p2m->domain);
-            goto out;
+            struct page_info *pg;
+
+            /* We're replacing a non-SP page with a superpage.  Make sure to
+             * handle freeing the table properly. */
+            pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry)));
+            paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn,
+                                   l1e_empty(), 2);
+            page_list_del(pg, &p2m->pages);
+            p2m->domain->arch.paging.free_page(p2m->domain, pg);
         }
         
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted
  2011-01-17 14:36 [PATCH 0 of 3] Miscellaneous populate-on-demand bugs George Dunlap
  2011-01-17 14:36 ` [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages George Dunlap
@ 2011-01-17 14:36 ` George Dunlap
  2011-01-19 12:41   ` Tim Deegan
  2011-01-17 14:36 ` [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging George Dunlap
  2 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2011-01-17 14:36 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap

# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1295274253 0
# Node ID 55e123a24da84f3b83caa7a7332699df73aaa90d
# Parent  366d675630fd6ecbd6228426b3f7723d8a9dd944
PoD: Allow pod_set_cache_target hypercall to be preempted

For very large VMs, setting the cache target can take long enough that
dom0 complains of soft lockups.  Allow the hypercall to be preempted.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Mon Jan 17 14:24:10 2011 +0000
+++ b/xen/arch/x86/domain.c	Mon Jan 17 14:24:13 2011 +0000
@@ -1653,8 +1653,8 @@
     unsigned long nval = 0;
     va_list args;
 
-    BUG_ON(*id > 5);
-    BUG_ON(mask & (1U << *id));
+    BUG_ON(id && *id > 5);
+    BUG_ON(id && (mask & (1U << *id)));
 
     va_start(args, mask);
 
diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/mm.c
--- a/xen/arch/x86/mm.c	Mon Jan 17 14:24:10 2011 +0000
+++ b/xen/arch/x86/mm.c	Mon Jan 17 14:24:13 2011 +0000
@@ -4799,15 +4799,23 @@
             rc = p2m_pod_set_mem_target(d, target.target_pages);
         }
 
-        p2m = p2m_get_hostp2m(d);
-        target.tot_pages       = d->tot_pages;
-        target.pod_cache_pages = p2m->pod.count;
-        target.pod_entries     = p2m->pod.entry_count;
-
-        if ( copy_to_guest(arg, &target, 1) )
+        if ( rc == -EAGAIN )
         {
-            rc= -EFAULT;
-            goto pod_target_out_unlock;
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_memory_op, "lh", op, arg);
+        }
+        else if ( rc >= 0 )
+        {
+            p2m = p2m_get_hostp2m(d);
+            target.tot_pages       = d->tot_pages;
+            target.pod_cache_pages = p2m->pod.count;
+            target.pod_entries     = p2m->pod.entry_count;
+
+            if ( copy_to_guest(arg, &target, 1) )
+            {
+                rc= -EFAULT;
+                goto pod_target_out_unlock;
+            }
         }
         
     pod_target_out_unlock:
diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c	Mon Jan 17 14:24:10 2011 +0000
+++ b/xen/arch/x86/mm/p2m.c	Mon Jan 17 14:24:13 2011 +0000
@@ -435,7 +435,7 @@
 
 /* Set the size of the cache, allocating or freeing as necessary. */
 static int
-p2m_pod_set_cache_target(struct p2m_domain *p2m, unsigned long pod_target)
+p2m_pod_set_cache_target(struct p2m_domain *p2m, unsigned long pod_target, int preemptible)
 {
     struct domain *d = p2m->domain;
     int ret = 0;
@@ -468,6 +468,12 @@
         }
 
         p2m_pod_cache_add(p2m, page, order);
+
+        if ( hypercall_preempt_check() && preemptible )
+        {
+            ret = -EAGAIN;
+            goto out;
+        }
     }
 
     /* Decreasing the target */
@@ -512,6 +518,12 @@
                 put_page(page+i);
 
             put_page(page+i);
+
+            if ( hypercall_preempt_check() && preemptible )
+            {
+                ret = -EAGAIN;
+                goto out;
+            }
         }
     }
 
@@ -589,7 +601,7 @@
 
     ASSERT( pod_target >= p2m->pod.count );
 
-    ret = p2m_pod_set_cache_target(p2m, pod_target);
+    ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
 
 out:
     p2m_unlock(p2m);
@@ -753,7 +765,7 @@
     /* If we've reduced our "liabilities" beyond our "assets", free some */
     if ( p2m->pod.entry_count < p2m->pod.count )
     {
-        p2m_pod_set_cache_target(p2m, p2m->pod.entry_count);
+        p2m_pod_set_cache_target(p2m, p2m->pod.entry_count, 0/*can't preempt*/);
     }
 
 out_unlock:
diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/x86_64/compat/mm.c
--- a/xen/arch/x86/x86_64/compat/mm.c	Mon Jan 17 14:24:10 2011 +0000
+++ b/xen/arch/x86/x86_64/compat/mm.c	Mon Jan 17 14:24:13 2011 +0000
@@ -127,6 +127,9 @@
         if ( rc < 0 )
             break;
 
+        if ( rc == __HYPERVISOR_memory_op )
+            hypercall_xlat_continuation(NULL, 0x2, nat, arg);
+
         XLAT_pod_target(&cmp, nat);
 
         if ( copy_to_guest(arg, &cmp, 1) )

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging
  2011-01-17 14:36 [PATCH 0 of 3] Miscellaneous populate-on-demand bugs George Dunlap
  2011-01-17 14:36 ` [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages George Dunlap
  2011-01-17 14:36 ` [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted George Dunlap
@ 2011-01-17 14:36 ` George Dunlap
  2011-01-19 12:41   ` Tim Deegan
  2 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2011-01-17 14:36 UTC (permalink / raw)
  To: xen-devel; +Cc: george.dunlap

# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1295274541 0
# Node ID dd81c7acd9b323e9d2e4e3a41b81783fc7df5342
# Parent  55e123a24da84f3b83caa7a7332699df73aaa90d
PoD,hap: Fix logdirty mode when using hardware assisted paging

When writing a writable p2m entry for a pfn, we need to mark the pfn
dirty to avoid corruption when doing live migration.

Marking the page dirty exposes another issue, where there are
excessive sweeps for zero pages if there's a mismatch between PoD
entries and cache entries.  Only sweep for zero pages if we actually
need more memory.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 55e123a24da8 -r dd81c7acd9b3 xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c	Mon Jan 17 14:24:13 2011 +0000
+++ b/xen/arch/x86/mm/p2m.c	Mon Jan 17 14:29:01 2011 +0000
@@ -1142,14 +1142,22 @@
         return 0;
     }
 
-    /* If we're low, start a sweep */
-    if ( order == 9 && page_list_empty(&p2m->pod.super) )
-        p2m_pod_emergency_sweep_super(p2m);
-
-    if ( page_list_empty(&p2m->pod.single) &&
-         ( ( order == 0 )
-           || (order == 9 && page_list_empty(&p2m->pod.super) ) ) )
-        p2m_pod_emergency_sweep(p2m);
+    /* Once we've ballooned down enough that we can fill the remaining
+     * PoD entries from the cache, don't sweep even if the particular
+     * list we want to use is empty: that can lead to thrashing zero pages 
+     * through the cache for no good reason.  */
+    if ( p2m->pod.entry_count > p2m->pod.count )
+    {
+
+        /* If we're low, start a sweep */
+        if ( order == 9 && page_list_empty(&p2m->pod.super) )
+            p2m_pod_emergency_sweep_super(p2m);
+
+        if ( page_list_empty(&p2m->pod.single) &&
+             ( ( order == 0 )
+               || (order == 9 && page_list_empty(&p2m->pod.super) ) ) )
+            p2m_pod_emergency_sweep(p2m);
+    }
 
     /* Keep track of the highest gfn demand-populated by a guest fault */
     if ( q == p2m_guest && gfn > p2m->pod.max_guest )
@@ -1176,7 +1184,10 @@
     set_p2m_entry(p2m, gfn_aligned, mfn, order, p2m_ram_rw, p2m->default_access);
 
     for( i = 0; i < (1UL << order); i++ )
+    {
         set_gpfn_from_mfn(mfn_x(mfn) + i, gfn_aligned + i);
+        paging_mark_dirty(d, mfn_x(mfn) + i);
+    }
     
     p2m->pod.entry_count -= (1 << order); /* Lock: p2m */
     BUG_ON(p2m->pod.entry_count < 0);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
  2011-01-17 14:36 ` [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages George Dunlap
@ 2011-01-19 12:22   ` Tim Deegan
  2011-01-19 14:49     ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: Tim Deegan @ 2011-01-19 12:22 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel

Hi, 

At 14:36 +0000 on 17 Jan (1295274993), George Dunlap wrote:
> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1295274250 0
> # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944
> # Parent  75b6287626ee0b852d725543568001e99b13be5b
> p2m: Allow l2 pages to be replaced by superpages
> 
> Allow second-level p2m pages to be replaced with superpage entries,
> freeing the p2m table page properly.  This allows, for example, a
> sequence of 512 singleton zero pages to be replaced with a superpage
> populate-on-demand entry.

This problem became more general under your feet - xen 4.1 supports 1GiB
superpages as well, so the same thing needs to be fixed for them.  
(I understand that PoD only uses 2MiB superpages but to half-fix it
invites future bugs.)

Also, although in the EPT code you're careful to do all the flushing &c
before freeing the old page, in the vanilla p2m code you free the page
even before writing the new l2e!

Cheers,

Tim.

> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
> 
> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c
> --- a/xen/arch/x86/mm/hap/hap.c	Fri Jan 14 16:38:51 2011 +0000
> +++ b/xen/arch/x86/mm/hap/hap.c	Mon Jan 17 14:24:10 2011 +0000
> @@ -333,9 +333,11 @@
>  
>      ASSERT(page_get_owner(pg) == d);
>      /* Should have just the one ref we gave it in alloc_p2m_page() */
> -    if ( (pg->count_info & PGC_count_mask) != 1 )
> -        HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n",
> -                  pg->count_info, pg->u.inuse.type_info);
> +    if ( (pg->count_info & PGC_count_mask) != 1 ) {
> +        HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n",
> +                     pg, pg->count_info, pg->u.inuse.type_info);
> +        WARN();
> +    }
>      pg->count_info &= ~PGC_count_mask;
>      /* Free should not decrement domain's total allocation, since
>       * these pages were allocated without an owner. */
> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c
> --- a/xen/arch/x86/mm/hap/p2m-ept.c	Fri Jan 14 16:38:51 2011 +0000
> +++ b/xen/arch/x86/mm/hap/p2m-ept.c	Mon Jan 17 14:24:10 2011 +0000
> @@ -317,6 +317,7 @@
>      int vtd_pte_present = 0;
>      int needs_sync = 1;
>      struct domain *d = p2m->domain;
> +    struct page_info *intermediate_table = NULL;
>  
>      /*
>       * the caller must make sure:
> @@ -369,6 +370,12 @@
>          /* No need to flush if the old entry wasn't valid */
>          if ( !is_epte_present(ept_entry) )
>              needs_sync = 0;
> +        else if ( order == 9 && ! ept_entry->sp )
> +        {
> +            /* We're replacing a non-SP page with a superpage.  Make sure to
> +             * handle freeing the table properly. */
> +            intermediate_table = mfn_to_page(ept_entry->mfn);
> +        }
>  
>          if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) ||
>               (p2mt == p2m_ram_paging_in_start) )
> @@ -487,6 +494,17 @@
>          }
>      }
>  
> +    if ( intermediate_table )
> +    {
> +        /* Release the old intermediate table.  This has to be the
> +           last thing we do, after the ept_sync_domain() and removal
> +           from the iommu tables, so as to avoid a potential
> +           use-after-free. */
> +
> +        page_list_del(intermediate_table, &d->arch.p2m->pages);
> +        d->arch.paging.free_page(d, intermediate_table);
> +    }
> +
>      return rv;
>  }
>  
> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c	Fri Jan 14 16:38:51 2011 +0000
> +++ b/xen/arch/x86/mm/p2m.c	Mon Jan 17 14:24:10 2011 +0000
> @@ -1361,9 +1361,15 @@
>          if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) &&
>               !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
>          {
> -            P2M_ERROR("configure P2M table 4KB L2 entry with large page\n");
> -            domain_crash(p2m->domain);
> -            goto out;
> +            struct page_info *pg;
> +
> +            /* We're replacing a non-SP page with a superpage.  Make sure to
> +             * handle freeing the table properly. */
> +            pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry)));
> +            paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn,
> +                                   l1e_empty(), 2);
> +            page_list_del(pg, &p2m->pages);
> +            p2m->domain->arch.paging.free_page(p2m->domain, pg);
>          }
>          
>          ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted
  2011-01-17 14:36 ` [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted George Dunlap
@ 2011-01-19 12:41   ` Tim Deegan
  0 siblings, 0 replies; 9+ messages in thread
From: Tim Deegan @ 2011-01-19 12:41 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel

At 14:36 +0000 on 17 Jan (1295274994), George Dunlap wrote:
> PoD: Allow pod_set_cache_target hypercall to be preempted
> 
> For very large VMs, setting the cache target can take long enough that
> dom0 complains of soft lockups.  Allow the hypercall to be preempted.

Applied, thanks.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging
  2011-01-17 14:36 ` [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging George Dunlap
@ 2011-01-19 12:41   ` Tim Deegan
  0 siblings, 0 replies; 9+ messages in thread
From: Tim Deegan @ 2011-01-19 12:41 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel

At 14:36 +0000 on 17 Jan (1295274995), George Dunlap wrote:
> PoD,hap: Fix logdirty mode when using hardware assisted paging
> 
> When writing a writable p2m entry for a pfn, we need to mark the pfn
> dirty to avoid corruption when doing live migration.
> 
> Marking the page dirty exposes another issue, where there are
> excessive sweeps for zero pages if there's a mismatch between PoD
> entries and cache entries.  Only sweep for zero pages if we actually
> need more memory.

Applied, thanks.

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
  2011-01-19 12:22   ` Tim Deegan
@ 2011-01-19 14:49     ` George Dunlap
  2011-01-19 14:53       ` Tim Deegan
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2011-01-19 14:49 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

Just to be clear, should I read your response this as something like below?

"Please rework this patch to do the following:
* Generalize for 1GB pages
* Make the p2m case as careful as the ept case"

 -George

On Wed, Jan 19, 2011 at 12:22 PM, Tim Deegan <Tim.Deegan@citrix.com> wrote:
> Hi,
>
> At 14:36 +0000 on 17 Jan (1295274993), George Dunlap wrote:
>> # HG changeset patch
>> # User George Dunlap <george.dunlap@eu.citrix.com>
>> # Date 1295274250 0
>> # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944
>> # Parent  75b6287626ee0b852d725543568001e99b13be5b
>> p2m: Allow l2 pages to be replaced by superpages
>>
>> Allow second-level p2m pages to be replaced with superpage entries,
>> freeing the p2m table page properly.  This allows, for example, a
>> sequence of 512 singleton zero pages to be replaced with a superpage
>> populate-on-demand entry.
>
> This problem became more general under your feet - xen 4.1 supports 1GiB
> superpages as well, so the same thing needs to be fixed for them.
> (I understand that PoD only uses 2MiB superpages but to half-fix it
> invites future bugs.)
>
> Also, although in the EPT code you're careful to do all the flushing &c
> before freeing the old page, in the vanilla p2m code you free the page
> even before writing the new l2e!
>
> Cheers,
>
> Tim.
>
>> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>>
>> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c
>> --- a/xen/arch/x86/mm/hap/hap.c       Fri Jan 14 16:38:51 2011 +0000
>> +++ b/xen/arch/x86/mm/hap/hap.c       Mon Jan 17 14:24:10 2011 +0000
>> @@ -333,9 +333,11 @@
>>
>>      ASSERT(page_get_owner(pg) == d);
>>      /* Should have just the one ref we gave it in alloc_p2m_page() */
>> -    if ( (pg->count_info & PGC_count_mask) != 1 )
>> -        HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n",
>> -                  pg->count_info, pg->u.inuse.type_info);
>> +    if ( (pg->count_info & PGC_count_mask) != 1 ) {
>> +        HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n",
>> +                     pg, pg->count_info, pg->u.inuse.type_info);
>> +        WARN();
>> +    }
>>      pg->count_info &= ~PGC_count_mask;
>>      /* Free should not decrement domain's total allocation, since
>>       * these pages were allocated without an owner. */
>> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c
>> --- a/xen/arch/x86/mm/hap/p2m-ept.c   Fri Jan 14 16:38:51 2011 +0000
>> +++ b/xen/arch/x86/mm/hap/p2m-ept.c   Mon Jan 17 14:24:10 2011 +0000
>> @@ -317,6 +317,7 @@
>>      int vtd_pte_present = 0;
>>      int needs_sync = 1;
>>      struct domain *d = p2m->domain;
>> +    struct page_info *intermediate_table = NULL;
>>
>>      /*
>>       * the caller must make sure:
>> @@ -369,6 +370,12 @@
>>          /* No need to flush if the old entry wasn't valid */
>>          if ( !is_epte_present(ept_entry) )
>>              needs_sync = 0;
>> +        else if ( order == 9 && ! ept_entry->sp )
>> +        {
>> +            /* We're replacing a non-SP page with a superpage.  Make sure to
>> +             * handle freeing the table properly. */
>> +            intermediate_table = mfn_to_page(ept_entry->mfn);
>> +        }
>>
>>          if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) ||
>>               (p2mt == p2m_ram_paging_in_start) )
>> @@ -487,6 +494,17 @@
>>          }
>>      }
>>
>> +    if ( intermediate_table )
>> +    {
>> +        /* Release the old intermediate table.  This has to be the
>> +           last thing we do, after the ept_sync_domain() and removal
>> +           from the iommu tables, so as to avoid a potential
>> +           use-after-free. */
>> +
>> +        page_list_del(intermediate_table, &d->arch.p2m->pages);
>> +        d->arch.paging.free_page(d, intermediate_table);
>> +    }
>> +
>>      return rv;
>>  }
>>
>> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c
>> --- a/xen/arch/x86/mm/p2m.c   Fri Jan 14 16:38:51 2011 +0000
>> +++ b/xen/arch/x86/mm/p2m.c   Mon Jan 17 14:24:10 2011 +0000
>> @@ -1361,9 +1361,15 @@
>>          if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) &&
>>               !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
>>          {
>> -            P2M_ERROR("configure P2M table 4KB L2 entry with large page\n");
>> -            domain_crash(p2m->domain);
>> -            goto out;
>> +            struct page_info *pg;
>> +
>> +            /* We're replacing a non-SP page with a superpage.  Make sure to
>> +             * handle freeing the table properly. */
>> +            pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry)));
>> +            paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn,
>> +                                   l1e_empty(), 2);
>> +            page_list_del(pg, &p2m->pages);
>> +            p2m->domain->arch.paging.free_page(p2m->domain, pg);
>>          }
>>
>>          ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
> --
> Tim Deegan <Tim.Deegan@citrix.com>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
  2011-01-19 14:49     ` George Dunlap
@ 2011-01-19 14:53       ` Tim Deegan
  0 siblings, 0 replies; 9+ messages in thread
From: Tim Deegan @ 2011-01-19 14:53 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel

At 14:49 +0000 on 19 Jan (1295448547), George Dunlap wrote:
> Just to be clear, should I read your response this as something like below?
> 
> "Please rework this patch to do the following:
> * Generalize for 1GB pages
> * Make the p2m case as careful as the ept case"

Yes, please. 

Tim.

> On Wed, Jan 19, 2011 at 12:22 PM, Tim Deegan <Tim.Deegan@citrix.com> wrote:
> > Hi,
> >
> > At 14:36 +0000 on 17 Jan (1295274993), George Dunlap wrote:
> >> # HG changeset patch
> >> # User George Dunlap <george.dunlap@eu.citrix.com>
> >> # Date 1295274250 0
> >> # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944
> >> # Parent  75b6287626ee0b852d725543568001e99b13be5b
> >> p2m: Allow l2 pages to be replaced by superpages
> >>
> >> Allow second-level p2m pages to be replaced with superpage entries,
> >> freeing the p2m table page properly.  This allows, for example, a
> >> sequence of 512 singleton zero pages to be replaced with a superpage
> >> populate-on-demand entry.
> >
> > This problem became more general under your feet - xen 4.1 supports 1GiB
> > superpages as well, so the same thing needs to be fixed for them.
> > (I understand that PoD only uses 2MiB superpages but to half-fix it
> > invites future bugs.)
> >
> > Also, although in the EPT code you're careful to do all the flushing &c
> > before freeing the old page, in the vanilla p2m code you free the page
> > even before writing the new l2e!
> >
> > Cheers,
> >
> > Tim.
> >
> >> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
> >>
> >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c
> >> --- a/xen/arch/x86/mm/hap/hap.c       Fri Jan 14 16:38:51 2011 +0000
> >> +++ b/xen/arch/x86/mm/hap/hap.c       Mon Jan 17 14:24:10 2011 +0000
> >> @@ -333,9 +333,11 @@
> >>
> >>      ASSERT(page_get_owner(pg) == d);
> >>      /* Should have just the one ref we gave it in alloc_p2m_page() */
> >> -    if ( (pg->count_info & PGC_count_mask) != 1 )
> >> -        HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n",
> >> -                  pg->count_info, pg->u.inuse.type_info);
> >> +    if ( (pg->count_info & PGC_count_mask) != 1 ) {
> >> +        HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n",
> >> +                     pg, pg->count_info, pg->u.inuse.type_info);
> >> +        WARN();
> >> +    }
> >>      pg->count_info &= ~PGC_count_mask;
> >>      /* Free should not decrement domain's total allocation, since
> >>       * these pages were allocated without an owner. */
> >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c
> >> --- a/xen/arch/x86/mm/hap/p2m-ept.c   Fri Jan 14 16:38:51 2011 +0000
> >> +++ b/xen/arch/x86/mm/hap/p2m-ept.c   Mon Jan 17 14:24:10 2011 +0000
> >> @@ -317,6 +317,7 @@
> >>      int vtd_pte_present = 0;
> >>      int needs_sync = 1;
> >>      struct domain *d = p2m->domain;
> >> +    struct page_info *intermediate_table = NULL;
> >>
> >>      /*
> >>       * the caller must make sure:
> >> @@ -369,6 +370,12 @@
> >>          /* No need to flush if the old entry wasn't valid */
> >>          if ( !is_epte_present(ept_entry) )
> >>              needs_sync = 0;
> >> +        else if ( order == 9 && ! ept_entry->sp )
> >> +        {
> >> +            /* We're replacing a non-SP page with a superpage.  Make sure to
> >> +             * handle freeing the table properly. */
> >> +            intermediate_table = mfn_to_page(ept_entry->mfn);
> >> +        }
> >>
> >>          if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) ||
> >>               (p2mt == p2m_ram_paging_in_start) )
> >> @@ -487,6 +494,17 @@
> >>          }
> >>      }
> >>
> >> +    if ( intermediate_table )
> >> +    {
> >> +        /* Release the old intermediate table.  This has to be the
> >> +           last thing we do, after the ept_sync_domain() and removal
> >> +           from the iommu tables, so as to avoid a potential
> >> +           use-after-free. */
> >> +
> >> +        page_list_del(intermediate_table, &d->arch.p2m->pages);
> >> +        d->arch.paging.free_page(d, intermediate_table);
> >> +    }
> >> +
> >>      return rv;
> >>  }
> >>
> >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c
> >> --- a/xen/arch/x86/mm/p2m.c   Fri Jan 14 16:38:51 2011 +0000
> >> +++ b/xen/arch/x86/mm/p2m.c   Mon Jan 17 14:24:10 2011 +0000
> >> @@ -1361,9 +1361,15 @@
> >>          if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) &&
> >>               !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
> >>          {
> >> -            P2M_ERROR("configure P2M table 4KB L2 entry with large page\n");
> >> -            domain_crash(p2m->domain);
> >> -            goto out;
> >> +            struct page_info *pg;
> >> +
> >> +            /* We're replacing a non-SP page with a superpage.  Make sure to
> >> +             * handle freeing the table properly. */
> >> +            pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry)));
> >> +            paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn,
> >> +                                   l1e_empty(), 2);
> >> +            page_list_del(pg, &p2m->pages);
> >> +            p2m->domain->arch.paging.free_page(p2m->domain, pg);
> >>          }
> >>
> >>          ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xensource.com
> >> http://lists.xensource.com/xen-devel
> >
> > --
> > Tim Deegan <Tim.Deegan@citrix.com>
> > Principal Software Engineer, Xen Platform Team
> > Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
> >

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-01-19 14:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-17 14:36 [PATCH 0 of 3] Miscellaneous populate-on-demand bugs George Dunlap
2011-01-17 14:36 ` [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages George Dunlap
2011-01-19 12:22   ` Tim Deegan
2011-01-19 14:49     ` George Dunlap
2011-01-19 14:53       ` Tim Deegan
2011-01-17 14:36 ` [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted George Dunlap
2011-01-19 12:41   ` Tim Deegan
2011-01-17 14:36 ` [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging George Dunlap
2011-01-19 12:41   ` Tim Deegan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.