* [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released
@ 2019-04-12  4:29 ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12  4:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Tamas K Lengyel, Wei Liu, George Dunlap, Andrew Cooper,
	Jan Beulich, Roger Pau Monne

Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
grabbing extra references for pages that drop references tied to PGC_allocated.
However, the way these extra references were grabbed was incorrect, resulting
in both share_pages and unshare_pages failing. There is actually no need to
grab extra references; only a reordering was needed of when the existing
references are released. This is in accordance with the XSA-242
recommendation of not calling _put_page_type while also holding the page_lock
for that page.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
 xen/arch/x86/mm/mem_sharing.c | 31 ++++++++-----------------------
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 5ac9d8f54c..345a1778f9 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -900,6 +900,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
     p2m_type_t smfn_type, cmfn_type;
     struct two_gfns tg;
     struct rmap_iterator ri;
+    unsigned long put_count = 0;
 
     get_two_gfns(sd, sgfn, &smfn_type, NULL, &smfn,
                  cd, cgfn, &cmfn_type, NULL, &cmfn, 0, &tg);
@@ -964,15 +965,6 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
         goto err_out;
     }
 
-    /* Acquire an extra reference, for the freeing below to be safe. */
-    if ( !get_page(cpage, cd) )
-    {
-        ret = -EOVERFLOW;
-        mem_sharing_page_unlock(secondpg);
-        mem_sharing_page_unlock(firstpg);
-        goto err_out;
-    }
-
     /* Merge the lists together */
     rmap_seed_iterator(cpage, &ri);
     while ( (gfn = rmap_iterate(cpage, &ri)) != NULL)
@@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
          * Don't change the type of rmap for the client page. */
         rmap_del(gfn, cpage, 0);
         rmap_add(gfn, spage);
-        put_page_and_type(cpage);
+        put_count++;
         d = get_domain_by_id(gfn->domain);
         BUG_ON(!d);
         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, smfn));
@@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
     /* Free the client page */
     if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
         put_page(cpage);
-    put_page(cpage);
+
+    while(put_count--)
+        put_page_and_type(cpage);
 
     /* We managed to free a domain page. */
     atomic_dec(&nr_shared_mfns);
@@ -1167,20 +1161,11 @@ int __mem_sharing_unshare_page(struct domain *d,
     {
         if ( !last_gfn )
             mem_sharing_gfn_destroy(page, d, gfn_info);
-        put_page_and_type(page);
         mem_sharing_page_unlock(page);
-        if ( last_gfn )
-        {
-            if ( !get_page(page, d) )
-            {
-                put_gfn(d, gfn);
-                domain_crash(d);
-                return -EOVERFLOW;
-            }
-            if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
-                put_page(page);
+        if ( last_gfn &&
+            test_and_clear_bit(_PGC_allocated, &page->count_info) )
             put_page(page);
-        }
+        put_page_and_type(page);
         put_gfn(d, gfn);
 
         return 0;
-- 
2.20.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12  4:29   ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12  4:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Tamas K Lengyel, Wei Liu, George Dunlap, Andrew Cooper,
	Jan Beulich, Roger Pau Monne

Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
ordering" added extra sanity checking to page_lock/page_unlock for debug builds
with the assumption that no hypervisor path ever locks two pages at once.

This assumption doesn't hold during memory sharing.

This is only an RFC, as I currently don't have a better idea of how to resolve
this issue while also keeping the sanity check in place.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/domain.c           |  4 ++--
 xen/arch/x86/mm.c               | 20 +++++++++++---------
 xen/arch/x86/mm/mem_sharing.c   |  4 ++--
 xen/arch/x86/pv/grant_table.c   | 12 ++++++------
 xen/arch/x86/pv/ro-page-fault.c |  6 +++---
 xen/include/asm-x86/mm.h        |  4 ++--
 6 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8d579e2cf9..93cda1ccdd 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -995,13 +995,13 @@ int arch_set_info_guest(
             {
                 struct page_info *page = page_list_remove_head(&d->page_list);
 
-                if ( page_lock(page) )
+                if ( page_lock(page, 1) )
                 {
                     if ( (page->u.inuse.type_info & PGT_type_mask) ==
                          PGT_l4_page_table )
                         done = !fill_ro_mpt(page_to_mfn(page));
 
-                    page_unlock(page);
+                    page_unlock(page, 1);
                 }
 
                 page_list_add_tail(page, &d->page_list);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a88cd9ce7c..ff734e362c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2030,11 +2030,12 @@ static inline bool current_locked_page_ne_check(struct page_info *page) {
 #define current_locked_page_ne_check(x) true
 #endif
 
-int page_lock(struct page_info *page)
+int page_lock(struct page_info *page, bool lock_check)
 {
     unsigned long x, nx;
 
-    ASSERT(current_locked_page_check(NULL));
+    if ( lock_check )
+        ASSERT(current_locked_page_check(NULL));
 
     do {
         while ( (x = page->u.inuse.type_info) & PGT_locked )
@@ -2051,11 +2052,12 @@ int page_lock(struct page_info *page)
     return 1;
 }
 
-void page_unlock(struct page_info *page)
+void page_unlock(struct page_info *page, bool lock_check)
 {
     unsigned long x, nx, y = page->u.inuse.type_info;
 
-    ASSERT(current_locked_page_check(page));
+    if ( lock_check )
+        ASSERT(current_locked_page_check(page));
 
     do {
         x = y;
@@ -3897,7 +3899,7 @@ long do_mmu_update(
             }
             va = _p(((unsigned long)va & PAGE_MASK) + (req.ptr & ~PAGE_MASK));
 
-            if ( page_lock(page) )
+            if ( page_lock(page, 1) )
             {
                 switch ( page->u.inuse.type_info & PGT_type_mask )
                 {
@@ -3954,7 +3956,7 @@ long do_mmu_update(
                         rc = 0;
                     break;
                 }
-                page_unlock(page);
+                page_unlock(page, 1);
                 if ( rc == -EINTR )
                     rc = -ERESTART;
             }
@@ -4247,7 +4249,7 @@ static int __do_update_va_mapping(
     if ( unlikely(!gl1pg) )
         goto out;
 
-    if ( !page_lock(gl1pg) )
+    if ( !page_lock(gl1pg, 1) )
     {
         put_page(gl1pg);
         goto out;
@@ -4255,7 +4257,7 @@ static int __do_update_va_mapping(
 
     if ( (gl1pg->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
     {
-        page_unlock(gl1pg);
+        page_unlock(gl1pg, 1);
         put_page(gl1pg);
         goto out;
     }
@@ -4263,7 +4265,7 @@ static int __do_update_va_mapping(
     rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), MMU_NORMAL_PT_UPDATE, v,
                       pg_owner);
 
-    page_unlock(gl1pg);
+    page_unlock(gl1pg, 1);
     put_page(gl1pg);
 
  out:
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 345a1778f9..777af7f7c7 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -118,7 +118,7 @@ static inline int mem_sharing_page_lock(struct page_info *pg)
     pg_lock_data_t *pld = &(this_cpu(__pld));
 
     page_sharing_mm_pre_lock();
-    rc = page_lock(pg);
+    rc = page_lock(pg, 0);
     if ( rc )
     {
         preempt_disable();
@@ -135,7 +135,7 @@ static inline void mem_sharing_page_unlock(struct page_info *pg)
     page_sharing_mm_unlock(pld->mm_unlock_level, 
                            &pld->recurse_count);
     preempt_enable();
-    page_unlock(pg);
+    page_unlock(pg, 0);
 }
 
 static inline shr_handle_t get_next_handle(void)
diff --git a/xen/arch/x86/pv/grant_table.c b/xen/arch/x86/pv/grant_table.c
index 5180334f42..be9bbe7c4c 100644
--- a/xen/arch/x86/pv/grant_table.c
+++ b/xen/arch/x86/pv/grant_table.c
@@ -101,7 +101,7 @@ int create_grant_pv_mapping(uint64_t addr, mfn_t frame,
             goto out_unmap;
     }
 
-    if ( !page_lock(page) )
+    if ( !page_lock(page, 1) )
         goto out_put;
 
     if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
@@ -112,7 +112,7 @@ int create_grant_pv_mapping(uint64_t addr, mfn_t frame,
         rc = GNTST_okay;
 
  out_unlock:
-    page_unlock(page);
+    page_unlock(page, 1);
  out_put:
     put_page(page);
  out_unmap:
@@ -158,7 +158,7 @@ static bool steal_linear_address(unsigned long linear, l1_pgentry_t *out)
     if ( !page )
         goto out_unmap;
 
-    if ( !page_lock(page) )
+    if ( !page_lock(page, 1) )
         goto out_put;
 
     if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
@@ -171,7 +171,7 @@ static bool steal_linear_address(unsigned long linear, l1_pgentry_t *out)
         *out = ol1e;
 
  out_unlock:
-    page_unlock(page);
+    page_unlock(page, 1);
  out_put:
     put_page(page);
  out_unmap:
@@ -264,7 +264,7 @@ int replace_grant_pv_mapping(uint64_t addr, mfn_t frame,
             goto out_unmap;
     }
 
-    if ( !page_lock(page) )
+    if ( !page_lock(page, 1) )
         goto out_put;
 
     if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
@@ -297,7 +297,7 @@ int replace_grant_pv_mapping(uint64_t addr, mfn_t frame,
         rc = GNTST_okay;
 
  out_unlock:
-    page_unlock(page);
+    page_unlock(page, 1);
  out_put:
     put_page(page);
  out_unmap:
diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c
index e7a7179dda..8e90e82f0b 100644
--- a/xen/arch/x86/pv/ro-page-fault.c
+++ b/xen/arch/x86/pv/ro-page-fault.c
@@ -277,7 +277,7 @@ static int ptwr_do_page_fault(struct x86_emulate_ctxt *ctxt,
     if ( !page )
         return X86EMUL_UNHANDLEABLE;
 
-    if ( !page_lock(page) )
+    if ( !page_lock(page, 1) )
     {
         put_page(page);
         return X86EMUL_UNHANDLEABLE;
@@ -285,7 +285,7 @@ static int ptwr_do_page_fault(struct x86_emulate_ctxt *ctxt,
 
     if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
     {
-        page_unlock(page);
+        page_unlock(page, 1);
         put_page(page);
         return X86EMUL_UNHANDLEABLE;
     }
@@ -293,7 +293,7 @@ static int ptwr_do_page_fault(struct x86_emulate_ctxt *ctxt,
     ctxt->data = &ptwr_ctxt;
     rc = x86_emulate(ctxt, &ptwr_emulate_ops);
 
-    page_unlock(page);
+    page_unlock(page, 1);
     put_page(page);
 
     return rc;
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 6faa563167..dae52e8880 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -375,8 +375,8 @@ const struct platform_bad_page *get_platform_badpages(unsigned int *array_size);
  * These two users (pte serialization and memory sharing) do not collide, since
  * sharing is only supported for hvm guests, which do not perform pv pte updates.
  */
-int page_lock(struct page_info *page);
-void page_unlock(struct page_info *page);
+int page_lock(struct page_info *page, bool lock_check);
+void page_unlock(struct page_info *page, bool lock_check);
 
 void put_page_type(struct page_info *page);
 int  get_page_type(struct page_info *page, unsigned long type);
-- 
2.20.1



* Re: [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released
@ 2019-04-12 11:55   ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-04-12 11:55 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
> grabbing extra references for pages that drop references tied to PGC_allocated.
> However, the way these extra references were grabbed were incorrect, resulting
> in both share_pages and unshare_pages failing.

I'm sorry for this.

> @@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>           * Don't change the type of rmap for the client page. */
>          rmap_del(gfn, cpage, 0);
>          rmap_add(gfn, spage);
> -        put_page_and_type(cpage);
> +        put_count++;

Since this sits in a while(), ...

> @@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>      /* Free the client page */
>      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>          put_page(cpage);
> -    put_page(cpage);
> +
> +    while(put_count--)
> +        put_page_and_type(cpage);

... put_count may be zero before this while(), at which point the
earlier put_page() would still be unsafe.

Also, despite the if() in context suggesting the style you used,
the rest of the function looks to use proper Xen style, so would
you please add the three missing blanks to the while() you add?

Jan




* Re: [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12 12:01     ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-04-12 12:01 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
> ordering" added extra sanity checking to page_lock/page_unlock for debug builds
> with the assumption that no hypervisor path ever locks two pages at once.
> 
> This assumption doesn't hold during memory sharing.

A fundamental question is - doesn't mem-sharing abuse page_lock()?
I've never understood why it uses it, and back when it was introduced
the function clearly wasn't meant to be used outside of PV memory
management code.

One minimal remark on the patch itself: Please use true/false instead
of 1/0 when passing bool arguments.

Jan




* Re: [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released
@ 2019-04-12 13:41     ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12 13:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> > Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
> > grabbing extra references for pages that drop references tied to PGC_allocated.
> > However, the way these extra references were grabbed were incorrect, resulting
> > in both share_pages and unshare_pages failing.
>
> I'm sorry for this.

It's my bad for not catching it earlier when I acked it. Reading the
patch it looked fine and made sense but evidently that's no substitute
for actually testing it.

>
> > @@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
> >           * Don't change the type of rmap for the client page. */
> >          rmap_del(gfn, cpage, 0);
> >          rmap_add(gfn, spage);
> > -        put_page_and_type(cpage);
> > +        put_count++;
>
> Since this sits in a while(), ...
>
> > @@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
> >      /* Free the client page */
> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
> >          put_page(cpage);
> > -    put_page(cpage);
> > +
> > +    while(put_count--)
> > +        put_page_and_type(cpage);
>
> ... put_count may be zero before this while(). At which point the
> earlier put_page() would still be unsafe.

The page should have at least one reference, which it acquired earlier
when it went through the page_make_sharable function. We also verified
that the page is still sharable, so that reference is still there and
the count is at least 1. We could certainly add an ASSERT(put_count)
before it, though.

>
> Also, despite the if() in context suggesting the style you used,
> the rest of the function looks to use proper Xen style, so would
> you please add the three missing blanks to the while() you add?

Ack.

Thanks,
Tamas


* Re: [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12 13:55       ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12 13:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Fri, Apr 12, 2019 at 6:01 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> > Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
> > ordering" added extra sanity checking to page_lock/page_unlock for debug builds
> > with the assumption that no hypervisor path ever locks two pages at once.
> >
> > This assumption doesn't hold during memory sharing.
>
> A fundamental question is - doesn't mem-sharing abuse page_lock()?
> I've never understood why it uses it, and back when it was introduced
> the function clearly wasn't meant to be used outside of PV memory
> management code.

I have limited understanding of page_lock's limitations, but here is
the reasoning for taking the lock on two pages.
Mem_sharing takes the pages it wants to be sharable and assigns them
to dom_cow at least with one reference being held (since we just took
these pages from a live domain). Those pages may then be assigned to
several actual domains at the same time, and each domain could be
destroyed at any given point. There is no requirement that the
original domain still be alive since the owner of these pages is now
dom_cow. The backing pages owned by dom_cow only get dropped when the
reference count hits zero, i.e. when no actual domain uses those
pages anymore. This could happen if all domains unshare the gfns that
use this page as the backing or if all domains that use them for
mem_sharing get destroyed. When we share pages, we want to make sure
those don't get dropped under our feet while we are in the middle of
this function. So we lock both pages.

Also, I don't see why it would be fundamentally wrong to hold the
locks to two pages at once. Is this a problem with the lock
implementation itself?

>
> One minimal remark on the patch itself: Please use true/false instead
> of 1/0 when passing bool arguments.

Ack.

Thanks,
Tamas


* Re: [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released
@ 2019-04-12 14:00       ` Andrew Cooper
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Cooper @ 2019-04-12 14:00 UTC (permalink / raw)
  To: Tamas K Lengyel, Jan Beulich
  Cc: George Dunlap, xen-devel, Wei Liu, Roger Pau Monne

On 12/04/2019 14:41, Tamas K Lengyel wrote:
> On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
>>> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
>>> grabbing extra references for pages that drop references tied to PGC_allocated.
>>> However, the way these extra references were grabbed were incorrect, resulting
>>> in both share_pages and unshare_pages failing.
>> I'm sorry for this.
> It's my bad for not catching it earlier when I acked it. Reading the
> patch it looked fine and made sense but evidently that's no substitute
> for actually testing it.

As an aside, do you have a one-paragraph introduction to how you use
mem_sharing?

With any luck, we can try and avoid breaking it as often as we seem to.

~Andrew


* Re: [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released
@ 2019-04-12 14:13         ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12 14:13 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, xen-devel, Wei Liu, Jan Beulich, Roger Pau Monne

On Fri, Apr 12, 2019 at 8:00 AM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 12/04/2019 14:41, Tamas K Lengyel wrote:
> > On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
> >>>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> >>> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
> >>> grabbing extra references for pages that drop references tied to PGC_allocated.
> >>> However, the way these extra references were grabbed were incorrect, resulting
> >>> in both share_pages and unshare_pages failing.
> >> I'm sorry for this.
> > It's my bad for not catching it earlier when I acked it. Reading the
> > patch it looked fine and made sense but evidently that's no substitute
> > for actually testing it.
>
> As an aside, do you have a one-paragraph introduction to how you use
> mem_sharing?

The use-case is malware analysis where we tend to deploy the same VM
image over and over. Mem_sharing saves resources as we don't have to
have, say, 4GB of memory available for each analysis session, since only
a fraction of that memory actually gets touched during the analysis.
Right now it's still a bit clunky since we first have to restore the
VM image and then proceed with mem_sharing but there are other
improvements that are planned, such as VM forking
(https://xenproject.atlassian.net/projects/XEN/board?issue-key=XEN-89),
which would use mem_sharing but the deployment of new VMs would be
more streamlined.

>
> With any luck, we can try and avoid breaking it as often as we seem to.

Ideally :)

Thanks,
Tamas


* Re: [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released
@ 2019-04-12 14:29       ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-04-12 14:29 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 12.04.19 at 15:41, <tamas@tklengyel.com> wrote:
> On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
>> > @@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>> >           * Don't change the type of rmap for the client page. */
>> >          rmap_del(gfn, cpage, 0);
>> >          rmap_add(gfn, spage);
>> > -        put_page_and_type(cpage);
>> > +        put_count++;
>>
>> Since this sits in a while(), ...
>>
>> > @@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>> >      /* Free the client page */
>> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>> >          put_page(cpage);
>> > -    put_page(cpage);
>> > +
>> > +    while(put_count--)
>> > +        put_page_and_type(cpage);
>>
>> ... put_count may be zero before this while(). At which point the
>> earlier put_page() would still be unsafe.
> 
> The page should have at least one reference it got before when the
> page went through page_make_sharable function. We also verified that
> the page is still sharable so that reference is still there, so this
> count is at least 1. We could certainly add an ASSERT(put_count)
> before though.

I see - back when I created the change you're now trying to fix,
I was wondering but couldn't find any proof. So yes, adding an
ASSERT() would surely be helpful.

Jan




* Re: [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12 14:39         ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-04-12 14:39 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 12.04.19 at 15:55, <tamas@tklengyel.com> wrote:
> On Fri, Apr 12, 2019 at 6:01 AM Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
>> > Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
>> > ordering" added extra sanity checking to page_lock/page_unlock for debug builds
>> > with the assumption that no hypervisor path ever locks two pages at once.
>> >
>> > This assumption doesn't hold during memory sharing.
>>
>> A fundamental question is - doesn't mem-sharing abuse page_lock()?
>> I've never understood why it uses it, and back when it was introduced
>> the function clearly wasn't meant to be used outside of PV memory
>> management code.
> 
> I have limited understanding of what the limitations are of page_lock
> but here is the reasoning for taking the lock for two pages.
> Mem_sharing takes the pages it wants to be sharable and assigns them
> to dom_cow at least with one reference being held (since we just took
> these pages from a live domain). Those pages may then be assigned to
> several actual domains at the same time, and each domain could be
> destroyed at any given point. There is no requirement that the
> original domain still be alive since the owner of these pages is now
> dom_cow. The backing pages owned by dom_cow only get dropped when the
> reference count hits zero, ie. no actual domain actually uses those
> pages anymore. This could happen if all domains unshare the gfn's that
> use this page as the backing or if all domains that use them for
> mem_sharing get destroyed. When we share pages, we want to make sure
> those don't get dropped under our feet while we are in the middle of
> this function. So we lock both pages.

Thanks for the explanation, but I'm afraid this doesn't address
my remark: You need _some_ kind of lock for this, but I was
questioning whether using page_lock() is the right choice. From
what you say, obtaining proper references to the page would
seem what you actually want to avoid them disappearing
behind your back.

page_lock() is used on PV page table pages to exclude multiple
updates to the same page table happening at the same time.
In a !PV build it shouldn't even be available.

> Also, I don't see why it would be fundamentally wrong to hold the
> locks to two pages at once. Is this a problem with the lock
> implementation itself?

No, the implementation itself should be fine (but I also didn't say
there's a problem locking two pages). Lock order might be an issue,
but you seem to be dealing with that fine (considering mem_sharing.c
alone).

Jan




* Re: [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12 15:02           ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12 15:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Fri, Apr 12, 2019 at 8:39 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 12.04.19 at 15:55, <tamas@tklengyel.com> wrote:
> > On Fri, Apr 12, 2019 at 6:01 AM Jan Beulich <JBeulich@suse.com> wrote:
> >> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> >> > Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
> >> > ordering" added extra sanity checking to page_lock/page_unlock for debug builds
> >> > with the assumption that no hypervisor path ever locks two pages at once.
> >> >
> >> > This assumption doesn't hold during memory sharing.
> >>
> >> A fundamental question is - doesn't mem-sharing abuse page_lock()?
> >> I've never understood why it uses it, and back when it was introduced
> >> the function clearly wasn't meant to be used outside of PV memory
> >> management code.
> >
> > I have limited understanding of what the limitations are of page_lock
> > but here is the reasoning for taking the lock for two pages.
> > Mem_sharing takes the pages it wants to be sharable and assigns them
> > to dom_cow at least with one reference being held (since we just took
> > these pages from a live domain). Those pages may then be assigned to
> > several actual domains at the same time, and each domain could be
> > destroyed at any given point. There is no requirement that the
> > original domain still be alive since the owner of these pages is now
> > dom_cow. The backing pages owned by dom_cow only get dropped when the
> > reference count hits zero, ie. no actual domain actually uses those
> > pages anymore. This could happen if all domains unshare the gfn's that
> > use this page as the backing or if all domains that use them for
> > mem_sharing get destroyed. When we share pages, we want to make sure
> > those don't get dropped under our feet while we are in the middle of
> > this function. So we lock both pages.
>
> Thanks for the explanation, but I'm afraid this doesn't address
> my remark: You need _some_ kind of lock for this, but I was
> questioning whether using page_lock() is the right choice. From
> what you say, obtaining proper references to the page would
> seem what you actually want to avoid them disappearing
> behind your back.

Yes, but besides the references ensuring the pages don't get dropped
there are also a bunch of book-keeping operations in mem_sharing (the
rmap bits) that also look like they need the locking to ensure no
concurrent updates happen for the same page. I guess we could
introduce a new lock but it would also need to be per-page, so to me
it looks like it would just duplicate what page_lock does already.

> page_lock() is used on PV page table pages to exclude multiple
> updates to the same page table happening at the same time.
> In a !PV build it shouldn't even be available.

I don't see page_lock being encapsulated by CONFIG_PV blocks so that
should be fine.

> > Also, I don't see why it would be fundamentally wrong to hold the
> > locks to two pages at once. Is this a problem with the lock
> > implementation itself?
>
> No, the implementation itself should be fine (but I also didn't say
> there's a problem locking two pages). Lock order might be an issue,
> but seem to be dealing with that fine (considering mem_sharing.c
> alone).

Could you expand on the lock ordering requirement? I noticed
mem_sharing acquires the locks to the pages in increasing mfn order and
tried to decipher why that is done, but I couldn't really make heads or
tails of it.

Thanks,
Tamas


* Re: [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12 15:34             ` Jan Beulich
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Beulich @ 2019-04-12 15:34 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 12.04.19 at 17:02, <tamas@tklengyel.com> wrote:
> On Fri, Apr 12, 2019 at 8:39 AM Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 12.04.19 at 15:55, <tamas@tklengyel.com> wrote:
>> > On Fri, Apr 12, 2019 at 6:01 AM Jan Beulich <JBeulich@suse.com> wrote:
>> >> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
>> >> > Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
>> >> > ordering" added extra sanity checking to page_lock/page_unlock for debug builds
>> >> > with the assumption that no hypervisor path ever locks two pages at once.
>> >> >
>> >> > This assumption doesn't hold during memory sharing.
>> >>
>> >> A fundamental question is - doesn't mem-sharing abuse page_lock()?
>> >> I've never understood why it uses it, and back when it was introduced
>> >> the function clearly wasn't meant to be used outside of PV memory
>> >> management code.
>> >
>> > I have limited understanding of what the limitations are of page_lock
>> > but here is the reasoning for taking the lock for two pages.
>> > Mem_sharing takes the pages it wants to be sharable and assigns them
>> > to dom_cow at least with one reference being held (since we just took
>> > these pages from a live domain). Those pages may then be assigned to
>> > several actual domains at the same time, and each domain could be
>> > destroyed at any given point. There is no requirement that the
>> > original domain still be alive since the owner of these pages is now
>> > dom_cow. The backing pages owned by dom_cow only get dropped when the
>> > reference count hits zero, ie. no actual domain actually uses those
>> > pages anymore. This could happen if all domains unshare the gfn's that
>> > use this page as the backing or if all domains that use them for
>> > mem_sharing get destroyed. When we share pages, we want to make sure
>> > those don't get dropped under our feet while we are in the middle of
>> > this function. So we lock both pages.
>>
>> Thanks for the explanation, but I'm afraid this doesn't address
>> my remark: You need _some_ kind of lock for this, but I was
>> questioning whether using page_lock() is the right choice. From
>> what you say, obtaining proper references to the page would
>> seem what you actually want to avoid them disappearing
>> behind your back.
> 
> Yes, but besides the references ensuring the pages don't get dropped
> there is also a bunch of book-keeping operations in mem_sharing (the
> rmap bits) that also look like they need the locking to ensure no
> concurrent updates happen for the same page. I guess we could
> introduce a new lock but it would also need to be per-page, so to me
> it looks like it would just duplicate what page_lock does already.

Hmm, I guess I can't further comment on this, as I know nothing
about how mem-sharing works, and hence why such per-page
locking would be needed in the first place. Since you have a
struct page_sharing_info * hanging off of struct page_info,
perhaps that's where such a lock would belong?
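
[A minimal sketch of what Jan suggests here: a dedicated lock living in
the sharing-specific metadata instead of reusing page_lock(). The toy
lock and all field names below are illustrative assumptions, not the
actual Xen definitions.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a real spinlock, just to keep the sketch self-contained. */
typedef struct { bool held; } toy_lock_t;

static void toy_lock(toy_lock_t *l)   { assert(!l->held); l->held = true;  }
static void toy_unlock(toy_lock_t *l) { assert(l->held);  l->held = false; }

/* Hypothetical sharing metadata: one lock per shared page, serializing
 * the rmap book-keeping for that page. */
struct page_sharing_info {
    toy_lock_t lock;
    /* ... sharing handle, rmap head, etc. ... */
};

/* Hypothetical page_info: the sharing pointer is NULL unless the page
 * is currently shared, so the lock only exists for shared pages. */
struct page_info {
    struct page_sharing_info *sharing;
};
```

[Since the lock would live in the sharing metadata rather than in
struct page_info itself, it would cost nothing for non-shared pages.]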

>> page_lock() is used on PV page table pages to exclude multiple
>> updates to the same page table happening at the same time.
>> In a !PV build it shouldn't even be available.
> 
> I don't see page_lock being encapsulated by CONFIG_PV blocks so that
> should be fine.

Well, it can't be because of its use by mem-sharing, or else
we'd break the build.

>> > Also, I don't see why it would be fundamentally wrong to hold the
>> > locks to two pages at once. Is this a problem with the lock
>> > implementation itself?
>>
>> No, the implementation itself should be fine (but I also didn't say
>> there's a problem locking two pages). Lock order might be an issue,
>> but you seem to be dealing with that fine (considering mem_sharing.c
>> alone).
> 
> Could you expand on the lock ordering requirement? I noticed
> mem_sharing acquires the locks on the pages in increasing mfn order and
> tried to decipher why that is done, but I couldn't really make heads or
> tails of it.

As always with multiple locks, they need to be acquired in the
same order by all parties. When no hierarchy gets established
by code structure, consistent ordering needs to be enforced
by other means, like comparing MFNs in the case here. You'll
find something similar in grant table code, for example.
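
[The MFN-comparison ordering described above can be sketched as follows;
the structure and helpers are simplified stand-ins, not Xen's real
page_lock()/page_unlock().]

```c
#include <assert.h>
#include <stdbool.h>

struct pg { unsigned long mfn; bool locked; };

static void pg_lock(struct pg *p)   { assert(!p->locked); p->locked = true;  }
static void pg_unlock(struct pg *p) { assert(p->locked);  p->locked = false; }

/* Acquire two page locks in a globally consistent order: lower MFN
 * first. Every path that locks two pages must follow the same rule,
 * otherwise two CPUs locking the same pair in opposite orders can
 * deadlock against each other. */
static void lock_pair(struct pg *a, struct pg *b)
{
    if ( a == b )           /* same page: take the lock only once */
    {
        pg_lock(a);
        return;
    }
    if ( a->mfn > b->mfn )  /* swap so 'a' is the lower MFN */
    {
        struct pg *t = a; a = b; b = t;
    }
    pg_lock(a);
    pg_lock(b);
}

static void unlock_pair(struct pg *a, struct pg *b)
{
    if ( a == b )
    {
        pg_unlock(a);
        return;
    }
    if ( a->mfn > b->mfn )
    {
        struct pg *t = a; a = b; b = t;
    }
    pg_unlock(b);           /* release in reverse acquisition order */
    pg_unlock(a);
}
```

[With this rule, a thread locking (mfn 3, mfn 7) and another locking
(mfn 7, mfn 3) both take mfn 3's lock first, so neither can hold one
lock while waiting indefinitely for the other.]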

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership
@ 2019-04-12 15:47               ` Tamas K Lengyel
  0 siblings, 0 replies; 26+ messages in thread
From: Tamas K Lengyel @ 2019-04-12 15:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Fri, Apr 12, 2019 at 9:34 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 12.04.19 at 17:02, <tamas@tklengyel.com> wrote:
> > On Fri, Apr 12, 2019 at 8:39 AM Jan Beulich <JBeulich@suse.com> wrote:
> >> >>> On 12.04.19 at 15:55, <tamas@tklengyel.com> wrote:
> >> > On Fri, Apr 12, 2019 at 6:01 AM Jan Beulich <JBeulich@suse.com> wrote:
> >> >> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> >> >> > Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
> >> >> > ordering" added extra sanity checking to page_lock/page_unlock for debug builds
> >> >> > with the assumption that no hypervisor path ever locks two pages at once.
> >> >> >
> >> >> > This assumption doesn't hold during memory sharing.
> >> >>
> >> >> A fundamental question is - doesn't mem-sharing abuse page_lock()?
> >> >> I've never understood why it uses it, and back when it was introduced
> >> >> the function clearly wasn't meant to be used outside of PV memory
> >> >> management code.
> >> >
> >> > I have limited understanding of what the limitations are of page_lock
> >> > but here is the reasoning for taking the lock for two pages.
> >> > Mem_sharing takes the pages it wants to be sharable and assigns them
> >> > to dom_cow at least with one reference being held (since we just took
> >> > these pages from a live domain). Those pages may then be assigned to
> >> > several actual domains at the same time, and each domain could be
> >> > destroyed at any given point. There is no requirement that the
> >> > original domain still be alive since the owner of these pages is now
> >> > dom_cow. The backing pages owned by dom_cow only get dropped when the
> >> > reference count hits zero, ie. no actual domain actually uses those
> >> > pages anymore. This could happen if all domains unshare the gfn's that
> >> > use this page as the backing or if all domains that use them for
> >> > mem_sharing get destroyed. When we share pages, we want to make sure
> >> > those don't get dropped under our feet while we are in the middle of
> >> > this function. So we lock both pages.
> >>
> >> Thanks for the explanation, but I'm afraid this doesn't address
> >> my remark: You need _some_ kind of lock for this, but I was
> >> questioning whether using page_lock() is the right choice. From
> >> what you say, obtaining proper references to the page would
> >> seem what you actually want to avoid them disappearing
> >> behind your back.
> >
> > Yes, but beside the references ensuring the pages don't get dropped
> > there is also a bunch of book-keeping operations in mem_sharing (the
> > rmap bits) that also look like they need the locking to ensure no
> > concurrent updates happen for the same page. I guess we could
> > introduce a new lock but it would also need to be per-page, so to me
> > it looks like it would just duplicate what page_lock does already.
>
> Hmm, I guess I can't further comment on this, as I know nothing
> about how mem-sharing works, and hence why such per-page
> locking would be needed in the first place. Since you have a
> struct page_sharing_info * hanging off of struct page_info,
> perhaps that's where such a lock would belong?

Yeah, I was looking at that too. It certainly would make sense to me. I'll
try to go that route instead.

> >> page_lock() is used on PV page table pages to exclude multiple
> >> updates to the same page table happening at the same time.
> >> In a !PV build it shouldn't even be available.
> >
> > I don't see page_lock being encapsulated by CONFIG_PV blocks so that
> > should be fine.
>
> Well, it can't be because of its use by mem-sharing, or else
> we'd break the build.

OK :)

> >> > Also, I don't see why it would be fundamentally wrong to hold the
> >> > locks to two pages at once. Is this a problem with the lock
> >> > implementation itself?
> >>
> >> No, the implementation itself should be fine (but I also didn't say
> >> there's a problem locking two pages). Lock order might be an issue,
> >> but you seem to be dealing with that fine (considering mem_sharing.c
> >> alone).
> >
> > Could you expand on the lock ordering requirement? I noticed
> > mem_sharing acquires the locks on the pages in increasing mfn order and
> > tried to decipher why that is done, but I couldn't really make heads or
> > tails of it.
>
> As always with multiple locks, they need to be acquired in the
> same order by all parties. When no hierarchy gets established
> by code structure, consistent ordering needs to be enforced
> by other means, like comparing MFNs in the case here. You'll
> find something similar in grant table code, for example.

Makes sense.

Thanks,
Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2019-04-12 15:49 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-12  4:29 [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released Tamas K Lengyel
2019-04-12  4:29 ` [PATCH 2/2] RFC: x86/mm: conditionally check page_lock/page_unlock ownership Tamas K Lengyel
2019-04-12 12:01   ` Jan Beulich
2019-04-12 13:55     ` Tamas K Lengyel
2019-04-12 14:39       ` Jan Beulich
2019-04-12 15:02         ` Tamas K Lengyel
2019-04-12 15:34           ` Jan Beulich
2019-04-12 15:47             ` Tamas K Lengyel
2019-04-12 11:55 ` [PATCH 1/2] x86/mem_sharing: reorder when pages are unlocked and released Jan Beulich
2019-04-12 13:41   ` Tamas K Lengyel
2019-04-12 14:00     ` Andrew Cooper
2019-04-12 14:13       ` Tamas K Lengyel
2019-04-12 14:29     ` Jan Beulich
