* [PATCH v1 0/9] Memory scrubbing from idle loop
@ 2017-03-24 17:04 Boris Ostrovsky
  2017-03-24 17:04 ` [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine Boris Ostrovsky
                   ` (8 more replies)
  0 siblings, 9 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:04 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

When a domain is destroyed the hypervisor must scrub the domain's pages before
giving them to another guest in order to prevent leaking the deceased guest's
data. Currently this is done during the guest's destruction, which can make the
cleanup process very lengthy.

This series adds support for scrubbing released pages from the idle loop,
making guest destruction significantly faster. For example, destroying a 1TB
guest can now be completed in 40+ seconds, as opposed to about 9 minutes with
the existing scrubbing algorithm.

The downside of this series is that we sometimes fail to allocate high-order
sets of pages, since dirty pages may not yet have been merged into higher-order
sets while they are waiting to be scrubbed.

Briefly, the new algorithm places dirty pages at the end of the heap's page
list for each node/zone/order to avoid having to scan the full list while
searching for dirty pages. One processor from each node checks whether the
node has any dirty pages and, if such pages are found, scrubs them. Scrubbing
itself happens without holding the heap lock, so other users may access the
heap in the meantime. If, while the idle loop is scrubbing a particular chunk
of pages, that chunk is requested by the heap allocator, scrubbing is stopped
immediately.

On the allocation side, alloc_heap_pages() first tries to satisfy the
allocation request using only clean pages. If this is not possible, the search
is repeated and dirty pages are scrubbed by the allocator.
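
In rough terms the two sides interact as in the sketch below. This is an
illustration only: scrub_free_pages(), pm_idle, do_tasklet() and do_softirq()
follow the patches in this series, while find_clean_buddy(), find_dirty_buddy()
and scrub_buddy() are placeholders for the search loop inside
alloc_heap_pages().

    /*
     * Idle-loop side (cf. patch 4): scrub the local node's dirty pages
     * whenever the CPU has nothing better to do.
     */
    static void idle_loop_sketch(void)
    {
        for ( ; ; )
        {
            if ( !scrub_free_pages() )  /* no dirty pages left to process */
                pm_idle();              /* safe to go to sleep */
            do_tasklet();
            do_softirq();
        }
    }

    /*
     * Allocation side (cf. patch 3): clean buddies sit at the head of each
     * free list, dirty ones at the tail.  Prefer clean pages; only fall back
     * to dirty ones, scrubbing them synchronously, if the first pass fails.
     */
    static struct page_info *alloc_sketch(unsigned int order)
    {
        struct page_info *pg = find_clean_buddy(order);       /* pass 1 */

        if ( !pg && (pg = find_dirty_buddy(order)) != NULL )   /* pass 2 */
            scrub_buddy(pg, order);                            /* scrub now */

        return pg;
    }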

This series is somewhat based on earlier work by Bob Liu.

V1:
* Only set the PGC_need_scrub bit for the buddy head, thus making it
  unnecessary to scan the whole buddy
* Fix spin_lock_cb()
* Scrub CPU-less nodes
* ARM support. Note that I have not been able to test this; I have only built
  the binary
* Added a scrub test patch (the last one). I am not sure whether it should be
  considered for committing, but I have been running with it.

Deferred:
* Per-node heap locks. In addition to (presumably) improving performance in
  general, once they are available we can parallelize scrubbing further by
  allowing more than one core per node to do idle loop scrubbing.
* AVX-based scrubbing
* Use idle loop scrubbing during boot.


Boris Ostrovsky (9):
  mm: Separate free page chunk merging into its own routine
  mm: Place unscrubbed pages at the end of pagelist
  mm: Scrub pages in alloc_heap_pages() if needed
  mm: Scrub memory from idle loop
  mm: Do not discard already-scrubbed pages if softirqs are pending
  spinlock: Introduce spin_lock_cb()
  mm: Keep pages available for allocation while scrubbing
  mm: Print number of unscrubbed pages in 'H' debug handler
  mm: Make sure pages are scrubbed

 xen/Kconfig.debug          |    7 +
 xen/arch/arm/domain.c      |   13 +-
 xen/arch/x86/domain.c      |    3 +-
 xen/common/page_alloc.c    |  450 +++++++++++++++++++++++++++++++++++++-------
 xen/common/spinlock.c      |   20 ++
 xen/include/asm-arm/mm.h   |    8 +
 xen/include/asm-x86/mm.h   |    8 +
 xen/include/xen/mm.h       |    1 +
 xen/include/xen/spinlock.h |    3 +
 9 files changed, 439 insertions(+), 74 deletions(-)



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
@ 2017-03-24 17:04 ` Boris Ostrovsky
  2017-03-27 15:16   ` Wei Liu
  2017-03-28 19:20   ` Wei Liu
  2017-03-24 17:04 ` [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist Boris Ostrovsky
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:04 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

This is needed for subsequent changes to memory scrubbing. No
logic change, only code re-factoring.

Based on earlier patch by Bob Liu.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c |   85 ++++++++++++++++++++++++++++------------------
 1 files changed, 52 insertions(+), 33 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 42c20cb..7931903 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -924,11 +924,61 @@ static int reserve_offlined_page(struct page_info *head)
     return count;
 }
 
+static bool_t can_merge(struct page_info *buddy, unsigned int node,
+                        unsigned int order)
+{
+    if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
+         !page_state_is(buddy, free) ||
+         (PFN_ORDER(buddy) != order) ||
+         (phys_to_nid(page_to_maddr(buddy)) != node) )
+        return 0;
+
+    return 1;
+}
+
+static void merge_chunks(struct page_info *pg, unsigned int node,
+                         unsigned int zone, unsigned int order)
+{
+    ASSERT(spin_is_locked(&heap_lock));
+
+    /* Merge chunks as far as possible. */
+    while ( order < MAX_ORDER )
+    {
+        unsigned long mask = 1UL << order;
+        struct page_info *buddy;
+
+        if ( (page_to_mfn(pg) & mask) )
+        {
+            /* Merge with predecessor block? */
+            buddy = pg - mask;
+            if ( !can_merge(buddy, node, order) )
+                break;
+
+            pg = buddy;
+            page_list_del(pg, &heap(node, zone, order));
+        }
+        else
+        {
+            /* Merge with successor block? */
+            buddy = pg + mask;
+            if ( !can_merge(buddy, node, order) )
+                break;
+
+            page_list_del(buddy, &heap(node, zone, order));
+        }
+
+        order++;
+    }
+
+    PFN_ORDER(pg) = order;
+    page_list_add_tail(pg, &heap(node, zone, order));
+}
+
 /* Free 2^@order set of pages. */
 static void free_heap_pages(
     struct page_info *pg, unsigned int order)
 {
-    unsigned long mask, mfn = page_to_mfn(pg);
+    unsigned long mfn = page_to_mfn(pg);
     unsigned int i, node = phys_to_nid(page_to_maddr(pg)), tainted = 0;
     unsigned int zone = page_to_zone(pg);
 
@@ -975,38 +1025,7 @@ static void free_heap_pages(
         midsize_alloc_zone_pages = max(
             midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);
 
-    /* Merge chunks as far as possible. */
-    while ( order < MAX_ORDER )
-    {
-        mask = 1UL << order;
-
-        if ( (page_to_mfn(pg) & mask) )
-        {
-            /* Merge with predecessor block? */
-            if ( !mfn_valid(_mfn(page_to_mfn(pg-mask))) ||
-                 !page_state_is(pg-mask, free) ||
-                 (PFN_ORDER(pg-mask) != order) ||
-                 (phys_to_nid(page_to_maddr(pg-mask)) != node) )
-                break;
-            pg -= mask;
-            page_list_del(pg, &heap(node, zone, order));
-        }
-        else
-        {
-            /* Merge with successor block? */
-            if ( !mfn_valid(_mfn(page_to_mfn(pg+mask))) ||
-                 !page_state_is(pg+mask, free) ||
-                 (PFN_ORDER(pg+mask) != order) ||
-                 (phys_to_nid(page_to_maddr(pg+mask)) != node) )
-                break;
-            page_list_del(pg + mask, &heap(node, zone, order));
-        }
-
-        order++;
-    }
-
-    PFN_ORDER(pg) = order;
-    page_list_add_tail(pg, &heap(node, zone, order));
+    merge_chunks(pg, node, zone, order);
 
     if ( tainted )
         reserve_offlined_page(pg);
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
  2017-03-24 17:04 ` [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine Boris Ostrovsky
@ 2017-03-24 17:04 ` Boris Ostrovsky
  2017-03-28 19:27   ` Wei Liu
  2017-03-24 17:04 ` [PATCH v1 3/9] mm: Scrub pages in alloc_heap_pages() if needed Boris Ostrovsky
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:04 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

... so that it is easy to find pages that need to be scrubbed (such pages are
now marked with the _PGC_need_scrub bit).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c  |  103 +++++++++++++++++++++++++++++++++++++---------
 xen/include/asm-arm/mm.h |    4 ++
 xen/include/asm-x86/mm.h |    4 ++
 3 files changed, 91 insertions(+), 20 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 7931903..a28eb38 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -383,6 +383,8 @@ typedef struct page_list_head heap_by_zone_and_order_t[NR_ZONES][MAX_ORDER+1];
 static heap_by_zone_and_order_t *_heap[MAX_NUMNODES];
 #define heap(node, zone, order) ((*_heap[node])[zone][order])
 
+static unsigned long node_need_scrub[MAX_NUMNODES];
+
 static unsigned long *avail[MAX_NUMNODES];
 static long total_avail_pages;
 
@@ -807,7 +809,7 @@ static struct page_info *alloc_heap_pages(
     while ( j != order )
     {
         PFN_ORDER(pg) = --j;
-        page_list_add_tail(pg, &heap(node, zone, j));
+        page_list_add(pg, &heap(node, zone, j));
         pg += 1 << j;
     }
 
@@ -827,6 +829,8 @@ static struct page_info *alloc_heap_pages(
         BUG_ON(pg[i].count_info != PGC_state_free);
         pg[i].count_info = PGC_state_inuse;
 
+        BUG_ON(test_bit(_PGC_need_scrub, &pg[i].count_info));
+
         if ( !(memflags & MEMF_no_tlbflush) )
             accumulate_tlbflush(&need_tlbflush, &pg[i],
                                 &tlbflush_timestamp);
@@ -856,6 +860,7 @@ static int reserve_offlined_page(struct page_info *head)
     int zone = page_to_zone(head), i, head_order = PFN_ORDER(head), count = 0;
     struct page_info *cur_head;
     int cur_order;
+    bool_t need_scrub = !!test_bit(_PGC_need_scrub, &head->count_info);
 
     ASSERT(spin_is_locked(&heap_lock));
 
@@ -897,7 +902,13 @@ static int reserve_offlined_page(struct page_info *head)
             {
             merge:
                 /* We don't consider merging outside the head_order. */
-                page_list_add_tail(cur_head, &heap(node, zone, cur_order));
+                if ( need_scrub )
+                {
+                    cur_head->count_info |= PGC_need_scrub;
+                    page_list_add_tail(cur_head, &heap(node, zone, cur_order));
+                }
+                else
+                    page_list_add(cur_head, &heap(node, zone, cur_order));
                 PFN_ORDER(cur_head) = cur_order;
                 cur_head += (1 << cur_order);
                 break;
@@ -925,7 +936,7 @@ static int reserve_offlined_page(struct page_info *head)
 }
 
 static bool_t can_merge(struct page_info *buddy, unsigned int node,
-                        unsigned int order)
+                        unsigned int order, bool_t need_scrub)
 {
     if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
          !page_state_is(buddy, free) ||
@@ -933,12 +944,18 @@ static bool_t can_merge(struct page_info *buddy, unsigned int node,
          (phys_to_nid(page_to_maddr(buddy)) != node) )
         return 0;
 
+    if ( need_scrub !=
+         !!test_bit(_PGC_need_scrub, &buddy->count_info) )
+        return 0;
+
     return 1;
 }
 
 static void merge_chunks(struct page_info *pg, unsigned int node,
                          unsigned int zone, unsigned int order)
 {
+    bool_t need_scrub = !!test_bit(_PGC_need_scrub, &pg->count_info);
+
     ASSERT(spin_is_locked(&heap_lock));
 
     /* Merge chunks as far as possible. */
@@ -951,9 +968,10 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
         {
             /* Merge with predecessor block? */
             buddy = pg - mask;
-            if ( !can_merge(buddy, node, order) )
+            if ( !can_merge(buddy, node, order, need_scrub) )
                 break;
 
+            pg->count_info &= ~PGC_need_scrub;
             pg = buddy;
             page_list_del(pg, &heap(node, zone, order));
         }
@@ -961,9 +979,10 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
         {
             /* Merge with successor block? */
             buddy = pg + mask;
-            if ( !can_merge(buddy, node, order) )
+            if ( !can_merge(buddy, node, order, need_scrub) )
                 break;
 
+            buddy->count_info &= ~PGC_need_scrub;
             page_list_del(buddy, &heap(node, zone, order));
         }
 
@@ -971,12 +990,54 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
     }
 
     PFN_ORDER(pg) = order;
-    page_list_add_tail(pg, &heap(node, zone, order));
+    if ( need_scrub )
+    {
+        pg->count_info |= PGC_need_scrub;
+        page_list_add_tail(pg, &heap(node, zone, order));
+    }
+    else
+        page_list_add(pg, &heap(node, zone, order));
+}
+
+static void scrub_free_pages(unsigned int node)
+{
+    struct page_info *pg;
+    unsigned int i, zone;
+    int order;
+
+    ASSERT(spin_is_locked(&heap_lock));
+
+    if ( !node_need_scrub[node] )
+        return;
+
+    for ( zone = 0; zone < NR_ZONES; zone++ )
+    {
+        for ( order = MAX_ORDER; order >= 0; order-- )
+        {
+            while ( !page_list_empty(&heap(node, zone, order)) )
+            {
+                /* Unscrubbed pages are always at the end of the list. */
+                pg = page_list_last(&heap(node, zone, order));
+                if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
+                    break;
+
+                for ( i = 0; i < (1UL << order); i++)
+                    scrub_one_page(&pg[i]);
+
+                pg->count_info &= ~PGC_need_scrub;
+
+                page_list_del(pg, &heap(node, zone, order));
+                merge_chunks(pg, node, zone, order);
+
+                node_need_scrub[node] -= (1UL << order);
+            }
+        }
+    }
 }
 
 /* Free 2^@order set of pages. */
 static void free_heap_pages(
-    struct page_info *pg, unsigned int order)
+    struct page_info *pg, unsigned int order, bool_t need_scrub)
 {
     unsigned long mfn = page_to_mfn(pg);
     unsigned int i, node = phys_to_nid(page_to_maddr(pg)), tainted = 0;
@@ -1025,11 +1086,20 @@ static void free_heap_pages(
         midsize_alloc_zone_pages = max(
             midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);
 
+    if ( need_scrub && !tainted )
+    {
+        pg->count_info |= PGC_need_scrub;
+        node_need_scrub[node] += (1UL << order);
+    }
+
     merge_chunks(pg, node, zone, order);
 
     if ( tainted )
         reserve_offlined_page(pg);
 
+    if ( need_scrub )
+        scrub_free_pages(node);
+
     spin_unlock(&heap_lock);
 }
 
@@ -1250,7 +1320,7 @@ unsigned int online_page(unsigned long mfn, uint32_t *status)
     spin_unlock(&heap_lock);
 
     if ( (y & PGC_state) == PGC_state_offlined )
-        free_heap_pages(pg, 0);
+        free_heap_pages(pg, 0, 0);
 
     return ret;
 }
@@ -1319,7 +1389,7 @@ static void init_heap_pages(
             nr_pages -= n;
         }
 
-        free_heap_pages(pg+i, 0);
+        free_heap_pages(pg + i, 0, 0);
     }
 }
 
@@ -1646,7 +1716,7 @@ void free_xenheap_pages(void *v, unsigned int order)
 
     memguard_guard_range(v, 1 << (order + PAGE_SHIFT));
 
-    free_heap_pages(virt_to_page(v), order);
+    free_heap_pages(virt_to_page(v), order, 0);
 }
 
 #else
@@ -1700,12 +1770,9 @@ void free_xenheap_pages(void *v, unsigned int order)
     pg = virt_to_page(v);
 
     for ( i = 0; i < (1u << order); i++ )
-    {
-        scrub_one_page(&pg[i]);
         pg[i].count_info &= ~PGC_xen_heap;
-    }
 
-    free_heap_pages(pg, order);
+    free_heap_pages(pg, order, 1);
 }
 
 #endif
@@ -1814,7 +1881,7 @@ struct page_info *alloc_domheap_pages(
     if ( d && !(memflags & MEMF_no_owner) &&
          assign_pages(d, pg, order, memflags) )
     {
-        free_heap_pages(pg, order);
+        free_heap_pages(pg, order, 0);
         return NULL;
     }
     
@@ -1882,11 +1949,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
             scrub = 1;
         }
 
-        if ( unlikely(scrub) )
-            for ( i = 0; i < (1 << order); i++ )
-                scrub_one_page(&pg[i]);
-
-        free_heap_pages(pg, order);
+        free_heap_pages(pg, order, scrub);
     }
 
     if ( drop_dom_ref )
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 60ccbf3..52a03a0 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -113,6 +113,10 @@ struct page_info
 #define PGC_count_width   PG_shift(9)
 #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
 
+/* Page needs to be scrubbed */
+#define _PGC_need_scrub   PG_shift(10)
+#define PGC_need_scrub    PG_mask(1, 10)
+
 extern unsigned long xenheap_mfn_start, xenheap_mfn_end;
 extern vaddr_t xenheap_virt_end;
 #ifdef CONFIG_ARM_64
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index e22603c..f3d4443 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -233,6 +233,10 @@ struct page_info
 #define PGC_count_width   PG_shift(9)
 #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
 
+/* Page needs to be scrubbed */
+#define _PGC_need_scrub   PG_shift(10)
+#define PGC_need_scrub    PG_mask(1, 10)
+
 struct spage_info
 {
        unsigned long type_info;
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 3/9] mm: Scrub pages in alloc_heap_pages() if needed
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
  2017-03-24 17:04 ` [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine Boris Ostrovsky
  2017-03-24 17:04 ` [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist Boris Ostrovsky
@ 2017-03-24 17:04 ` Boris Ostrovsky
  2017-03-28 19:43   ` Wei Liu
  2017-03-24 17:04 ` [PATCH v1 4/9] mm: Scrub memory from idle loop Boris Ostrovsky
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:04 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

When allocating pages in alloc_heap_pages(), first look for clean pages. If
none are found, retry, taking pages marked as unscrubbed and scrubbing them.

Note that we shouldn't find unscrubbed pages in alloc_heap_pages() yet.
However, this will become possible once we stop scrubbing from
free_heap_pages() and instead do it from the idle loop.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c |   93 +++++++++++++++++++++++++++++++++++------------
 1 files changed, 69 insertions(+), 24 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index a28eb38..61f873a 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -691,34 +691,17 @@ static struct page_info *alloc_heap_pages(
     unsigned int order, unsigned int memflags,
     struct domain *d)
 {
-    unsigned int i, j, zone = 0, nodemask_retry = 0;
-    nodeid_t first_node, node = MEMF_get_node(memflags), req_node = node;
+    unsigned int i, j, zone, nodemask_retry;
+    nodeid_t first_node, node, req_node;
     unsigned long request = 1UL << order;
     struct page_info *pg;
-    nodemask_t nodemask = (d != NULL ) ? d->node_affinity : node_online_map;
-    bool_t need_tlbflush = 0;
+    nodemask_t nodemask;
+    bool_t need_scrub, need_tlbflush = 0, use_unscrubbed = 0;
     uint32_t tlbflush_timestamp = 0;
 
     /* Make sure there are enough bits in memflags for nodeID. */
     BUILD_BUG_ON((_MEMF_bits - _MEMF_node) < (8 * sizeof(nodeid_t)));
 
-    if ( node == NUMA_NO_NODE )
-    {
-        if ( d != NULL )
-        {
-            node = next_node(d->last_alloc_node, nodemask);
-            if ( node >= MAX_NUMNODES )
-                node = first_node(nodemask);
-        }
-        if ( node >= MAX_NUMNODES )
-            node = cpu_to_node(smp_processor_id());
-    }
-    first_node = node;
-
-    ASSERT(node < MAX_NUMNODES);
-    ASSERT(zone_lo <= zone_hi);
-    ASSERT(zone_hi < NR_ZONES);
-
     if ( unlikely(order > MAX_ORDER) )
         return NULL;
 
@@ -732,7 +715,10 @@ static struct page_info *alloc_heap_pages(
           total_avail_pages + tmem_freeable_pages()) &&
           ((memflags & MEMF_no_refcount) ||
            !d || d->outstanding_pages < request) )
-        goto not_found;
+    {
+        spin_unlock(&heap_lock);
+        return NULL;
+    }
 
     /*
      * TMEM: When available memory is scarce due to tmem absorbing it, allow
@@ -745,6 +731,28 @@ static struct page_info *alloc_heap_pages(
          tmem_freeable_pages() )
         goto try_tmem;
 
+ again:
+
+    nodemask_retry = 0;
+    nodemask = (d != NULL ) ? d->node_affinity : node_online_map;
+    node = req_node = MEMF_get_node(memflags);
+    if ( node == NUMA_NO_NODE )
+    {
+        if ( d != NULL )
+        {
+            node = next_node(d->last_alloc_node, nodemask);
+            if ( node >= MAX_NUMNODES )
+                node = first_node(nodemask);
+        }
+        if ( node >= MAX_NUMNODES )
+            node = cpu_to_node(smp_processor_id());
+    }
+    first_node = node;
+
+    ASSERT(node < MAX_NUMNODES);
+    ASSERT(zone_lo <= zone_hi);
+    ASSERT(zone_hi < NR_ZONES);
+
     /*
      * Start with requested node, but exhaust all node memory in requested 
      * zone before failing, only calc new node value if we fail to find memory 
@@ -760,8 +768,16 @@ static struct page_info *alloc_heap_pages(
 
             /* Find smallest order which can satisfy the request. */
             for ( j = order; j <= MAX_ORDER; j++ )
+            {
                 if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
-                    goto found;
+                {
+                    if ( (order == 0) || use_unscrubbed ||
+                         !test_bit(_PGC_need_scrub, &pg->count_info) )
+                        goto found;
+
+                    page_list_add_tail(pg, &heap(node, zone, j));
+                }
+            }
         } while ( zone-- > zone_lo ); /* careful: unsigned zone may wrap */
 
         if ( (memflags & MEMF_exact_node) && req_node != NUMA_NO_NODE )
@@ -800,18 +816,38 @@ static struct page_info *alloc_heap_pages(
     }
 
  not_found:
+    /*
+     * If we couldn't find clean page let's search again and this time
+     * take unscrubbed pages if available.
+     */
+    if ( !use_unscrubbed )
+    {
+        use_unscrubbed = 1;
+        goto again;
+    }
+
     /* No suitable memory blocks. Fail the request. */
     spin_unlock(&heap_lock);
     return NULL;
 
  found: 
+    need_scrub = !!test_bit(_PGC_need_scrub, &pg->count_info);
+
     /* We may have to halve the chunk a number of times. */
     while ( j != order )
     {
         PFN_ORDER(pg) = --j;
-        page_list_add(pg, &heap(node, zone, j));
+        if ( need_scrub )
+        {
+            pg->count_info |= PGC_need_scrub;
+            page_list_add_tail(pg, &heap(node, zone, j));
+        }
+        else
+            page_list_add(pg, &heap(node, zone, j));
         pg += 1 << j;
     }
+    if ( need_scrub )
+        pg->count_info |= PGC_need_scrub;
 
     ASSERT(avail[node][zone] >= request);
     avail[node][zone] -= request;
@@ -823,6 +859,15 @@ static struct page_info *alloc_heap_pages(
     if ( d != NULL )
         d->last_alloc_node = node;
 
+    if ( need_scrub )
+    {
+        for ( i = 0; i < (1 << order); i++ )
+            scrub_one_page(&pg[i]);
+        pg->count_info &= ~PGC_need_scrub;
+        node_need_scrub[node] -= (1 << order);
+    }
+
+
     for ( i = 0; i < (1 << order); i++ )
     {
         /* Reference count must continuously be zero for free pages. */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 4/9] mm: Scrub memory from idle loop
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
                   ` (2 preceding siblings ...)
  2017-03-24 17:04 ` [PATCH v1 3/9] mm: Scrub pages in alloc_heap_pages() if needed Boris Ostrovsky
@ 2017-03-24 17:04 ` Boris Ostrovsky
  2017-03-28 20:01   ` Wei Liu
  2017-03-24 17:05 ` [PATCH v1 5/9] mm: Do not discard already-scrubbed pages if softirqs are pending Boris Ostrovsky
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:04 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/arm/domain.c   |   13 ++++++++-----
 xen/arch/x86/domain.c   |    3 ++-
 xen/common/page_alloc.c |   41 +++++++++++++++++++++++++++++++++--------
 xen/include/xen/mm.h    |    1 +
 4 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bb327da..fdf06e1 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -45,13 +45,16 @@ void idle_loop(void)
         if ( cpu_is_offline(smp_processor_id()) )
             stop_cpu();
 
-        local_irq_disable();
-        if ( cpu_is_haltable(smp_processor_id()) )
+        if ( !scrub_free_pages() )
         {
-            dsb(sy);
-            wfi();
+            local_irq_disable();
+            if ( cpu_is_haltable(smp_processor_id()) )
+            {
+                dsb(sy);
+                wfi();
+            }
+            local_irq_enable();
         }
-        local_irq_enable();
 
         do_tasklet();
         do_softirq();
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 90e2b1f..a5f62b5 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -118,7 +118,8 @@ static void idle_loop(void)
     {
         if ( cpu_is_offline(smp_processor_id()) )
             play_dead();
-        (*pm_idle)();
+        if ( !scrub_free_pages() )
+            (*pm_idle)();
         do_tasklet();
         do_softirq();
         /*
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 61f873a..e438547 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1044,16 +1044,35 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
         page_list_add(pg, &heap(node, zone, order));
 }
 
-static void scrub_free_pages(unsigned int node)
+bool_t scrub_free_pages()
 {
     struct page_info *pg;
     unsigned int i, zone;
-    int order;
+    int order, cpu = smp_processor_id();
+    nodeid_t node = cpu_to_node(cpu), local_node;
+    static nodemask_t node_scrubbing;
 
-    ASSERT(spin_is_locked(&heap_lock));
+    if ( node == NUMA_NO_NODE )
+        node = 0;
+    local_node = node;
 
-    if ( !node_need_scrub[node] )
-        return;
+    while ( 1 )
+    {
+        if ( node_need_scrub[node] && !node_test_and_set(node, node_scrubbing) )
+            break;
+
+        /*
+         * If local node is already being scrubbed then see if there is a
+         * memory-only node that needs scrubbing.
+         */
+        do {
+            node = cycle_node(node, node_online_map);
+            if ( node == local_node )
+                return 0;
+        } while ( !cpumask_empty(&node_to_cpumask(node)) );
+    }
+
+    spin_lock(&heap_lock);
 
     for ( zone = 0; zone < NR_ZONES; zone++ )
     {
@@ -1067,7 +1086,11 @@ static void scrub_free_pages(unsigned int node)
                     break;
 
                 for ( i = 0; i < (1UL << order); i++)
+                {
                     scrub_one_page(&pg[i]);
+                    if ( softirq_pending(cpu) )
+                        goto out;
+                }
 
                 pg->count_info &= ~PGC_need_scrub;
 
@@ -1078,6 +1101,11 @@ static void scrub_free_pages(unsigned int node)
             }
         }
     }
+
+ out:
+    spin_unlock(&heap_lock);
+    node_clear(node, node_scrubbing);
+    return (node_need_scrub[node] != 0);
 }
 
 /* Free 2^@order set of pages. */
@@ -1142,9 +1170,6 @@ static void free_heap_pages(
     if ( tainted )
         reserve_offlined_page(pg);
 
-    if ( need_scrub )
-        scrub_free_pages(node);
-
     spin_unlock(&heap_lock);
 }
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 88de3c1..3d93fcc 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -138,6 +138,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe);
 void xenheap_max_mfn(unsigned long mfn);
 void *alloc_xenheap_pages(unsigned int order, unsigned int memflags);
 void free_xenheap_pages(void *v, unsigned int order);
+bool_t scrub_free_pages(void);
 #define alloc_xenheap_page() (alloc_xenheap_pages(0,0))
 #define free_xenheap_page(v) (free_xenheap_pages(v,0))
 /* Map machine page range in Xen virtual address space. */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 5/9] mm: Do not discard already-scrubbed pages if softirqs are pending
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
                   ` (3 preceding siblings ...)
  2017-03-24 17:04 ` [PATCH v1 4/9] mm: Scrub memory from idle loop Boris Ostrovsky
@ 2017-03-24 17:05 ` Boris Ostrovsky
  2017-03-29 10:22   ` Wei Liu
  2017-03-24 17:05 ` [PATCH v1 6/9] spinlock: Introduce spin_lock_cb() Boris Ostrovsky
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:05 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c |   80 ++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index e438547..e5e6e70 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -997,7 +997,7 @@ static bool_t can_merge(struct page_info *buddy, unsigned int node,
 }
 
 static void merge_chunks(struct page_info *pg, unsigned int node,
-                         unsigned int zone, unsigned int order)
+                         unsigned int zone, unsigned int order, bool_t is_frag)
 {
     bool_t need_scrub = !!test_bit(_PGC_need_scrub, &pg->count_info);
 
@@ -1022,9 +1022,12 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
         }
         else
         {
-            /* Merge with successor block? */
+            /*
+             * Merge with successor block?
+             * Only un-fragmented buddy can be merged forward.
+             */
             buddy = pg + mask;
-            if ( !can_merge(buddy, node, order, need_scrub) )
+            if ( is_frag || !can_merge(buddy, node, order, need_scrub) )
                 break;
 
             buddy->count_info &= ~PGC_need_scrub;
@@ -1044,10 +1047,13 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
         page_list_add(pg, &heap(node, zone, order));
 }
 
+#define SCRUB_CHUNK_ORDER  8
 bool_t scrub_free_pages()
 {
     struct page_info *pg;
     unsigned int i, zone;
+    unsigned int num_scrubbed, scrub_order, start, end;
+    bool_t preempt, is_frag;
     int order, cpu = smp_processor_id();
     nodeid_t node = cpu_to_node(cpu), local_node;
     static nodemask_t node_scrubbing;
@@ -1072,6 +1078,7 @@ bool_t scrub_free_pages()
         } while ( !cpumask_empty(&node_to_cpumask(node)) );
     }
 
+    preempt = false;
     spin_lock(&heap_lock);
 
     for ( zone = 0; zone < NR_ZONES; zone++ )
@@ -1085,19 +1092,64 @@ bool_t scrub_free_pages()
                 if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
                     break;
 
-                for ( i = 0; i < (1UL << order); i++)
+                page_list_del(pg, &heap(node, zone, order));
+
+                scrub_order = MIN(order, SCRUB_CHUNK_ORDER);
+                num_scrubbed = 0;
+                is_frag = false;
+                while ( num_scrubbed < (1 << order) )
                 {
-                    scrub_one_page(&pg[i]);
-                    if ( softirq_pending(cpu) )
-                        goto out;
-                }
+                    for ( i = 0; i < (1 << scrub_order); i++ )
+                        scrub_one_page(&pg[num_scrubbed + i]);
 
+                    num_scrubbed += (1 << scrub_order);
+                    if ( softirq_pending(cpu) )
+                    {
+                        preempt = 1;
+                        is_frag = (num_scrubbed < (1 << order));
+                        break;
+                    }
+                 }
+ 
+                start = 0;
+                end = num_scrubbed;
+
+                /* Merge clean pages */
                 pg->count_info &= ~PGC_need_scrub;
-
-                page_list_del(pg, &heap(node, zone, order));
-                merge_chunks(pg, node, zone, order);
-
-                node_need_scrub[node] -= (1UL << order);
+                while ( start < end )
+                {
+                    /* 
+                     * Largest power-of-two chunk starting @start,
+                     * not greater than @end
+                     */
+                    unsigned chunk_order = flsl(end - start) - 1;
+                    struct page_info *ppg = &pg[start];
+
+                    node_need_scrub[node] -= (1 << chunk_order);
+
+                    PFN_ORDER(ppg) = chunk_order;
+                    merge_chunks(ppg, node, zone, chunk_order, is_frag);
+                    start += (1 << chunk_order);
+                }
+ 
+                /* Merge unscrubbed pages */
+                while ( end < (1 << order) )
+                {
+                    /*
+                     * Largest power-of-two chunk starting @end, not crossing
+                     * next power-of-two boundary
+                     */
+                    unsigned chunk_order = ffsl(end) - 1;
+                    struct page_info *ppg = &pg[end];
+
+                    PFN_ORDER(ppg) = chunk_order;
+                    ppg->count_info |= PGC_need_scrub;
+                    merge_chunks(ppg, node, zone, chunk_order, 1);
+                    end += (1 << chunk_order);
+                 }
+
+                if ( preempt )
+                    goto out;
             }
         }
     }
@@ -1165,7 +1217,7 @@ static void free_heap_pages(
         node_need_scrub[node] += (1UL << order);
     }
 
-    merge_chunks(pg, node, zone, order);
+    merge_chunks(pg, node, zone, order, 0);
 
     if ( tainted )
         reserve_offlined_page(pg);
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 6/9] spinlock: Introduce spin_lock_cb()
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
                   ` (4 preceding siblings ...)
  2017-03-24 17:05 ` [PATCH v1 5/9] mm: Do not discard already-scrubbed pages if softirqs are pending Boris Ostrovsky
@ 2017-03-24 17:05 ` Boris Ostrovsky
  2017-03-29 10:28   ` Wei Liu
  2017-03-24 17:05 ` [PATCH v1 7/9] mm: Keep pages available for allocation while scrubbing Boris Ostrovsky
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:05 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

While waiting for a lock we may want to periodically run some
code. We could use spin_trylock(), but since it doesn't take a lock
ticket it may take a long time until the lock is acquired.

Add spin_lock_cb() that allows us to execute a callback while waiting.
Also add spin_lock_kick() that will wake up the waiters.
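
A minimal usage sketch (illustrative only; lock_wait_cb(),
process_pending_work(), some_lock and some_state are placeholders, not part
of this patch):

    static void lock_wait_cb(void *data)
    {
        /* Invoked repeatedly while we spin without owning the lock yet. */
        process_pending_work(data);
    }

    void example(void)
    {
        spin_lock_cb(&some_lock, lock_wait_cb, &some_state);
        /* ... critical section ... */
        spin_unlock(&some_lock);
    }

A CPU that wants the callback to run promptly, for instance after publishing
new state for it to observe, calls spin_lock_kick(). Patch 7 uses exactly this
pairing to ask the scrubber to abort.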

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/spinlock.c      |   20 ++++++++++++++++++++
 xen/include/xen/spinlock.h |    3 +++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 2a06406..d1de3ca 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -129,6 +129,26 @@ static always_inline u16 observe_head(spinlock_tickets_t *t)
     return read_atomic(&t->head);
 }
 
+void _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data)
+{
+    spinlock_tickets_t tickets = SPINLOCK_TICKET_INC;
+    LOCK_PROFILE_VAR;
+
+    check_lock(&lock->debug);
+    tickets.head_tail = arch_fetch_and_add(&lock->tickets.head_tail,
+                                           tickets.head_tail);
+    while ( tickets.tail != observe_head(&lock->tickets) )
+    {
+        LOCK_PROFILE_BLOCK;
+        if ( cb )
+            cb(data);
+        arch_lock_relax();
+    }
+    LOCK_PROFILE_GOT;
+    preempt_disable();
+    arch_lock_acquire_barrier();
+}
+
 void _spin_lock(spinlock_t *lock)
 {
     spinlock_tickets_t tickets = SPINLOCK_TICKET_INC;
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index c1883bd..1cd91b7 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -153,6 +153,7 @@ typedef struct spinlock {
 #define spin_lock_init(l) (*(l) = (spinlock_t)SPIN_LOCK_UNLOCKED)
 
 void _spin_lock(spinlock_t *lock);
+void _spin_lock_cb(spinlock_t *lock, void (*cond)(void *), void *data);
 void _spin_lock_irq(spinlock_t *lock);
 unsigned long _spin_lock_irqsave(spinlock_t *lock);
 
@@ -169,6 +170,8 @@ void _spin_lock_recursive(spinlock_t *lock);
 void _spin_unlock_recursive(spinlock_t *lock);
 
 #define spin_lock(l)                  _spin_lock(l)
+#define spin_lock_cb(l, c, d)         _spin_lock_cb(l, c, d)
+#define spin_lock_kick(l)             arch_lock_signal()
 #define spin_lock_irq(l)              _spin_lock_irq(l)
 #define spin_lock_irqsave(l, f)                                 \
     ({                                                          \
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 7/9] mm: Keep pages available for allocation while scrubbing
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
                   ` (5 preceding siblings ...)
  2017-03-24 17:05 ` [PATCH v1 6/9] spinlock: Introduce spin_lock_cb() Boris Ostrovsky
@ 2017-03-24 17:05 ` Boris Ostrovsky
  2017-03-24 17:05 ` [PATCH v1 8/9] mm: Print number of unscrubbed pages in 'H' debug handler Boris Ostrovsky
  2017-03-24 17:05 ` [PATCH v1 9/9] mm: Make sure pages are scrubbed Boris Ostrovsky
  8 siblings, 0 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:05 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

Instead of scrubbing pages while holding the heap lock, we can mark the
buddy's head as being scrubbed and drop the lock temporarily. If someone
(most likely alloc_heap_pages()) tries to access this chunk, it signals the
scrubber to abort the scrub by setting the head's PAGE_SCRUB_ABORT bit. The
scrubber checks this bit after processing each page and stops its work as
soon as it sees the bit set.
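
Condensed, the handshake implemented by the patch below looks roughly like
this (list handling, preemption and some of the scrub_wait_state bookkeeping
are simplified; see the diff for the real code):

    /* Allocator side, called with the heap lock held
     * (condensed from check_and_stop_scrub() below). */
    static void stop_scrub_sketch(struct page_info *head)
    {
        if ( head->u.free.scrub_state & PAGE_SCRUBBING )
        {
            head->u.free.scrub_state |= PAGE_SCRUB_ABORT;
            smp_mb();
            spin_lock_kick();               /* wake the scrubbing CPU */
            while ( ACCESS_ONCE(head->u.free.scrub_state) & PAGE_SCRUB_ABORT )
                cpu_relax();                /* wait until it lets go */
        }
    }

    /* Scrubber side (condensed from scrub_free_pages() below). Returns
     * 1 if the whole buddy was scrubbed, 0 if we gave it away. */
    static bool_t scrub_buddy_sketch(struct page_info *pg, unsigned int order)
    {
        struct scrub_wait_state st = { .pg = pg, .drop = false };
        unsigned int i;

        pg->u.free.scrub_state = PAGE_SCRUBBING;
        spin_unlock(&heap_lock);

        for ( i = 0; i < (1U << order); i++ )
        {
            scrub_one_page(&pg[i]);
            if ( ACCESS_ONCE(pg->u.free.scrub_state) & PAGE_SCRUB_ABORT )
            {
                pg->u.free.scrub_state = 0; /* hand the chunk to the waiter */
                return 0;
            }
        }

        /* Re-take the heap lock; scrub_continue() releases the chunk if an
         * abort raced with the last page. */
        spin_lock_cb(&heap_lock, scrub_continue, &st);
        return !st.drop;
    }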

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c  |   78 +++++++++++++++++++++++++++++++++++++++++++--
 xen/include/asm-arm/mm.h |    4 ++
 xen/include/asm-x86/mm.h |    4 ++
 3 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index e5e6e70..df28090 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -685,6 +685,18 @@ static void check_low_mem_virq(void)
     }
 }
 
+static void check_and_stop_scrub(struct page_info *head)
+{
+    if ( head->u.free.scrub_state & PAGE_SCRUBBING )
+    {
+        head->u.free.scrub_state |= PAGE_SCRUB_ABORT;
+        smp_mb();
+        spin_lock_kick();
+        while ( ACCESS_ONCE(head->u.free.scrub_state) & PAGE_SCRUB_ABORT )
+            cpu_relax();
+    }
+}
+
 /* Allocate 2^@order contiguous pages. */
 static struct page_info *alloc_heap_pages(
     unsigned int zone_lo, unsigned int zone_hi,
@@ -771,10 +783,15 @@ static struct page_info *alloc_heap_pages(
             {
                 if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
                 {
-                    if ( (order == 0) || use_unscrubbed ||
-                         !test_bit(_PGC_need_scrub, &pg->count_info) )
+                    if ( !test_bit(_PGC_need_scrub, &pg[0].count_info) )
                         goto found;
 
+                    if ( (order == 0) || use_unscrubbed )
+                    {
+                        check_and_stop_scrub(pg);
+                        goto found;
+                    }
+
                     page_list_add_tail(pg, &heap(node, zone, j));
                 }
             }
@@ -911,6 +928,8 @@ static int reserve_offlined_page(struct page_info *head)
 
     cur_head = head;
 
+    check_and_stop_scrub(head);
+
     page_list_del(head, &heap(node, zone, head_order));
 
     while ( cur_head < (head + (1 << head_order)) )
@@ -993,6 +1012,9 @@ static bool_t can_merge(struct page_info *buddy, unsigned int node,
          !!test_bit(_PGC_need_scrub, &buddy->count_info) )
         return 0;
 
+    if ( buddy->u.free.scrub_state & PAGE_SCRUBBING )
+        return 0;
+
     return 1;
 }
 
@@ -1048,12 +1070,34 @@ static void merge_chunks(struct page_info *pg, unsigned int node,
 }
 
 #define SCRUB_CHUNK_ORDER  8
+
+struct scrub_wait_state {
+    struct page_info *pg;
+    bool_t drop;
+};
+
+static void scrub_continue(void *data)
+{
+    struct scrub_wait_state *st = (struct scrub_wait_state *)data;
+
+    if ( st->drop )
+        return;
+
+    if ( st->pg->u.free.scrub_state & PAGE_SCRUB_ABORT )
+    {
+        /* There is a waiter for this chunk. Release it. */
+        st->drop = true;
+        st->pg->u.free.scrub_state = 0;
+    }
+}
+
 bool_t scrub_free_pages()
 {
     struct page_info *pg;
     unsigned int i, zone;
     unsigned int num_scrubbed, scrub_order, start, end;
     bool_t preempt, is_frag;
+    struct scrub_wait_state st;
     int order, cpu = smp_processor_id();
     nodeid_t node = cpu_to_node(cpu), local_node;
     static nodemask_t node_scrubbing;
@@ -1092,7 +1136,10 @@ bool_t scrub_free_pages()
                 if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
                     break;
 
-                page_list_del(pg, &heap(node, zone, order));
+                ASSERT(!pg->u.free.scrub_state);
+                pg->u.free.scrub_state = PAGE_SCRUBBING;
+
+                spin_unlock(&heap_lock);
 
                 scrub_order = MIN(order, SCRUB_CHUNK_ORDER);
                 num_scrubbed = 0;
@@ -1100,7 +1147,15 @@ bool_t scrub_free_pages()
                 while ( num_scrubbed < (1 << order) )
                 {
                     for ( i = 0; i < (1 << scrub_order); i++ )
+                    {
                         scrub_one_page(&pg[num_scrubbed + i]);
+                        if ( ACCESS_ONCE(pg->u.free.scrub_state) & PAGE_SCRUB_ABORT )
+                        {
+                            /* Someone wants this chunk. Drop everything. */
+                            pg->u.free.scrub_state = 0;
+                            goto out_nolock;
+                        }
+                    }
 
                     num_scrubbed += (1 << scrub_order);
                     if ( softirq_pending(cpu) )
@@ -1110,7 +1165,16 @@ bool_t scrub_free_pages()
                         break;
                     }
                  }
- 
+
+                st.pg = pg;
+                st.drop = false;
+                spin_lock_cb(&heap_lock, scrub_continue, &st);
+
+                if ( st.drop )
+                    goto out;
+
+                page_list_del(pg, &heap(node, zone, order));
+
                 start = 0;
                 end = num_scrubbed;
 
@@ -1148,6 +1212,8 @@ bool_t scrub_free_pages()
                     end += (1 << chunk_order);
                  }
 
+                pg->u.free.scrub_state = 0;
+
                 if ( preempt )
                     goto out;
             }
@@ -1156,6 +1222,8 @@ bool_t scrub_free_pages()
 
  out:
     spin_unlock(&heap_lock);
+
+ out_nolock:
     node_clear(node, node_scrubbing);
     return (node_need_scrub[node] != 0);
 }
@@ -1194,6 +1262,8 @@ static void free_heap_pages(
         if ( page_state_is(&pg[i], offlined) )
             tainted = 1;
 
+        pg[i].u.free.scrub_state=0;
+
         /* If a page has no owner it will need no safety TLB flush. */
         pg[i].u.free.need_tlbflush = (page_get_owner(&pg[i]) != NULL);
         if ( pg[i].u.free.need_tlbflush )
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 52a03a0..1752d44 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -41,6 +41,10 @@ struct page_info
         } inuse;
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
         struct {
+#define PAGE_SCRUBBING      (1<<1)
+#define PAGE_SCRUB_ABORT    (1<<2)
+            unsigned char scrub_state;
+
             /* Do TLBs need flushing for safety before next page use? */
             bool_t need_tlbflush;
         } free;
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index f3d4443..31e53e9 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -87,6 +87,10 @@ struct page_info
 
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
         struct {
+#define PAGE_SCRUBBING      (1<<1)
+#define PAGE_SCRUB_ABORT    (1<<2)
+            unsigned char scrub_state;
+
             /* Do TLBs need flushing for safety before next page use? */
             bool_t need_tlbflush;
         } free;
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 8/9] mm: Print number of unscrubbed pages in 'H' debug handler
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
                   ` (6 preceding siblings ...)
  2017-03-24 17:05 ` [PATCH v1 7/9] mm: Keep pages available for allocation while scrubbing Boris Ostrovsky
@ 2017-03-24 17:05 ` Boris Ostrovsky
  2017-03-28 20:11   ` Wei Liu
  2017-03-24 17:05 ` [PATCH v1 9/9] mm: Make sure pages are scrubbed Boris Ostrovsky
  8 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:05 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index df28090..7dfbd8c 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2240,6 +2240,13 @@ static void dump_heap(unsigned char key)
             printk("heap[node=%d][zone=%d] -> %lu pages\n",
                    i, j, avail[i][j]);
     }
+
+    for ( i = 0; i < MAX_NUMNODES; i++ )
+    {
+        if ( !node_need_scrub[i] )
+            continue;
+        printk("Node %d has %lu unscrubbed pages\n", i, node_need_scrub[i]);
+    }
 }
 
 static __init int register_heap_trigger(void)
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v1 9/9] mm: Make sure pages are scrubbed
  2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
                   ` (7 preceding siblings ...)
  2017-03-24 17:05 ` [PATCH v1 8/9] mm: Print number of unscrubbed pages in 'H' debug handler Boris Ostrovsky
@ 2017-03-24 17:05 ` Boris Ostrovsky
  2017-03-29 10:39   ` Wei Liu
  2017-03-29 16:25   ` Wei Liu
  8 siblings, 2 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-24 17:05 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, tim, jbeulich, Boris Ostrovsky

Add a debug Kconfig option that makes the page allocator verify
that pages that were supposed to be scrubbed are, in fact, clean.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/Kconfig.debug       |    7 +++++++
 xen/common/page_alloc.c |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
index 8139564..35b86ae 100644
--- a/xen/Kconfig.debug
+++ b/xen/Kconfig.debug
@@ -113,6 +113,13 @@ config DEVICE_TREE_DEBUG
 	  logged in the Xen ring buffer.
 	  If unsure, say N here.
 
+config SCRUB_DEBUG
+       bool "Page scrubbing test"
+       default y
+       ---help---
+          Verify that pages that need to be scrubbed before being allocated to
+	  a guest are indeed scrubbed.
+
 endif # DEBUG || EXPERT
 
 endmenu
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 7dfbd8c..8140446 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -685,6 +685,29 @@ static void check_low_mem_virq(void)
     }
 }
 
+#ifdef CONFIG_SCRUB_DEBUG
+#define PAGE_POISON 0xbad0bad0bad0bad0ULL
+static void poison_one_page(struct page_info *pg)
+{
+    mfn_t mfn = _mfn(page_to_mfn(pg));
+    uint64_t *ptr;
+
+    ptr  = map_domain_page(mfn);
+    *ptr = PAGE_POISON;
+    unmap_domain_page(ptr);
+}
+
+static void check_one_page(struct page_info *pg)
+{
+    mfn_t mfn = _mfn(page_to_mfn(pg));
+    uint64_t *ptr;
+
+    ptr  = map_domain_page(mfn);
+    ASSERT(*ptr != PAGE_POISON);
+    unmap_domain_page(ptr);
+}
+#endif /* CONFIG_SCRUB_DEBUG */
+
 static void check_and_stop_scrub(struct page_info *head)
 {
     if ( head->u.free.scrub_state & PAGE_SCRUBBING )
@@ -905,6 +928,11 @@ static struct page_info *alloc_heap_pages(
          * guest can control its own visibility of/through the cache.
          */
         flush_page_to_ram(page_to_mfn(&pg[i]));
+
+#ifdef CONFIG_SCRUB_DEBUG
+        if ( d && !is_idle_domain(d) )
+            check_one_page(&pg[i]);
+#endif
     }
 
     spin_unlock(&heap_lock);
@@ -1285,6 +1313,11 @@ static void free_heap_pages(
     {
         pg->count_info |= PGC_need_scrub;
         node_need_scrub[node] += (1UL << order);
+
+#ifdef CONFIG_SCRUB_DEBUG
+        for ( i = 0; i < (1 << order); i++ )
+            poison_one_page(&pg[i]);
+#endif
     }
 
     merge_chunks(pg, node, zone, order, 0);
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-24 17:04 ` [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine Boris Ostrovsky
@ 2017-03-27 15:16   ` Wei Liu
  2017-03-27 16:03     ` Jan Beulich
  2017-03-28 19:20   ` Wei Liu
  1 sibling, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-27 15:16 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
> This is needed for subsequent changes to memory scrubbing. No
> logic change, only code re-factoring.
> 
> Based on earlier patch by Bob Liu.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  xen/common/page_alloc.c |   85 ++++++++++++++++++++++++++++------------------
>  1 files changed, 52 insertions(+), 33 deletions(-)
> 
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 42c20cb..7931903 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -924,11 +924,61 @@ static int reserve_offlined_page(struct page_info *head)
>      return count;
>  }
>  
> +static bool_t can_merge(struct page_info *buddy, unsigned int node,

Plain bool please.

> +                        unsigned int order)
> +{
> +    if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
> +         !page_state_is(buddy, free) ||
> +         (PFN_ORDER(buddy) != order) ||
> +         (phys_to_nid(page_to_maddr(buddy)) != node) )
> +        return 0;
> +
> +    return 1;

True and false.

Other than those:

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-27 15:16   ` Wei Liu
@ 2017-03-27 16:03     ` Jan Beulich
  2017-03-27 16:28       ` Boris Ostrovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2017-03-27 16:03 UTC (permalink / raw)
  To: wei.liu2, Boris Ostrovsky
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson, xen-devel

>>> On 27.03.17 at 17:16, <wei.liu2@citrix.com> wrote:
> On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -924,11 +924,61 @@ static int reserve_offlined_page(struct page_info *head)
>>      return count;
>>  }
>>  
>> +static bool_t can_merge(struct page_info *buddy, unsigned int node,
> 
> Plain bool please.
> 
>> +                        unsigned int order)
>> +{
>> +    if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
>> +         !page_state_is(buddy, free) ||
>> +         (PFN_ORDER(buddy) != order) ||
>> +         (phys_to_nid(page_to_maddr(buddy)) != node) )
>> +        return 0;
>> +
>> +    return 1;
> 
> True and false.

Actually there's no point in having two return statements here in
the first place; the value of the expression (suitably inverted) can
be the operand of return.
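
For illustration, the shape being suggested here (not part of the submitted
patch) is simply:

    static bool can_merge(struct page_info *buddy, unsigned int node,
                          unsigned int order)
    {
        return mfn_valid(_mfn(page_to_mfn(buddy))) &&
               page_state_is(buddy, free) &&
               (PFN_ORDER(buddy) == order) &&
               (phys_to_nid(page_to_maddr(buddy)) == node);
    }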

Jan



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-27 16:03     ` Jan Beulich
@ 2017-03-27 16:28       ` Boris Ostrovsky
  2017-03-28  7:27         ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-27 16:28 UTC (permalink / raw)
  To: Jan Beulich, wei.liu2
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson, xen-devel

On 03/27/2017 12:03 PM, Jan Beulich wrote:
>>>> On 27.03.17 at 17:16, <wei.liu2@citrix.com> wrote:
>> On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -924,11 +924,61 @@ static int reserve_offlined_page(struct page_info *head)
>>>      return count;
>>>  }
>>>  
>>> +static bool_t can_merge(struct page_info *buddy, unsigned int node,
>> Plain bool please.
>>
>>> +                        unsigned int order)
>>> +{
>>> +    if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
>>> +         !page_state_is(buddy, free) ||
>>> +         (PFN_ORDER(buddy) != order) ||
>>> +         (phys_to_nid(page_to_maddr(buddy)) != node) )
>>> +        return 0;
>>> +
>>> +    return 1;
>> True and false.
> Actually there's no point in having two return statements here in
> the first place; the value of the expression (suitably inverted) can
> be the operand of return.

Further in the series this routine is expanded with more checks and I
kept those checks separate since I felt they make it more readable.

I can certainly merge them all together if people think that it's better
to have a single return expression.

-boris



* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-27 16:28       ` Boris Ostrovsky
@ 2017-03-28  7:27         ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2017-03-28  7:27 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel

>>> On 27.03.17 at 18:28, <boris.ostrovsky@oracle.com> wrote:
> On 03/27/2017 12:03 PM, Jan Beulich wrote:
>>>>> On 27.03.17 at 17:16, <wei.liu2@citrix.com> wrote:
>>> On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
>>>> --- a/xen/common/page_alloc.c
>>>> +++ b/xen/common/page_alloc.c
>>>> @@ -924,11 +924,61 @@ static int reserve_offlined_page(struct page_info *head)
>>>>      return count;
>>>>  }
>>>>  
>>>> +static bool_t can_merge(struct page_info *buddy, unsigned int node,
>>> Plain bool please.
>>>
>>>> +                        unsigned int order)
>>>> +{
>>>> +    if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
>>>> +         !page_state_is(buddy, free) ||
>>>> +         (PFN_ORDER(buddy) != order) ||
>>>> +         (phys_to_nid(page_to_maddr(buddy)) != node) )
>>>> +        return 0;
>>>> +
>>>> +    return 1;
>>> True and false.
>> Actually there's no point in having two return statements here in
>> the first place; the value of the expression (suitably inverted) can
>> be the operand of return.
> 
> Further in the series this routine is expanded with more checks and I
> kept those checks separate since I felt they make it more readable.
> 
> I can certainly merge them all together if people think that it's better
> to have a single return expression.

Well, that depends on how those additions are being made: If the
if() above gets expanded, then the single return statement could
be expanded as well. If, however, you add further return-s (for
clarity, I'd assume), then keeping it the way above (with true/false
used as requested by Wei) would be fine with me.

Jan



* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-24 17:04 ` [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine Boris Ostrovsky
  2017-03-27 15:16   ` Wei Liu
@ 2017-03-28 19:20   ` Wei Liu
  2017-03-28 19:41     ` Boris Ostrovsky
  1 sibling, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-28 19:20 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
> This is needed for subsequent changes to memory scrubbing. No
> logic change, only code re-factoring.

Actually there is a slight change in logic: pg and order could be
updated in the original merge code. With this patch you still pass the
original pg to reserve_offlined_page if tainted is true. I don't think
this matters in terms of correctness, but it is worth pointing out.

Wei.


* Re: [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist
  2017-03-24 17:04 ` [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist Boris Ostrovsky
@ 2017-03-28 19:27   ` Wei Liu
  2017-03-28 19:46     ` Boris Ostrovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-28 19:27 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:04:57PM -0400, Boris Ostrovsky wrote:
> . so that it's easy to find pages that need to be scrubbed (those pages are
> now marked with _PGC_need_scrub bit).

s/bool_t/bool/ throughout and use true and false where appropriate.

>  
>  /* Free 2^@order set of pages. */
>  static void free_heap_pages(
> -    struct page_info *pg, unsigned int order)
> +    struct page_info *pg, unsigned int order, bool_t need_scrub)
>  {
>      unsigned long mfn = page_to_mfn(pg);
>      unsigned int i, node = phys_to_nid(page_to_maddr(pg)), tainted = 0;
> @@ -1025,11 +1086,20 @@ static void free_heap_pages(
>          midsize_alloc_zone_pages = max(
>              midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);
>  
> +    if ( need_scrub && !tainted )
> +    {
> +        pg->count_info |= PGC_need_scrub;
> +        node_need_scrub[node] += (1UL << order);
> +    }
> +

I think this is wrong: you shouldn't have !tainted here, otherwise
reserve_offlined_page won't know the page needs to be scrubbed,
resulting in all sub-pages not marked as needing scrub.

>      merge_chunks(pg, node, zone, order);
>  
>      if ( tainted )
>          reserve_offlined_page(pg);
>  
> +    if ( need_scrub )
> +        scrub_free_pages(node);
> +
>      spin_unlock(&heap_lock);
>  }

Wei.


* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-28 19:20   ` Wei Liu
@ 2017-03-28 19:41     ` Boris Ostrovsky
  2017-03-28 19:44       ` Wei Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-28 19:41 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/28/2017 03:20 PM, Wei Liu wrote:
> On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
>> This is needed for subsequent changes to memory scrubbing. No
>> logic change, only code re-factoring.
> Actually there is a slight change in logic: pg and order could be
> updated in the original merge code. With this patch you still pass the
> original pg to reserve_offlined_page if tainted is true. I don't think
> this matters in terms of correctness, but it is worth pointing out.

Actually, for pg it does matter since reserve_offlined_page() is passed
buddy head. So I should make merge_chunks() return new head.

('order' is not used after merge_chunks(), so that's OK).
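
Roughly, the change being discussed would look like this (the exact
prototype is an assumption; the call matches the one quoted elsewhere
in the thread):

    /* Sketch: merge_chunks() hands back the (possibly updated) buddy head. */
    static struct page_info *merge_chunks(struct page_info *pg, unsigned int node,
                                          unsigned int zone, unsigned int order)
    {
        /* ... existing merging loop, updating pg as buddies are merged ... */
        return pg;
    }

    /* Caller in free_heap_pages(): */
    pg = merge_chunks(pg, node, zone, order);
    if ( tainted )
        reserve_offlined_page(pg);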

-boris


* Re: [PATCH v1 3/9] mm: Scrub pages in alloc_heap_pages() if needed
  2017-03-24 17:04 ` [PATCH v1 3/9] mm: Scrub pages in alloc_heap_pages() if needed Boris Ostrovsky
@ 2017-03-28 19:43   ` Wei Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Wei Liu @ 2017-03-28 19:43 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:04:58PM -0400, Boris Ostrovsky wrote:
> When allocating pages in alloc_heap_pages() first look for clean pages. If none
> is found then retry, take pages marked as unscrubbed and scrub them.
> 
> Note that we shouldn't find unscrubbed pages in alloc_heap_pages() yet. However,
> this will become possible when we stop scrubbing from free_heap_pages() and
> instead do it from idle loop.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Again, s/bool_t/bool/.

>   found: 
> +    need_scrub = !!test_bit(_PGC_need_scrub, &pg->count_info);
> +
>      /* We may have to halve the chunk a number of times. */
>      while ( j != order )
>      {
>          PFN_ORDER(pg) = --j;
> -        page_list_add(pg, &heap(node, zone, j));
> +        if ( need_scrub )
> +        {
> +            pg->count_info |= PGC_need_scrub;
> +            page_list_add_tail(pg, &heap(node, zone, j));
> +        }
> +        else
> +            page_list_add(pg, &heap(node, zone, j));

This is getting repetitive. Please consider adding a function.


    /* Pages that need scrubbing are added to the tail, otherwise to the head. */
    static void add_to_page_list(struct page_info *pg, unsigned int node,
                                 unsigned int zone, unsigned int order,
                                 bool need_scrub)
    {
        if ( need_scrub )
        {
            pg->count_info |= PGC_need_scrub;
            page_list_add_tail(pg, &heap(node, zone, order));
        }
        else
            page_list_add(pg, &heap(node, zone, order));
    }

Might be more appropriate to add it in previous patch and replace all
plain page_list_add{,_tail} with it.

>          pg += 1 << j;
>      }
> +    if ( need_scrub )
> +        pg->count_info |= PGC_need_scrub;
>  
>      ASSERT(avail[node][zone] >= request);
>      avail[node][zone] -= request;
> @@ -823,6 +859,15 @@ static struct page_info *alloc_heap_pages(
>      if ( d != NULL )
>          d->last_alloc_node = node;
>  
> +    if ( need_scrub )
> +    {
> +        for ( i = 0; i < (1 << order); i++ )
> +            scrub_one_page(&pg[i]);
> +        pg->count_info &= ~PGC_need_scrub;
> +        node_need_scrub[node] -= (1 << order);
> +    }
> +
> +
>      for ( i = 0; i < (1 << order); i++ )
>      {
>          /* Reference count must continuously be zero for free pages. */
> -- 
> 1.7.1
> 


* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-28 19:41     ` Boris Ostrovsky
@ 2017-03-28 19:44       ` Wei Liu
  2017-03-28 19:57         ` Boris Ostrovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-28 19:44 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, Wei Liu, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Tue, Mar 28, 2017 at 03:41:10PM -0400, Boris Ostrovsky wrote:
> On 03/28/2017 03:20 PM, Wei Liu wrote:
> > On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
> >> This is needed for subsequent changes to memory scrubbing. No
> >> logic change, only code re-factoring.
> > Actually there is a slight change in logic: pg and order could be
> > updated in the original merge code. With this patch you still pass the
> > original pg to reserve_offlined_page if tainted is true. I don't think
> > this matters in terms of correctness, but it is worth pointing out.
> 
> Actually, for pg it does matter since reserve_offlined_page() is passed
> buddy head. So I should make merge_chunks() return new head.
> 

OK. I think the downside is reserve_offlined_page doesn't merge as many
pages as it should have, so it is still correct, just the result is
suboptimal, but I could be wrong -- it's a bit late here. :-)

I think returning the updated pg is probably best.

Wei.

> ('order' is not used after merge_chunks(), so that's OK).
> 
> -boris


* Re: [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist
  2017-03-28 19:27   ` Wei Liu
@ 2017-03-28 19:46     ` Boris Ostrovsky
  0 siblings, 0 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-28 19:46 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/28/2017 03:27 PM, Wei Liu wrote:
> On Fri, Mar 24, 2017 at 01:04:57PM -0400, Boris Ostrovsky wrote:
>> . so that it's easy to find pages that need to be scrubbed (those pages are
>> now marked with _PGC_need_scrub bit).
> s/bool_t/bool/ throughout and use true and false where appropriate.
>
>>  
>>  /* Free 2^@order set of pages. */
>>  static void free_heap_pages(
>> -    struct page_info *pg, unsigned int order)
>> +    struct page_info *pg, unsigned int order, bool_t need_scrub)
>>  {
>>      unsigned long mfn = page_to_mfn(pg);
>>      unsigned int i, node = phys_to_nid(page_to_maddr(pg)), tainted = 0;
>> @@ -1025,11 +1086,20 @@ static void free_heap_pages(
>>          midsize_alloc_zone_pages = max(
>>              midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);
>>  
>> +    if ( need_scrub && !tainted )
>> +    {
>> +        pg->count_info |= PGC_need_scrub;
>> +        node_need_scrub[node] += (1UL << order);
>> +    }
>> +
> I think this is wrong: you shouldn't have !tainted here, otherwise
> reserve_offlined_page won't know the page needs to be scrubbed,
> resulting in all sub-pages not marked as needing scrub.

Right. This looks like a leftover from an earlier version of this series
and I never removed the !tainted test.
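
With that dropped, the hunk would read (sketch):

    if ( need_scrub )
    {
        pg->count_info |= PGC_need_scrub;
        node_need_scrub[node] += (1UL << order);
    }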

-boris

>>      merge_chunks(pg, node, zone, order);
>>  
>>      if ( tainted )
>>          reserve_offlined_page(pg);
>>  
>> +    if ( need_scrub )
>> +        scrub_free_pages(node);
>> +
>>      spin_unlock(&heap_lock);
>>  }
> Wei.



* Re: [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine
  2017-03-28 19:44       ` Wei Liu
@ 2017-03-28 19:57         ` Boris Ostrovsky
  0 siblings, 0 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-28 19:57 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/28/2017 03:44 PM, Wei Liu wrote:
> On Tue, Mar 28, 2017 at 03:41:10PM -0400, Boris Ostrovsky wrote:
>> On 03/28/2017 03:20 PM, Wei Liu wrote:
>>> On Fri, Mar 24, 2017 at 01:04:56PM -0400, Boris Ostrovsky wrote:
>>>> This is needed for subsequent changes to memory scrubbing. No
>>>> logic change, only code re-factoring.
>>> Actually there is a slight change in logic: pg and order could be
>>> updated in the original merge code. With this patch you still pass the
>>> original pg to reserve_offlined_page if tainted is true. I don't think
>>> this matters in terms of correctness, but it is worth pointing out.
>> Actually, for pg it does matter since reserve_offlined_page() is passed
>> buddy head. So I should make merge_chunks() return new head.
>>
> OK. I think the downside is reserve_offlined_page doesn't merge as many
> pages as it should have, so it is still correct, just the result is
> suboptimal, but I could be wrong -- it's a bit late here. :-)

You are wrong that my patch was right. ;-)

I was passing the old head which, if it was merged back in merge_chunks(),
would have lost its PGC_need_scrub bit. And so when reserve_offlined_page()
tries to re-merge the buddy's pieces, all need_scrub info would be lost.

Again, this would have been OK with the previous version of the series,
where every page in the buddy was marked with PGC_need_scrub. But this is
no longer the case.

-boris


>
> I think returning the updated pg is probably best.
>
> Wei.
>
>> ('order' is not used after merge_chunks(), so that's OK).
>>
>> -boris



* Re: [PATCH v1 4/9] mm: Scrub memory from idle loop
  2017-03-24 17:04 ` [PATCH v1 4/9] mm: Scrub memory from idle loop Boris Ostrovsky
@ 2017-03-28 20:01   ` Wei Liu
  2017-03-28 20:14     ` Boris Ostrovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-28 20:01 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:04:59PM -0400, Boris Ostrovsky wrote:
> @@ -1067,7 +1086,11 @@ static void scrub_free_pages(unsigned int node)
>                      break;
>  
>                  for ( i = 0; i < (1UL << order); i++)
> +                {
>                      scrub_one_page(&pg[i]);
> +                    if ( softirq_pending(cpu) )
> +                        goto out;
> +                }
>  
>                  pg->count_info &= ~PGC_need_scrub;
>  
> @@ -1078,6 +1101,11 @@ static void scrub_free_pages(unsigned int node)
>              }
>          }
>      }
> +
> + out:
> +    spin_unlock(&heap_lock);
> +    node_clear(node, node_scrubbing);
> +    return (node_need_scrub[node] != 0);

I think you also need to have "&& !softirq_pending(cpu)" here.
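
I.e., roughly:

    /* Also bail if a softirq is pending. */
    return (node_need_scrub[node] != 0) && !softirq_pending(cpu);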

Wei.


* Re: [PATCH v1 8/9] mm: Print number of unscrubbed pages in 'H' debug handler
  2017-03-24 17:05 ` [PATCH v1 8/9] mm: Print number of unscrubbed pages in 'H' debug handler Boris Ostrovsky
@ 2017-03-28 20:11   ` Wei Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Wei Liu @ 2017-03-28 20:11 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:05:03PM -0400, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v1 4/9] mm: Scrub memory from idle loop
  2017-03-28 20:01   ` Wei Liu
@ 2017-03-28 20:14     ` Boris Ostrovsky
  0 siblings, 0 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-28 20:14 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/28/2017 04:01 PM, Wei Liu wrote:
> On Fri, Mar 24, 2017 at 01:04:59PM -0400, Boris Ostrovsky wrote:
>> @@ -1067,7 +1086,11 @@ static void scrub_free_pages(unsigned int node)
>>                      break;
>>  
>>                  for ( i = 0; i < (1UL << order); i++)
>> +                {
>>                      scrub_one_page(&pg[i]);
>> +                    if ( softirq_pending(cpu) )
>> +                        goto out;
>> +                }
>>  
>>                  pg->count_info &= ~PGC_need_scrub;
>>  
>> @@ -1078,6 +1101,11 @@ static void scrub_free_pages(unsigned int node)
>>              }
>>          }
>>      }
>> +
>> + out:
>> +    spin_unlock(&heap_lock);
>> +    node_clear(node, node_scrubbing);
>> +    return (node_need_scrub[node] != 0);
> I think you also need to have "&& !softirq_pending(cpu)" here.

Good point.

I should probably also check memory-only nodes here.

-boris


* Re: [PATCH v1 5/9] mm: Do not discard already-scrubbed pages if softirqs are pending
  2017-03-24 17:05 ` [PATCH v1 5/9] mm: Do not discard already-scrubbed pages if softirqs are pending Boris Ostrovsky
@ 2017-03-29 10:22   ` Wei Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Wei Liu @ 2017-03-29 10:22 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:05:00PM -0400, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

I think you're going to update patch 1. Given this patch also uses
merge_chunks I think I will wait for a new version.


* Re: [PATCH v1 6/9] spinlock: Introduce spin_lock_cb()
  2017-03-24 17:05 ` [PATCH v1 6/9] spinlock: Introduce spin_lock_cb() Boris Ostrovsky
@ 2017-03-29 10:28   ` Wei Liu
  2017-03-29 13:47     ` Boris Ostrovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-29 10:28 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:05:01PM -0400, Boris Ostrovsky wrote:
> While waiting for a lock we may want to periodically run some
> code. We could use spin_trylock() but since it doesn't take lock
> ticket it may take a long time until the lock is taken.
> 
> Add spin_lock_cb() that allows us to execute a callback while waiting.
> Also add spin_lock_kick() that will wake up the waiters.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  xen/common/spinlock.c      |   20 ++++++++++++++++++++
>  xen/include/xen/spinlock.h |    3 +++
>  2 files changed, 23 insertions(+), 0 deletions(-)
> 
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index 2a06406..d1de3ca 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -129,6 +129,26 @@ static always_inline u16 observe_head(spinlock_tickets_t *t)
>      return read_atomic(&t->head);
>  }
>  
> +void _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data)
> +{
> +    spinlock_tickets_t tickets = SPINLOCK_TICKET_INC;
> +    LOCK_PROFILE_VAR;
> +
> +    check_lock(&lock->debug);
> +    tickets.head_tail = arch_fetch_and_add(&lock->tickets.head_tail,
> +                                           tickets.head_tail);
> +    while ( tickets.tail != observe_head(&lock->tickets) )
> +    {
> +        LOCK_PROFILE_BLOCK;
> +        if ( cb )
> +            cb(data);
> +        arch_lock_relax();
> +    }
> +    LOCK_PROFILE_GOT;
> +    preempt_disable();
> +    arch_lock_acquire_barrier();
> +}
> +
>  void _spin_lock(spinlock_t *lock)

You should be able to use _spin_lock_cb to implement _spin_lock, right?

Wei.


* Re: [PATCH v1 9/9] mm: Make sure pages are scrubbed
  2017-03-24 17:05 ` [PATCH v1 9/9] mm: Make sure pages are scrubbed Boris Ostrovsky
@ 2017-03-29 10:39   ` Wei Liu
  2017-03-29 16:25   ` Wei Liu
  1 sibling, 0 replies; 33+ messages in thread
From: Wei Liu @ 2017-03-29 10:39 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:05:04PM -0400, Boris Ostrovsky wrote:
> Add a debug Kconfig option that will make page allocator verify
> that pages that were supposed to be scrubbed are, in fact, clean.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  xen/Kconfig.debug       |    7 +++++++
>  xen/common/page_alloc.c |   33 +++++++++++++++++++++++++++++++++
>  2 files changed, 40 insertions(+), 0 deletions(-)
> 
> diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
> index 8139564..35b86ae 100644
> --- a/xen/Kconfig.debug
> +++ b/xen/Kconfig.debug
> @@ -113,6 +113,13 @@ config DEVICE_TREE_DEBUG
>  	  logged in the Xen ring buffer.
>  	  If unsure, say N here.
>  
> +config SCRUB_DEBUG
> +       bool "Page scrubbing test"
> +       default y

default debug

> +       ---help---
> +          Verify that pages that need to be scrubbed before being allocated to
> +	  a guest are indeed scrubbed.

Indentation.


* Re: [PATCH v1 6/9] spinlock: Introduce spin_lock_cb()
  2017-03-29 10:28   ` Wei Liu
@ 2017-03-29 13:47     ` Boris Ostrovsky
  2017-03-29 14:07       ` Boris Ostrovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-29 13:47 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/29/2017 06:28 AM, Wei Liu wrote:
> On Fri, Mar 24, 2017 at 01:05:01PM -0400, Boris Ostrovsky wrote:
>> While waiting for a lock we may want to periodically run some
>> code. We could use spin_trylock() but since it doesn't take lock
>> ticket it may take a long time until the lock is taken.
>>
>> Add spin_lock_cb() that allows us to execute a callback while waiting.
>> Also add spin_lock_kick() that will wake up the waiters.
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>  xen/common/spinlock.c      |   20 ++++++++++++++++++++
>>  xen/include/xen/spinlock.h |    3 +++
>>  2 files changed, 23 insertions(+), 0 deletions(-)
>>
>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
>> index 2a06406..d1de3ca 100644
>> --- a/xen/common/spinlock.c
>> +++ b/xen/common/spinlock.c
>> @@ -129,6 +129,26 @@ static always_inline u16 observe_head(spinlock_tickets_t *t)
>>      return read_atomic(&t->head);
>>  }
>>  
>> +void _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data)
>> +{
>> +    spinlock_tickets_t tickets = SPINLOCK_TICKET_INC;
>> +    LOCK_PROFILE_VAR;
>> +
>> +    check_lock(&lock->debug);
>> +    tickets.head_tail = arch_fetch_and_add(&lock->tickets.head_tail,
>> +                                           tickets.head_tail);
>> +    while ( tickets.tail != observe_head(&lock->tickets) )
>> +    {
>> +        LOCK_PROFILE_BLOCK;
>> +        if ( cb )
>> +            cb(data);
>> +        arch_lock_relax();
>> +    }
>> +    LOCK_PROFILE_GOT;
>> +    preempt_disable();
>> +    arch_lock_acquire_barrier();
>> +}
>> +
>>  void _spin_lock(spinlock_t *lock)
> You should be able to use _spin_lock_cb to implement _spin_lock, right?


I did consider this but decided not to do it because we'd be adding a
few extra instructions and a call on potentially hot path.

(And doing it as a #define of a spin_lock() would make things even worse).

-boris



* Re: [PATCH v1 6/9] spinlock: Introduce spin_lock_cb()
  2017-03-29 13:47     ` Boris Ostrovsky
@ 2017-03-29 14:07       ` Boris Ostrovsky
  0 siblings, 0 replies; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-29 14:07 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/29/2017 09:47 AM, Boris Ostrovsky wrote:
> On 03/29/2017 06:28 AM, Wei Liu wrote:
>> On Fri, Mar 24, 2017 at 01:05:01PM -0400, Boris Ostrovsky wrote:
>>> While waiting for a lock we may want to periodically run some
>>> code. We could use spin_trylock() but since it doesn't take lock
>>> ticket it may take a long time until the lock is taken.
>>>
>>> Add spin_lock_cb() that allows us to execute a callback while waiting.
>>> Also add spin_lock_kick() that will wake up the waiters.
>>>
>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>> ---
>>>  xen/common/spinlock.c      |   20 ++++++++++++++++++++
>>>  xen/include/xen/spinlock.h |    3 +++
>>>  2 files changed, 23 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
>>> index 2a06406..d1de3ca 100644
>>> --- a/xen/common/spinlock.c
>>> +++ b/xen/common/spinlock.c
>>> @@ -129,6 +129,26 @@ static always_inline u16 observe_head(spinlock_tickets_t *t)
>>>      return read_atomic(&t->head);
>>>  }
>>>  
>>> +void _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data)
>>> +{
>>> +    spinlock_tickets_t tickets = SPINLOCK_TICKET_INC;
>>> +    LOCK_PROFILE_VAR;
>>> +
>>> +    check_lock(&lock->debug);
>>> +    tickets.head_tail = arch_fetch_and_add(&lock->tickets.head_tail,
>>> +                                           tickets.head_tail);
>>> +    while ( tickets.tail != observe_head(&lock->tickets) )
>>> +    {
>>> +        LOCK_PROFILE_BLOCK;
>>> +        if ( cb )
>>> +            cb(data);
>>> +        arch_lock_relax();
>>> +    }
>>> +    LOCK_PROFILE_GOT;
>>> +    preempt_disable();
>>> +    arch_lock_acquire_barrier();
>>> +}
>>> +
>>>  void _spin_lock(spinlock_t *lock)
>> You should be able to use _spin_lock_cb to implement _spin_lock, right?
>
> I did consider this but decided not to do it because we'd be adding a
> few extra instructions and a call on potentially hot path.
>
> (And doing it as a #define of a spin_lock() would make things even worse).

That said, declaring _spin_lock_cb() as an inline makes the generated
assembly look essentially the same as with the current _spin_lock() (I
don't care about the extra check inside the loop since that's a slow path).

So maybe I can indeed do this.
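
For reference, a minimal sketch (relying on the NULL-callback check in
the loop above):

    void _spin_lock(spinlock_t *lock)
    {
        /* A NULL callback simply skips the cb(data) call while spinning. */
        _spin_lock_cb(lock, NULL, NULL);
    }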

-boris


* Re: [PATCH v1 9/9] mm: Make sure pages are scrubbed
  2017-03-24 17:05 ` [PATCH v1 9/9] mm: Make sure pages are scrubbed Boris Ostrovsky
  2017-03-29 10:39   ` Wei Liu
@ 2017-03-29 16:25   ` Wei Liu
  2017-03-29 16:35     ` Boris Ostrovsky
  1 sibling, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-29 16:25 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Fri, Mar 24, 2017 at 01:05:04PM -0400, Boris Ostrovsky wrote:
> +static void check_one_page(struct page_info *pg)
> +{
> +    mfn_t mfn = _mfn(page_to_mfn(pg));
> +    uint64_t *ptr;
> +
> +    ptr  = map_domain_page(mfn);
> +    ASSERT(*ptr != PAGE_POISON);

Should be ASSERT(*ptr == 0) here.


* Re: [PATCH v1 9/9] mm: Make sure pages are scrubbed
  2017-03-29 16:25   ` Wei Liu
@ 2017-03-29 16:35     ` Boris Ostrovsky
  2017-03-29 16:45       ` Wei Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Boris Ostrovsky @ 2017-03-29 16:35 UTC (permalink / raw)
  To: Wei Liu
  Cc: tim, sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich

On 03/29/2017 12:25 PM, Wei Liu wrote:
> On Fri, Mar 24, 2017 at 01:05:04PM -0400, Boris Ostrovsky wrote:
>> +static void check_one_page(struct page_info *pg)
>> +{
>> +    mfn_t mfn = _mfn(page_to_mfn(pg));
>> +    uint64_t *ptr;
>> +
>> +    ptr  = map_domain_page(mfn);
>> +    ASSERT(*ptr != PAGE_POISON);
> Should be ASSERT(*ptr == 0) here.

We can't assume it will be zero --- see scrub_one_page().

-boris


* Re: [PATCH v1 9/9] mm: Make sure pages are scrubbed
  2017-03-29 16:35     ` Boris Ostrovsky
@ 2017-03-29 16:45       ` Wei Liu
  2017-03-29 17:12         ` Andrew Cooper
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Liu @ 2017-03-29 16:45 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: tim, sstabellini, Wei Liu, George.Dunlap, andrew.cooper3,
	ian.jackson, xen-devel, jbeulich

On Wed, Mar 29, 2017 at 12:35:25PM -0400, Boris Ostrovsky wrote:
> On 03/29/2017 12:25 PM, Wei Liu wrote:
> > On Fri, Mar 24, 2017 at 01:05:04PM -0400, Boris Ostrovsky wrote:
> >> +static void check_one_page(struct page_info *pg)
> >> +{
> >> +    mfn_t mfn = _mfn(page_to_mfn(pg));
> >> +    uint64_t *ptr;
> >> +
> >> +    ptr  = map_domain_page(mfn);
> >> +    ASSERT(*ptr != PAGE_POISON);
> > Should be ASSERT(*ptr == 0) here.
> 
> We can't assume it will be zero --- see scrub_one_page().

It's still possible that a value is leaked from the guest, and that value
has a rather high probability of being != PAGE_POISON.

In that case there should be an ifdef to match the debug and non-debug
builds.

Wei.

> 
> -boris


* Re: [PATCH v1 9/9] mm: Make sure pages are scrubbed
  2017-03-29 16:45       ` Wei Liu
@ 2017-03-29 17:12         ` Andrew Cooper
  0 siblings, 0 replies; 33+ messages in thread
From: Andrew Cooper @ 2017-03-29 17:12 UTC (permalink / raw)
  To: Wei Liu, Boris Ostrovsky
  Cc: sstabellini, George.Dunlap, tim, ian.jackson, xen-devel, jbeulich

On 29/03/17 17:45, Wei Liu wrote:
> On Wed, Mar 29, 2017 at 12:35:25PM -0400, Boris Ostrovsky wrote:
>> On 03/29/2017 12:25 PM, Wei Liu wrote:
>>> On Fri, Mar 24, 2017 at 01:05:04PM -0400, Boris Ostrovsky wrote:
>>>> +static void check_one_page(struct page_info *pg)
>>>> +{
>>>> +    mfn_t mfn = _mfn(page_to_mfn(pg));
>>>> +    uint64_t *ptr;
>>>> +
>>>> +    ptr  = map_domain_page(mfn);
>>>> +    ASSERT(*ptr != PAGE_POISON);
>>> Should be ASSERT(*ptr == 0) here.
>> We can't assume it will be zero --- see scrub_one_page().
> It's still possible a value is leaked from the guest, and that value has
> rather high probability to be != PAGE_POISON.
>
> In that case there should be a ifdef to match the debug and non-debug
> builds.

The only sensible thing to do is to check that the entire page is all
zeroes. This is a debug option after all.

We don't want to waste time poisoning zero pages we hand back to the
free pool.
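
Something along these lines, say (a sketch; the unmap pairing is assumed
from the map above):

    static void check_one_page(struct page_info *pg)
    {
        const uint64_t *ptr = map_domain_page(_mfn(page_to_mfn(pg)));
        unsigned int i;

        /* Every word of a scrubbed page is expected to be zero. */
        for ( i = 0; i < PAGE_SIZE / sizeof(*ptr); i++ )
            ASSERT(!ptr[i]);

        unmap_domain_page(ptr);
    }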

~Andrew


Thread overview: 33+ messages
2017-03-24 17:04 [PATCH v1 0/9] Memory scrubbing from idle loop Boris Ostrovsky
2017-03-24 17:04 ` [PATCH v1 1/9] mm: Separate free page chunk merging into its own routine Boris Ostrovsky
2017-03-27 15:16   ` Wei Liu
2017-03-27 16:03     ` Jan Beulich
2017-03-27 16:28       ` Boris Ostrovsky
2017-03-28  7:27         ` Jan Beulich
2017-03-28 19:20   ` Wei Liu
2017-03-28 19:41     ` Boris Ostrovsky
2017-03-28 19:44       ` Wei Liu
2017-03-28 19:57         ` Boris Ostrovsky
2017-03-24 17:04 ` [PATCH v1 2/9] mm: Place unscrubbed pages at the end of pagelist Boris Ostrovsky
2017-03-28 19:27   ` Wei Liu
2017-03-28 19:46     ` Boris Ostrovsky
2017-03-24 17:04 ` [PATCH v1 3/9] mm: Scrub pages in alloc_heap_pages() if needed Boris Ostrovsky
2017-03-28 19:43   ` Wei Liu
2017-03-24 17:04 ` [PATCH v1 4/9] mm: Scrub memory from idle loop Boris Ostrovsky
2017-03-28 20:01   ` Wei Liu
2017-03-28 20:14     ` Boris Ostrovsky
2017-03-24 17:05 ` [PATCH v1 5/9] mm: Do not discard already-scrubbed pages if softirqs are pending Boris Ostrovsky
2017-03-29 10:22   ` Wei Liu
2017-03-24 17:05 ` [PATCH v1 6/9] spinlock: Introduce spin_lock_cb() Boris Ostrovsky
2017-03-29 10:28   ` Wei Liu
2017-03-29 13:47     ` Boris Ostrovsky
2017-03-29 14:07       ` Boris Ostrovsky
2017-03-24 17:05 ` [PATCH v1 7/9] mm: Keep pages available for allocation while scrubbing Boris Ostrovsky
2017-03-24 17:05 ` [PATCH v1 8/9] mm: Print number of unscrubbed pages in 'H' debug handler Boris Ostrovsky
2017-03-28 20:11   ` Wei Liu
2017-03-24 17:05 ` [PATCH v1 9/9] mm: Make sure pages are scrubbed Boris Ostrovsky
2017-03-29 10:39   ` Wei Liu
2017-03-29 16:25   ` Wei Liu
2017-03-29 16:35     ` Boris Ostrovsky
2017-03-29 16:45       ` Wei Liu
2017-03-29 17:12         ` Andrew Cooper
