xen-devel.lists.xenproject.org archive mirror
* [Xen-devel] [PATCH 0/2] Handle
@ 2020-03-19 21:17 David Woodhouse
  2020-03-19 21:21 ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits David Woodhouse
  0 siblings, 1 reply; 16+ messages in thread
From: David Woodhouse @ 2020-03-19 21:17 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, Xia, Hongyan, Jan Beulich,
	Volodymyr Babchuk, Roger Pau Monné



There are cases where pages can get freed with free_heap_pages() when
in fact they were never properly initialised in the heap: they may have
been allocated from the boot allocator, or simply assigned directly to
dom0 as part of its initrd, etc.

We have plans to make vmap available during early boot, which would
exacerbate this situation a tiny bit more, as a few more page tables
would stand a small chance of being allocated by the boot allocator and
freed later.

Resolve this by introducing a new page state, PGC_state_uninitialised.
This expands PGC_state to 3 bits (8 possible values) by subsuming the
PGC_broken bit into it, eliminating the combinations of PGC_broken with
states for which it was never valid.

Pages which find their way into free_heap_pages() while still in
PGC_state_uninitialised can thus be detected and properly
rehabilitated, basically by passing them through init_heap_pages().
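
In outline, the check added to free_heap_pages() looks like this (see
patch 2 for the exact code):

    if ( page_state_is(pg, uninitialised) )
    {
        /*
         * init_heap_pages() sets each page to PGC_state_inuse before
         * handing it back to free_heap_pages(), so this cannot recurse
         * a second time.
         */
        init_heap_pages(pg, 1 << order, need_scrub);
        return;
    }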


David Woodhouse (2):
      xen/mm: fold PGC_broken into PGC_state bits
      xen/mm: Introduce PGC_state_uninitialised

xen/arch/x86/domctl.c    |   2 +-
 xen/arch/x86/mm.c        |   3 +-
 xen/common/page_alloc.c  | 110 +++++++++++++++++++++++++++++------------------
 xen/include/asm-arm/mm.h |  39 +++++++++++------
 xen/include/asm-x86/mm.h |  37 +++++++++++-----
 5 files changed, 125 insertions(+), 66 deletions(-)


* [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits
  2020-03-19 21:17 [Xen-devel] [PATCH 0/2] Handle David Woodhouse
@ 2020-03-19 21:21 ` David Woodhouse
  2020-03-19 21:21   ` [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised David Woodhouse
                     ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: David Woodhouse @ 2020-03-19 21:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, hongyxia, Jan Beulich,
	Volodymyr Babchuk, Roger Pau Monné

From: David Woodhouse <dwmw@amazon.co.uk>

Only PGC_state_offlining and PGC_state_offlined are valid in conjunction
with PGC_broken. The other two states (free and inuse) were never valid
for a broken page.

By folding PGC_broken in, we can have three bits for PGC_state which
allows up to 8 states, of which 6 are currently used and 2 are available
for new use cases.
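
For reference, the old (PGC_state, PGC_broken) combinations map onto
the new encoding as follows:

    /*
     * Old encoding                       New PGC_state
     * ---------------------------------  --------------------------
     * PGC_state_inuse                    PGC_state_inuse
     * PGC_state_free                     PGC_state_free
     * PGC_state_offlining                PGC_state_offlining
     * PGC_state_offlining | PGC_broken   PGC_state_broken_offlining
     * PGC_state_offlined                 PGC_state_offlined
     * PGC_state_offlined | PGC_broken    PGC_state_broken
     */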

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 xen/arch/x86/domctl.c    |  2 +-
 xen/common/page_alloc.c  | 66 ++++++++++++++++++++++------------------
 xen/include/asm-arm/mm.h | 38 +++++++++++++++--------
 xen/include/asm-x86/mm.h | 36 ++++++++++++++++------
 4 files changed, 89 insertions(+), 53 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index ed86762fa6..a411f64afa 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -422,7 +422,7 @@ long arch_do_domctl(
                 if ( page->u.inuse.type_info & PGT_pinned )
                     type |= XEN_DOMCTL_PFINFO_LPINTAB;
 
-                if ( page->count_info & PGC_broken )
+                if ( page_is_broken(page) )
                     type = XEN_DOMCTL_PFINFO_BROKEN;
             }
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 76d37226df..8d72a64f4e 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1093,7 +1093,7 @@ static int reserve_offlined_page(struct page_info *head)
         struct page_info *pg;
         int next_order;
 
-        if ( page_state_is(cur_head, offlined) )
+        if ( page_is_offlined(cur_head) )
         {
             cur_head++;
             if ( first_dirty != INVALID_DIRTY_IDX && first_dirty )
@@ -1113,7 +1113,7 @@ static int reserve_offlined_page(struct page_info *head)
             for ( i = (1 << cur_order), pg = cur_head + (1 << cur_order );
                   i < (1 << next_order);
                   i++, pg++ )
-                if ( page_state_is(pg, offlined) )
+                if ( page_is_offlined(pg) )
                     break;
             if ( i == ( 1 << next_order) )
             {
@@ -1145,16 +1145,20 @@ static int reserve_offlined_page(struct page_info *head)
 
     for ( cur_head = head; cur_head < head + ( 1UL << head_order); cur_head++ )
     {
-        if ( !page_state_is(cur_head, offlined) )
+        struct page_list_head *list;
+
+        if ( page_state_is(cur_head, offlined) )
+            list = &page_offlined_list;
+        else if ( page_state_is(cur_head, broken) )
+            list = &page_broken_list;
+        else
             continue;
 
         avail[node][zone]--;
         total_avail_pages--;
         ASSERT(total_avail_pages >= 0);
 
-        page_list_add_tail(cur_head,
-                           test_bit(_PGC_broken, &cur_head->count_info) ?
-                           &page_broken_list : &page_offlined_list);
+        page_list_add_tail(cur_head, list);
 
         count++;
     }
@@ -1404,13 +1408,16 @@ static void free_heap_pages(
         switch ( pg[i].count_info & PGC_state )
         {
         case PGC_state_inuse:
-            BUG_ON(pg[i].count_info & PGC_broken);
             pg[i].count_info = PGC_state_free;
             break;
 
         case PGC_state_offlining:
-            pg[i].count_info = (pg[i].count_info & PGC_broken) |
-                               PGC_state_offlined;
+            pg[i].count_info = PGC_state_offlined;
+            tainted = 1;
+            break;
+
+        case PGC_state_broken_offlining:
+            pg[i].count_info = PGC_state_broken;
             tainted = 1;
             break;
 
@@ -1527,16 +1534,16 @@ static unsigned long mark_page_offline(struct page_info *pg, int broken)
     do {
         nx = x = y;
 
-        if ( ((x & PGC_state) != PGC_state_offlined) &&
-             ((x & PGC_state) != PGC_state_offlining) )
-        {
-            nx &= ~PGC_state;
-            nx |= (((x & PGC_state) == PGC_state_free)
-                   ? PGC_state_offlined : PGC_state_offlining);
-        }
+        nx &= ~PGC_state;
 
-        if ( broken )
-            nx |= PGC_broken;
+        /* If it was already broken, it stays broken */
+        if ( pgc_is_broken(x) )
+            broken = 1;
+
+        if ( pgc_is_offlined(x) || pgc_is(x, free) )
+            nx |= broken ? PGC_state_broken : PGC_state_offlined;
+        else
+            nx |= broken ? PGC_state_broken_offlining : PGC_state_offlining;
 
         if ( x == nx )
             break;
@@ -1609,7 +1616,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status)
      * need to prevent malicious guest access the broken page again.
      * Under such case, hypervisor shutdown guest, preventing recursive mce.
      */
-    if ( (pg->count_info & PGC_broken) && (owner = page_get_owner(pg)) )
+    if ( page_is_broken(pg) && (owner = page_get_owner(pg)) )
     {
         *status = PG_OFFLINE_AGAIN;
         domain_crash(owner);
@@ -1620,7 +1627,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status)
 
     old_info = mark_page_offline(pg, broken);
 
-    if ( page_state_is(pg, offlined) )
+    if ( page_is_offlined(pg) )
     {
         reserve_heap_page(pg);
 
@@ -1699,19 +1706,18 @@ unsigned int online_page(mfn_t mfn, uint32_t *status)
     do {
         ret = *status = 0;
 
-        if ( y & PGC_broken )
+        if ( pgc_is_broken(y) )
         {
             ret = -EINVAL;
-            *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN;
+            *status = PG_ONLINE_FAILED | PG_ONLINE_BROKEN;
             break;
         }
-
-        if ( (y & PGC_state) == PGC_state_offlined )
+        else if ( pgc_is(y, offlined) )
         {
             page_list_del(pg, &page_offlined_list);
             *status = PG_ONLINE_ONLINED;
         }
-        else if ( (y & PGC_state) == PGC_state_offlining )
+        else if ( pgc_is(y, offlining) )
         {
             *status = PG_ONLINE_ONLINED;
         }
@@ -1726,7 +1732,7 @@ unsigned int online_page(mfn_t mfn, uint32_t *status)
 
     spin_unlock(&heap_lock);
 
-    if ( (y & PGC_state) == PGC_state_offlined )
+    if ( pgc_is(y, offlined) )
         free_heap_pages(pg, 0, false);
 
     return ret;
@@ -1747,11 +1753,11 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
 
     pg = mfn_to_page(mfn);
 
-    if ( page_state_is(pg, offlining) )
+    if ( page_is_offlining(pg) )
         *status |= PG_OFFLINE_STATUS_OFFLINE_PENDING;
-    if ( pg->count_info & PGC_broken )
+    if ( page_is_broken(pg) )
         *status |= PG_OFFLINE_STATUS_BROKEN;
-    if ( page_state_is(pg, offlined) )
+    if ( page_is_offlined(pg) )
         *status |= PG_OFFLINE_STATUS_OFFLINED;
 
     spin_unlock(&heap_lock);
@@ -2519,7 +2525,7 @@ __initcall(pagealloc_keyhandler_init);
 
 void scrub_one_page(struct page_info *pg)
 {
-    if ( unlikely(pg->count_info & PGC_broken) )
+    if ( unlikely(page_is_broken(pg)) )
         return;
 
 #ifndef NDEBUG
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 7df91280bc..a877791d1c 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -108,21 +108,35 @@ struct page_info
   /* Page is Xen heap? */
 #define _PGC_xen_heap     PG_shift(2)
 #define PGC_xen_heap      PG_mask(1, 2)
-/* ... */
-/* Page is broken? */
-#define _PGC_broken       PG_shift(7)
-#define PGC_broken        PG_mask(1, 7)
- /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
-#define PGC_state         PG_mask(3, 9)
-#define PGC_state_inuse   PG_mask(0, 9)
-#define PGC_state_offlining PG_mask(1, 9)
-#define PGC_state_offlined PG_mask(2, 9)
-#define PGC_state_free    PG_mask(3, 9)
-#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+ /*
+  * Mutually-exclusive page states:
+  * { inuse, offlining, offlined, free, broken_offlining, broken }
+  */
+#define PGC_state                  PG_mask(7, 9)
+#define PGC_state_inuse            PG_mask(0, 9)
+#define PGC_state_offlining        PG_mask(1, 9)
+#define PGC_state_offlined         PG_mask(2, 9)
+#define PGC_state_free             PG_mask(3, 9)
+#define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
+#define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
+
+#define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
+#define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
+
+#define pgc_is_broken(pgc)         (pgc_is(pgc, broken) || \
+                                    pgc_is(pgc, broken_offlining))
+#define pgc_is_offlined(pgc)       (pgc_is(pgc, offlined) || \
+                                    pgc_is(pgc, broken))
+#define pgc_is_offlining(pgc)      (pgc_is(pgc, offlining) || \
+                                    pgc_is(pgc, broken_offlining))
+
+#define page_is_broken(pg)         (pgc_is_broken((pg)->count_info))
+#define page_is_offlined(pg)       (pgc_is_offlined((pg)->count_info))
+#define page_is_offlining(pg)      (pgc_is_offlining((pg)->count_info))
+
 /* Page is not reference counted */
 #define _PGC_extra        PG_shift(10)
 #define PGC_extra         PG_mask(1, 10)
-
 /* Count of references to this frame. */
 #define PGC_count_width   PG_shift(10)
 #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index a06b2fb81f..1203f1b179 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -67,16 +67,32 @@
  /* 3-bit PAT/PCD/PWT cache-attribute hint. */
 #define PGC_cacheattr_base PG_shift(6)
 #define PGC_cacheattr_mask PG_mask(7, 6)
- /* Page is broken? */
-#define _PGC_broken       PG_shift(7)
-#define PGC_broken        PG_mask(1, 7)
- /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
-#define PGC_state         PG_mask(3, 9)
-#define PGC_state_inuse   PG_mask(0, 9)
-#define PGC_state_offlining PG_mask(1, 9)
-#define PGC_state_offlined PG_mask(2, 9)
-#define PGC_state_free    PG_mask(3, 9)
-#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+ /*
+  * Mutually-exclusive page states:
+  * { inuse, offlining, offlined, free, broken_offlining, broken }
+  */
+#define PGC_state                  PG_mask(7, 9)
+#define PGC_state_inuse            PG_mask(0, 9)
+#define PGC_state_offlining        PG_mask(1, 9)
+#define PGC_state_offlined         PG_mask(2, 9)
+#define PGC_state_free             PG_mask(3, 9)
+#define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
+#define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
+
+#define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
+#define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
+
+#define pgc_is_broken(pgc)         (pgc_is(pgc, broken) || \
+                                    pgc_is(pgc, broken_offlining))
+#define pgc_is_offlined(pgc)       (pgc_is(pgc, offlined) || \
+                                    pgc_is(pgc, broken))
+#define pgc_is_offlining(pgc)      (pgc_is(pgc, offlining) || \
+                                    pgc_is(pgc, broken_offlining))
+
+#define page_is_broken(pg)         (pgc_is_broken((pg)->count_info))
+#define page_is_offlined(pg)       (pgc_is_offlined((pg)->count_info))
+#define page_is_offlining(pg)      (pgc_is_offlining((pg)->count_info))
+
 /* Page is not reference counted */
 #define _PGC_extra        PG_shift(10)
 #define PGC_extra         PG_mask(1, 10)
-- 
2.21.0



* [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-19 21:21 ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits David Woodhouse
@ 2020-03-19 21:21   ` David Woodhouse
  2020-03-20 13:33     ` Paul Durrant
  2020-03-31 12:10     ` Jan Beulich
  2020-03-20 13:17   ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits Paul Durrant
  2020-03-31 12:00   ` Jan Beulich
  2 siblings, 2 replies; 16+ messages in thread
From: David Woodhouse @ 2020-03-19 21:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, hongyxia, Jan Beulich,
	Volodymyr Babchuk, Roger Pau Monné

From: David Woodhouse <dwmw@amazon.co.uk>

It is possible for pages to enter general circulation without ever
being processed by init_heap_pages().

For example, pages of the multiboot module containing the initramfs may
be assigned via assign_pages() to dom0 as it is created. Also, some code,
including map_pages_to_xen(), checks 'system_state' to determine whether
to use the boot or the heap allocator, and it seems impossible to prove
that pages allocated by the boot allocator are never subsequently freed
with free_heap_pages().

This actually works fine in the majority of cases; there are only a few
esoteric corner cases which init_heap_pages() handles before handing the
page range off to free_heap_pages():
 • Excluding MFN #0 to avoid inappropriate cross-zone merging.
 • Ensuring that the node information structures exist, when the first
   page(s) of a given node are handled.
 • High order allocations crossing from one node to another.

To handle this case, shift PGC_state_inuse from its current value of
zero to another value, and use zero, which is the initial state of the
entire frame table, as PGC_state_uninitialised.

Fix a couple of assertions which were assuming that PGC_state_inuse is
zero, and make them cope with the PGC_state_uninitialised case too where
appropriate.

Finally, make free_heap_pages() call through to init_heap_pages() when
given a page range which has not been initialised. This cannot keep
recursing because init_heap_pages() will set each page state to
PGC_state_inuse before passing it back to free_heap_pages() for the
second time.
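
Schematically, for a range still in its boot-time (all-zeroes) state:

    free_heap_pages(pg, order, need_scrub)
        -> page_state_is(pg, uninitialised) is true
        -> init_heap_pages(pg, 1 << order, need_scrub)
               /* for each page in the range: */
               pg[i].count_info = PGC_state_inuse;
               free_heap_pages(pg + i, 0, ...);  /* takes the normal path */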

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 xen/arch/x86/mm.c        |  3 ++-
 xen/common/page_alloc.c  | 44 +++++++++++++++++++++++++++++-----------
 xen/include/asm-arm/mm.h |  3 ++-
 xen/include/asm-x86/mm.h |  3 ++-
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 62507ca651..5f0581c072 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
 
     page_set_owner(page, d);
     smp_wmb(); /* install valid domain ptr before updating refcnt. */
-    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
+    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
+           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);
 
     /* Only add to the allocation list if the domain isn't dying. */
     if ( !d->is_dying )
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 8d72a64f4e..4f7971f2a1 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -252,6 +252,8 @@ struct bootmem_region {
 static struct bootmem_region __initdata
     bootmem_region_list[PAGE_SIZE / sizeof(struct bootmem_region)];
 static unsigned int __initdata nr_bootmem_regions;
+static void init_heap_pages(struct page_info *pg, unsigned long nr_pages,
+                            bool scrub);
 
 struct scrub_region {
     unsigned long offset;
@@ -1390,6 +1392,17 @@ static void free_heap_pages(
     ASSERT(order <= MAX_ORDER);
     ASSERT(node >= 0);
 
+    if ( page_state_is(pg, uninitialised) )
+    {
+        init_heap_pages(pg, 1 << order, need_scrub);
+        /*
+         * init_heap_pages() will call back into free_heap_pages() for
+         * each page but cannot keep recursing because each page will
+         * be set to PGC_state_inuse first.
+         */
+        return;
+    }
+
     spin_lock(&heap_lock);
 
     for ( i = 0; i < (1 << order); i++ )
@@ -1771,11 +1784,10 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
  * latter is not on a MAX_ORDER boundary, then we reserve the page by
  * not freeing it to the buddy allocator.
  */
-static void init_heap_pages(
-    struct page_info *pg, unsigned long nr_pages)
+static void init_heap_pages(struct page_info *pg, unsigned long nr_pages,
+                            bool scrub)
 {
     unsigned long i;
-    bool idle_scrub = false;
 
     /*
      * Keep MFN 0 away from the buddy allocator to avoid crossing zone
@@ -1800,7 +1812,7 @@ static void init_heap_pages(
     spin_unlock(&heap_lock);
 
     if ( system_state < SYS_STATE_active && opt_bootscrub == BOOTSCRUB_IDLE )
-        idle_scrub = true;
+        scrub = true;
 
     for ( i = 0; i < nr_pages; i++ )
     {
@@ -1828,7 +1840,8 @@ static void init_heap_pages(
             nr_pages -= n;
         }
 
-        free_heap_pages(pg + i, 0, scrub_debug || idle_scrub);
+        pg[i].count_info = PGC_state_inuse;
+        free_heap_pages(pg + i, 0, scrub_debug || scrub);
     }
 }
 
@@ -1864,7 +1877,7 @@ void __init end_boot_allocator(void)
         if ( (r->s < r->e) &&
              (phys_to_nid(pfn_to_paddr(r->s)) == cpu_to_node(0)) )
         {
-            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
+            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s, false);
             r->e = r->s;
             break;
         }
@@ -1873,7 +1886,7 @@ void __init end_boot_allocator(void)
     {
         struct bootmem_region *r = &bootmem_region_list[i];
         if ( r->s < r->e )
-            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
+            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s, false);
     }
     nr_bootmem_regions = 0;
 
@@ -2142,7 +2155,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
 
     memguard_guard_range(maddr_to_virt(ps), pe - ps);
 
-    init_heap_pages(maddr_to_page(ps), (pe - ps) >> PAGE_SHIFT);
+    init_heap_pages(maddr_to_page(ps), (pe - ps) >> PAGE_SHIFT, false);
 }
 
 
@@ -2251,7 +2264,7 @@ void init_domheap_pages(paddr_t ps, paddr_t pe)
     if ( mfn_x(emfn) <= mfn_x(smfn) )
         return;
 
-    init_heap_pages(mfn_to_page(smfn), mfn_x(emfn) - mfn_x(smfn));
+    init_heap_pages(mfn_to_page(smfn), mfn_x(emfn) - mfn_x(smfn), false);
 }
 
 
@@ -2280,7 +2293,8 @@ int assign_pages(
 
         for ( i = 0; i < (1ul << order); i++ )
         {
-            ASSERT(!(pg[i].count_info & ~PGC_extra));
+            ASSERT((pg[i].count_info & ~PGC_extra) == PGC_state_inuse ||
+                   (pg[i].count_info & ~PGC_extra) == PGC_state_uninitialised);
             if ( pg[i].count_info & PGC_extra )
                 extra_pages++;
         }
@@ -2316,10 +2330,16 @@ int assign_pages(
     for ( i = 0; i < (1 << order); i++ )
     {
         ASSERT(page_get_owner(&pg[i]) == NULL);
+        /*
+         * Note: Not using page_state_is() here. The ASSERT requires that
+         * all other bits in count_info are zero, in addition to PGC_state
+         * being appropriate.
+         */
+        ASSERT((pg[i].count_info & ~PGC_extra) == PGC_state_inuse ||
+               (pg[i].count_info & ~PGC_extra) == PGC_state_uninitialised);
         page_set_owner(&pg[i], d);
         smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
-        pg[i].count_info =
-            (pg[i].count_info & PGC_extra) | PGC_allocated | 1;
+        pg[i].count_info = (pg[i].count_info & PGC_state) | PGC_allocated | 1;
         page_list_add_tail(&pg[i], &d->page_list);
     }
 
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index a877791d1c..49663fa98a 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -113,12 +113,13 @@ struct page_info
   * { inuse, offlining, offlined, free, broken_offlining, broken }
   */
 #define PGC_state                  PG_mask(7, 9)
-#define PGC_state_inuse            PG_mask(0, 9)
+#define PGC_state_uninitialised    PG_mask(0, 9)
 #define PGC_state_offlining        PG_mask(1, 9)
 #define PGC_state_offlined         PG_mask(2, 9)
 #define PGC_state_free             PG_mask(3, 9)
 #define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
 #define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
+#define PGC_state_inuse            PG_mask(6, 9)
 
 #define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
 #define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 1203f1b179..5fbbca5f05 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -72,12 +72,13 @@
   * { inuse, offlining, offlined, free, broken_offlining, broken }
   */
 #define PGC_state                  PG_mask(7, 9)
-#define PGC_state_inuse            PG_mask(0, 9)
+#define PGC_state_uninitialised    PG_mask(0, 9)
 #define PGC_state_offlining        PG_mask(1, 9)
 #define PGC_state_offlined         PG_mask(2, 9)
 #define PGC_state_free             PG_mask(3, 9)
 #define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
 #define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
+#define PGC_state_inuse            PG_mask(6, 9)
 
 #define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
 #define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
-- 
2.21.0



* Re: [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits
  2020-03-19 21:21 ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits David Woodhouse
  2020-03-19 21:21   ` [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised David Woodhouse
@ 2020-03-20 13:17   ` Paul Durrant
  2020-03-31 12:00   ` Jan Beulich
  2 siblings, 0 replies; 16+ messages in thread
From: Paul Durrant @ 2020-03-20 13:17 UTC (permalink / raw)
  To: 'David Woodhouse', xen-devel
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse
> Sent: 19 March 2020 21:22
> To: xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Wei Liu <wl@xen.org>;
> Andrew Cooper <andrew.cooper3@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; George Dunlap
> <george.dunlap@citrix.com>; hongyxia@amazon.com; Jan Beulich <jbeulich@suse.com>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Roger Pau Monné <roger.pau@citrix.com>
> Subject: [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits
> 
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Only PGC_state_offlining and PGC_state_offlined are valid in conjunction
> with PGC_broken. The other two states (free and inuse) were never valid
> for a broken page.
> 
> By folding PGC_broken in, we can have three bits for PGC_state which
> allows up to 8 states, of which 6 are currently used and 2 are available
> for new use cases.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  xen/arch/x86/domctl.c    |  2 +-
>  xen/common/page_alloc.c  | 66 ++++++++++++++++++++++------------------
>  xen/include/asm-arm/mm.h | 38 +++++++++++++++--------
>  xen/include/asm-x86/mm.h | 36 ++++++++++++++++------
>  4 files changed, 89 insertions(+), 53 deletions(-)
> 
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index ed86762fa6..a411f64afa 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -422,7 +422,7 @@ long arch_do_domctl(
>                  if ( page->u.inuse.type_info & PGT_pinned )
>                      type |= XEN_DOMCTL_PFINFO_LPINTAB;
> 
> -                if ( page->count_info & PGC_broken )
> +                if ( page_is_broken(page) )
>                      type = XEN_DOMCTL_PFINFO_BROKEN;
>              }
> 
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 76d37226df..8d72a64f4e 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1093,7 +1093,7 @@ static int reserve_offlined_page(struct page_info *head)
>          struct page_info *pg;
>          int next_order;
> 
> -        if ( page_state_is(cur_head, offlined) )
> +        if ( page_is_offlined(cur_head) )
>          {
>              cur_head++;
>              if ( first_dirty != INVALID_DIRTY_IDX && first_dirty )
> @@ -1113,7 +1113,7 @@ static int reserve_offlined_page(struct page_info *head)
>              for ( i = (1 << cur_order), pg = cur_head + (1 << cur_order );
>                    i < (1 << next_order);
>                    i++, pg++ )
> -                if ( page_state_is(pg, offlined) )
> +                if ( page_is_offlined(pg) )
>                      break;
>              if ( i == ( 1 << next_order) )
>              {
> @@ -1145,16 +1145,20 @@ static int reserve_offlined_page(struct page_info *head)
> 
>      for ( cur_head = head; cur_head < head + ( 1UL << head_order); cur_head++ )
>      {
> -        if ( !page_state_is(cur_head, offlined) )
> +        struct page_list_head *list;
> +
> +        if ( page_state_is(cur_head, offlined) )
> +            list = &page_offlined_list;
> +        else if (page_state_is(cur_head, broken) )
> +            list = &page_broken_list;
> +        else
>              continue;
> 
>          avail[node][zone]--;
>          total_avail_pages--;
>          ASSERT(total_avail_pages >= 0);
> 
> -        page_list_add_tail(cur_head,
> -                           test_bit(_PGC_broken, &cur_head->count_info) ?
> -                           &page_broken_list : &page_offlined_list);
> +        page_list_add_tail(cur_head, list);
> 
>          count++;
>      }
> @@ -1404,13 +1408,16 @@ static void free_heap_pages(
>          switch ( pg[i].count_info & PGC_state )
>          {
>          case PGC_state_inuse:
> -            BUG_ON(pg[i].count_info & PGC_broken);
>              pg[i].count_info = PGC_state_free;
>              break;
> 
>          case PGC_state_offlining:
> -            pg[i].count_info = (pg[i].count_info & PGC_broken) |
> -                               PGC_state_offlined;
> +            pg[i].count_info = PGC_state_offlined;
> +            tainted = 1;
> +            break;
> +
> +        case PGC_state_broken_offlining:
> +            pg[i].count_info = PGC_state_broken;
>              tainted = 1;
>              break;
> 
> @@ -1527,16 +1534,16 @@ static unsigned long mark_page_offline(struct page_info *pg, int broken)
>      do {
>          nx = x = y;
> 
> -        if ( ((x & PGC_state) != PGC_state_offlined) &&
> -             ((x & PGC_state) != PGC_state_offlining) )
> -        {
> -            nx &= ~PGC_state;
> -            nx |= (((x & PGC_state) == PGC_state_free)
> -                   ? PGC_state_offlined : PGC_state_offlining);
> -        }
> +        nx &= ~PGC_state;
> 
> -        if ( broken )
> -            nx |= PGC_broken;
> +        /* If it was already broken, it stays broken */
> +        if ( pgc_is_broken(x) )
> +            broken = 1;
> +
> +        if ( pgc_is_offlined(x) || pgc_is(x, free) )
> +            nx |= broken ? PGC_state_broken : PGC_state_offlined;
> +        else
> +            nx |= broken ? PGC_state_broken_offlining : PGC_state_offlining;
> 
>          if ( x == nx )
>              break;
> @@ -1609,7 +1616,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status)
>       * need to prevent malicious guest access the broken page again.
>       * Under such case, hypervisor shutdown guest, preventing recursive mce.
>       */
> -    if ( (pg->count_info & PGC_broken) && (owner = page_get_owner(pg)) )
> +    if ( page_is_broken(pg) && (owner = page_get_owner(pg)) )
>      {
>          *status = PG_OFFLINE_AGAIN;
>          domain_crash(owner);
> @@ -1620,7 +1627,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status)
> 
>      old_info = mark_page_offline(pg, broken);
> 
> -    if ( page_state_is(pg, offlined) )
> +    if ( page_is_offlined(pg) )
>      {
>          reserve_heap_page(pg);
> 
> @@ -1699,19 +1706,18 @@ unsigned int online_page(mfn_t mfn, uint32_t *status)
>      do {
>          ret = *status = 0;
> 
> -        if ( y & PGC_broken )
> +        if ( pgc_is_broken(y) )
>          {
>              ret = -EINVAL;
> -            *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN;
> +            *status = PG_ONLINE_FAILED | PG_ONLINE_BROKEN;

Whitespace fix. Ought to be called out in the commit comment.

>              break;
>          }
> -
> -        if ( (y & PGC_state) == PGC_state_offlined )
> +        else if ( pgc_is(y, offlined) )
>          {
>              page_list_del(pg, &page_offlined_list);
>              *status = PG_ONLINE_ONLINED;
>          }
> -        else if ( (y & PGC_state) == PGC_state_offlining )
> +        else if ( pgc_is(y, offlining) )
>          {
>              *status = PG_ONLINE_ONLINED;
>          }
> @@ -1726,7 +1732,7 @@ unsigned int online_page(mfn_t mfn, uint32_t *status)
> 
>      spin_unlock(&heap_lock);
> 
> -    if ( (y & PGC_state) == PGC_state_offlined )
> +    if ( pgc_is(y, offlined) )
>          free_heap_pages(pg, 0, false);
> 
>      return ret;
> @@ -1747,11 +1753,11 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
> 
>      pg = mfn_to_page(mfn);
> 
> -    if ( page_state_is(pg, offlining) )
> +    if ( page_is_offlining(pg) )
>          *status |= PG_OFFLINE_STATUS_OFFLINE_PENDING;
> -    if ( pg->count_info & PGC_broken )
> +    if ( page_is_broken(pg) )
>          *status |= PG_OFFLINE_STATUS_BROKEN;
> -    if ( page_state_is(pg, offlined) )
> +    if ( page_is_offlined(pg) )
>          *status |= PG_OFFLINE_STATUS_OFFLINED;
> 
>      spin_unlock(&heap_lock);
> @@ -2519,7 +2525,7 @@ __initcall(pagealloc_keyhandler_init);
> 
>  void scrub_one_page(struct page_info *pg)
>  {
> -    if ( unlikely(pg->count_info & PGC_broken) )
> +    if ( unlikely(page_is_broken(pg)) )
>          return;
> 
>  #ifndef NDEBUG
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index 7df91280bc..a877791d1c 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -108,21 +108,35 @@ struct page_info
>    /* Page is Xen heap? */
>  #define _PGC_xen_heap     PG_shift(2)
>  #define PGC_xen_heap      PG_mask(1, 2)
> -/* ... */
> -/* Page is broken? */
> -#define _PGC_broken       PG_shift(7)
> -#define PGC_broken        PG_mask(1, 7)
> - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
> -#define PGC_state         PG_mask(3, 9)
> -#define PGC_state_inuse   PG_mask(0, 9)
> -#define PGC_state_offlining PG_mask(1, 9)
> -#define PGC_state_offlined PG_mask(2, 9)
> -#define PGC_state_free    PG_mask(3, 9)
> -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
> + /*
> +  * Mutually-exclusive page states:
> +  * { inuse, offlining, offlined, free, broken_offlining, broken }
> +  */
> +#define PGC_state                  PG_mask(7, 9)
> +#define PGC_state_inuse            PG_mask(0, 9)
> +#define PGC_state_offlining        PG_mask(1, 9)
> +#define PGC_state_offlined         PG_mask(2, 9)
> +#define PGC_state_free             PG_mask(3, 9)
> +#define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
> +#define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
> +
> +#define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
> +#define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
> +
> +#define pgc_is_broken(pgc)         (pgc_is(pgc, broken) || \
> +                                    pgc_is(pgc, broken_offlining))
> +#define pgc_is_offlined(pgc)       (pgc_is(pgc, offlined) || \
> +                                    pgc_is(pgc, broken))
> +#define pgc_is_offlining(pgc)      (pgc_is(pgc, offlining) || \
> +                                    pgc_is(pgc, broken_offlining))
> +
> +#define page_is_broken(pg)         (pgc_is_broken((pg)->count_info))
> +#define page_is_offlined(pg)       (pgc_is_broken((pg)->count_info))
> +#define page_is_offlining(pg)      (pgc_is_broken((pg)->count_info))
> +
>  /* Page is not reference counted */
>  #define _PGC_extra        PG_shift(10)
>  #define PGC_extra         PG_mask(1, 10)
> -

Extraneous whitespace change.

>  /* Count of references to this frame. */
>  #define PGC_count_width   PG_shift(10)
>  #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
> index a06b2fb81f..1203f1b179 100644
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -67,16 +67,32 @@
>   /* 3-bit PAT/PCD/PWT cache-attribute hint. */
>  #define PGC_cacheattr_base PG_shift(6)
>  #define PGC_cacheattr_mask PG_mask(7, 6)
> - /* Page is broken? */
> -#define _PGC_broken       PG_shift(7)
> -#define PGC_broken        PG_mask(1, 7)
> - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
> -#define PGC_state         PG_mask(3, 9)
> -#define PGC_state_inuse   PG_mask(0, 9)
> -#define PGC_state_offlining PG_mask(1, 9)
> -#define PGC_state_offlined PG_mask(2, 9)
> -#define PGC_state_free    PG_mask(3, 9)
> -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
> + /*
> +  * Mutually-exclusive page states:
> +  * { inuse, offlining, offlined, free, broken_offlining, broken }
> +  */
> +#define PGC_state                  PG_mask(7, 9)
> +#define PGC_state_inuse            PG_mask(0, 9)
> +#define PGC_state_offlining        PG_mask(1, 9)
> +#define PGC_state_offlined         PG_mask(2, 9)
> +#define PGC_state_free             PG_mask(3, 9)
> +#define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
> +#define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
> +
> +#define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)

Maybe pgc_state_is() for consistency? Might also draw attention to the difference between e.g.:

pgc_is(pgc, offlined) and pgc_is_offlined(pgc) 

> +#define page_state_is(pg, st)       pgc_is((pg)->count_info, st)

Indentation looks wrong.

^^ Same for the arm code.

  Paul

> +
> +#define pgc_is_broken(pgc)         (pgc_is(pgc, broken) || \
> +                                    pgc_is(pgc, broken_offlining))
> +#define pgc_is_offlined(pgc)       (pgc_is(pgc, offlined) || \
> +                                    pgc_is(pgc, broken))
> +#define pgc_is_offlining(pgc)      (pgc_is(pgc, offlining) || \
> +                                    pgc_is(pgc, broken_offlining))
> +
> +#define page_is_broken(pg)         (pgc_is_broken((pg)->count_info))
> +#define page_is_offlined(pg)       (pgc_is_broken((pg)->count_info))
> +#define page_is_offlining(pg)      (pgc_is_broken((pg)->count_info))
> +
>  /* Page is not reference counted */
>  #define _PGC_extra        PG_shift(10)
>  #define PGC_extra         PG_mask(1, 10)
> --
> 2.21.0
> 
> 

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-19 21:21   ` [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised David Woodhouse
@ 2020-03-20 13:33     ` Paul Durrant
  2020-03-20 13:53       ` Jan Beulich
  2020-03-20 15:17       ` David Woodhouse
  2020-03-31 12:10     ` Jan Beulich
  1 sibling, 2 replies; 16+ messages in thread
From: Paul Durrant @ 2020-03-20 13:33 UTC (permalink / raw)
  To: 'David Woodhouse', xen-devel
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse
> Sent: 19 March 2020 21:22
> To: xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Wei Liu <wl@xen.org>;
> Andrew Cooper <andrew.cooper3@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; George Dunlap
> <george.dunlap@citrix.com>; hongyxia@amazon.com; Jan Beulich <jbeulich@suse.com>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Roger Pau Monné <roger.pau@citrix.com>
> Subject: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
> 
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> It is possible for pages to enter general circulation without ever
> being process by init_heap_pages().
> 
> For example, pages of the multiboot module containing the initramfs may
> be assigned via assign_pages() to dom0 as it is created. And some code
> including map_pages_to_xen() has checks on 'system_state' to determine
> whether to use the boot or the heap allocator, but it seems impossible
> to prove that pages allocated by the boot allocator are not subsequently
> freed with free_heap_pages().
> 
> This actually works fine in the majority of cases; there are only a few
> esoteric corner cases which init_heap_pages() handles before handing the
> page range off to free_heap_pages():
>  • Excluding MFN #0 to avoid inappropriate cross-zone merging.
>  • Ensuring that the node information structures exist, when the first
>    page(s) of a given node are handled.
>  • High order allocations crossing from one node to another.
> 
> To handle this case, shift PG_state_inuse from its current value of
> zero, to another value. Use zero, which is the initial state of the
> entire frame table, as PG_state_uninitialised.
> 
> Fix a couple of assertions which were assuming that PG_state_inuse is
> zero, and make them cope with the PG_state_uninitialised case too where
> appopriate.
> 
> Finally, make free_heap_pages() call through to init_heap_pages() when
> given a page range which has not been initialised. This cannot keep
> recursing because init_heap_pages() will set each page state to
> PGC_state_inuse before passing it back to free_heap_pages() for the
> second time.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  xen/arch/x86/mm.c        |  3 ++-
>  xen/common/page_alloc.c  | 44 +++++++++++++++++++++++++++++-----------
>  xen/include/asm-arm/mm.h |  3 ++-
>  xen/include/asm-x86/mm.h |  3 ++-
>  4 files changed, 38 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 62507ca651..5f0581c072 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
> 
>      page_set_owner(page, d);
>      smp_wmb(); /* install valid domain ptr before updating refcnt. */
> -    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
> +    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
> +           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);

Could the page state perhaps be bumped to inuse in this case? It seems odd to leave state uninitialized yet succeed in sharing with a guest.

> 
>      /* Only add to the allocation list if the domain isn't dying. */
>      if ( !d->is_dying )
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 8d72a64f4e..4f7971f2a1 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -252,6 +252,8 @@ struct bootmem_region {
>  static struct bootmem_region __initdata
>      bootmem_region_list[PAGE_SIZE / sizeof(struct bootmem_region)];
>  static unsigned int __initdata nr_bootmem_regions;
> +static void init_heap_pages(struct page_info *pg, unsigned long nr_pages,
> +                            bool scrub);
> 
>  struct scrub_region {
>      unsigned long offset;
> @@ -1390,6 +1392,17 @@ static void free_heap_pages(
>      ASSERT(order <= MAX_ORDER);
>      ASSERT(node >= 0);
> 
> +    if ( page_state_is(pg, uninitialised) )
> +    {
> +        init_heap_pages(pg, 1 << order, need_scrub);
> +        /*
> +         * init_heap_pages() will call back into free_heap_pages() for
> +         * each page but cannot keep recursing because each page will
> +         * be set to PGC_state_inuse first.
> +         */
> +        return;
> +    }
> +
>      spin_lock(&heap_lock);
> 
>      for ( i = 0; i < (1 << order); i++ )
> @@ -1771,11 +1784,10 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
>   * latter is not on a MAX_ORDER boundary, then we reserve the page by
>   * not freeing it to the buddy allocator.
>   */
> -static void init_heap_pages(
> -    struct page_info *pg, unsigned long nr_pages)
> +static void init_heap_pages(struct page_info *pg, unsigned long nr_pages,
> +                            bool scrub)
>  {
>      unsigned long i;
> -    bool idle_scrub = false;
> 
>      /*
>       * Keep MFN 0 away from the buddy allocator to avoid crossing zone
> @@ -1800,7 +1812,7 @@ static void init_heap_pages(
>      spin_unlock(&heap_lock);
> 
>      if ( system_state < SYS_STATE_active && opt_bootscrub == BOOTSCRUB_IDLE )
> -        idle_scrub = true;
> +        scrub = true;
> 
>      for ( i = 0; i < nr_pages; i++ )
>      {
> @@ -1828,7 +1840,8 @@ static void init_heap_pages(
>              nr_pages -= n;
>          }
> 
> -        free_heap_pages(pg + i, 0, scrub_debug || idle_scrub);

Would it be worth an ASSERT(!pg[i].count_info) here in case something is added which erroneously modifies the page count info before this is done?

> +        pg[i].count_info = PGC_state_inuse;
> +        free_heap_pages(pg + i, 0, scrub_debug || scrub);
>      }
>  }
> 
> @@ -1864,7 +1877,7 @@ void __init end_boot_allocator(void)
>          if ( (r->s < r->e) &&
>               (phys_to_nid(pfn_to_paddr(r->s)) == cpu_to_node(0)) )
>          {
> -            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
> +            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s, false);
>              r->e = r->s;
>              break;
>          }
> @@ -1873,7 +1886,7 @@ void __init end_boot_allocator(void)
>      {
>          struct bootmem_region *r = &bootmem_region_list[i];
>          if ( r->s < r->e )
> -            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
> +            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s, false);
>      }
>      nr_bootmem_regions = 0;
> 
> @@ -2142,7 +2155,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
> 
>      memguard_guard_range(maddr_to_virt(ps), pe - ps);
> 
> -    init_heap_pages(maddr_to_page(ps), (pe - ps) >> PAGE_SHIFT);
> +    init_heap_pages(maddr_to_page(ps), (pe - ps) >> PAGE_SHIFT, false);
>  }
> 
> 
> @@ -2251,7 +2264,7 @@ void init_domheap_pages(paddr_t ps, paddr_t pe)
>      if ( mfn_x(emfn) <= mfn_x(smfn) )
>          return;
> 
> -    init_heap_pages(mfn_to_page(smfn), mfn_x(emfn) - mfn_x(smfn));
> +    init_heap_pages(mfn_to_page(smfn), mfn_x(emfn) - mfn_x(smfn), false);
>  }
> 
> 
> @@ -2280,7 +2293,8 @@ int assign_pages(
> 
>          for ( i = 0; i < (1ul << order); i++ )
>          {
> -            ASSERT(!(pg[i].count_info & ~PGC_extra));
> +            ASSERT((pg[i].count_info & ~PGC_extra) == PGC_state_inuse ||
> +                   (pg[i].count_info & ~PGC_extra) == PGC_state_uninitialised);

Again, perhaps bump the state to inuse if it is uninitialized...

>              if ( pg[i].count_info & PGC_extra )
>                  extra_pages++;
>          }
> @@ -2316,10 +2330,16 @@ int assign_pages(
>      for ( i = 0; i < (1 << order); i++ )
>      {
>          ASSERT(page_get_owner(&pg[i]) == NULL);
> +        /*
> +         * Note: Not using page_state_is() here. The ASSERT requires that
> +         * all other bits in count_info are zero, in addition to PGC_state
> +         * being appropriate.
> +         */
> +        ASSERT((pg[i].count_info & ~PGC_extra) == PGC_state_inuse ||
> +               (pg[i].count_info & ~PGC_extra) == PGC_state_uninitialised);

...then this ASSERT can be tightened.
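
E.g. something like this (sketch only, assuming the state really has
been bumped to inuse by this point):

    ASSERT((pg[i].count_info & ~PGC_extra) == PGC_state_inuse);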

>          page_set_owner(&pg[i], d);
>          smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
> -        pg[i].count_info =
> -            (pg[i].count_info & PGC_extra) | PGC_allocated | 1;
> +        pg[i].count_info = (pg[i].count_info & PGC_state) | PGC_allocated | 1;

The PGC_extra seems to have vapourized here.

  Paul

>          page_list_add_tail(&pg[i], &d->page_list);
>      }
> 
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index a877791d1c..49663fa98a 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -113,12 +113,13 @@ struct page_info
>    * { inuse, offlining, offlined, free, broken_offlining, broken }
>    */
>  #define PGC_state                  PG_mask(7, 9)
> -#define PGC_state_inuse            PG_mask(0, 9)
> +#define PGC_state_uninitialised    PG_mask(0, 9)
>  #define PGC_state_offlining        PG_mask(1, 9)
>  #define PGC_state_offlined         PG_mask(2, 9)
>  #define PGC_state_free             PG_mask(3, 9)
>  #define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
>  #define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
> +#define PGC_state_inuse            PG_mask(6, 9)
> 
>  #define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
>  #define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
> diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
> index 1203f1b179..5fbbca5f05 100644
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -72,12 +72,13 @@
>    * { inuse, offlining, offlined, free, broken_offlining, broken }
>    */
>  #define PGC_state                  PG_mask(7, 9)
> -#define PGC_state_inuse            PG_mask(0, 9)
> +#define PGC_state_uninitialised    PG_mask(0, 9)
>  #define PGC_state_offlining        PG_mask(1, 9)
>  #define PGC_state_offlined         PG_mask(2, 9)
>  #define PGC_state_free             PG_mask(3, 9)
>  #define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
>  #define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
> +#define PGC_state_inuse            PG_mask(6, 9)
> 
>  #define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
>  #define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
> --
> 2.21.0
> 
> 

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-20 13:33     ` Paul Durrant
@ 2020-03-20 13:53       ` Jan Beulich
  2020-03-20 15:17       ` David Woodhouse
  1 sibling, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2020-03-20 13:53 UTC (permalink / raw)
  To: paul, 'David Woodhouse'
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap',
	hongyxia, xen-devel, 'Volodymyr Babchuk',
	'Roger Pau Monné'

On 20.03.2020 14:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse
>> Sent: 19 March 2020 21:22
>>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
>>
>>      page_set_owner(page, d);
>>      smp_wmb(); /* install valid domain ptr before updating refcnt. */
>> -    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
>> +    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
>> +           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);
> 
> Could the page state perhaps be bumped to inuse in this case? It
> seems odd to leave state uninitialized yet succeed in sharing with a guest.

This would be quite nice indeed, if of course it doesn't cause
new complications.

Jan


* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-20 13:33     ` Paul Durrant
  2020-03-20 13:53       ` Jan Beulich
@ 2020-03-20 15:17       ` David Woodhouse
  2020-03-23  8:49         ` Paul Durrant
  2020-03-23  9:34         ` Julien Grall
  1 sibling, 2 replies; 16+ messages in thread
From: David Woodhouse @ 2020-03-20 15:17 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'



On Fri, 2020-03-20 at 13:33 +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse
> > Sent: 19 March 2020 21:22
> > To: xen-devel@lists.xenproject.org
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Wei Liu <wl@xen.org>;
> > Andrew Cooper <andrew.cooper3@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; George Dunlap
> > <george.dunlap@citrix.com>; hongyxia@amazon.com; Jan Beulich <jbeulich@suse.com>; Volodymyr Babchuk
> > <Volodymyr_Babchuk@epam.com>; Roger Pau Monné <roger.pau@citrix.com>
> > Subject: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
> > 
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > It is possible for pages to enter general circulation without ever
> > being process by init_heap_pages().
> > 
> > For example, pages of the multiboot module containing the initramfs may
> > be assigned via assign_pages() to dom0 as it is created. And some code
> > including map_pages_to_xen() has checks on 'system_state' to determine
> > whether to use the boot or the heap allocator, but it seems impossible
> > to prove that pages allocated by the boot allocator are not subsequently
> > freed with free_heap_pages().
> > 
> > This actually works fine in the majority of cases; there are only a few
> > esoteric corner cases which init_heap_pages() handles before handing the
> > page range off to free_heap_pages():
> >  • Excluding MFN #0 to avoid inappropriate cross-zone merging.
> >  • Ensuring that the node information structures exist, when the first
> >    page(s) of a given node are handled.
> >  • High order allocations crossing from one node to another.
> > 
> > To handle this case, shift PG_state_inuse from its current value of
> > zero, to another value. Use zero, which is the initial state of the
> > entire frame table, as PG_state_uninitialised.
> > 
> > Fix a couple of assertions which were assuming that PG_state_inuse is
> > zero, and make them cope with the PG_state_uninitialised case too where
> > appopriate.
> > 
> > Finally, make free_heap_pages() call through to init_heap_pages() when
> > given a page range which has not been initialised. This cannot keep
> > recursing because init_heap_pages() will set each page state to
> > PGC_state_inuse before passing it back to free_heap_pages() for the
> > second time.
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> >  xen/arch/x86/mm.c        |  3 ++-
> >  xen/common/page_alloc.c  | 44 +++++++++++++++++++++++++++++-----------
> >  xen/include/asm-arm/mm.h |  3 ++-
> >  xen/include/asm-x86/mm.h |  3 ++-
> >  4 files changed, 38 insertions(+), 15 deletions(-)
> > 
> > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> > index 62507ca651..5f0581c072 100644
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
> > 
> >      page_set_owner(page, d);
> >      smp_wmb(); /* install valid domain ptr before updating refcnt. */
> > -    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
> > +    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
> > +           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);
> 
> Could the page state perhaps be bumped to inuse in this case? It
> seems odd to leave state uninitialized yet succeed in sharing with a
> guest.

No, that doesn't really work.

You can't just *declare* that the page was properly initialised,
because it isn't true. There's a pathological case where the zone
hasn't been initialised at all, because *all* the pages in that zone
got handed out by the boot allocator or used for initrd etc. 

The first pages 'freed' in that zone end up being used (in
init_heap_pages) to create the zone structures.

Likewise, it could include a page which init_heap_pages() doesn't
actually *put* into the buddy allocator, to work around the cross-zone
merge problem. It's fine to use that page and share it with a guest,
but it can't ever be freed into the buddy allocator.
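
For instance, the MFN 0 exclusion at the top of init_heap_pages() is
roughly (sketch from memory, the exact code may differ slightly):

    /*
     * Keep MFN 0 away from the buddy allocator to avoid crossing zone
     * boundary when merging two buddies. A page skipped here stays
     * PGC_state_uninitialised and never enters the buddy allocator.
     */
    if ( !mfn_x(page_to_mfn(pg)) )
    {
        if ( nr_pages-- <= 1 )
            return;
        pg++;
    }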

> > @@ -1828,7 +1840,8 @@ static void init_heap_pages(
> >              nr_pages -= n;
> >          }
> > 
> > -        free_heap_pages(pg + i, 0, scrub_debug || idle_scrub);
> 
> Would it be worth an ASSERT(!pg[i].count_info) here in case something
> is added which erroneously modifies the page count info before this
> is done?

That seems valid, I think. Will test it.
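
i.e. something like (sketch):

        ASSERT(!pg[i].count_info);
        pg[i].count_info = PGC_state_inuse;
        free_heap_pages(pg + i, 0, scrub_debug || scrub);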

> > 
> >          page_set_owner(&pg[i], d);
> >          smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
> > -        pg[i].count_info =
> > -            (pg[i].count_info & PGC_extra) | PGC_allocated | 1;
> > +        pg[i].count_info = (pg[i].count_info & PGC_state) | PGC_allocated | 1;
> 
> The PGC_extra seems to have vapourized here.

Oops. Will fix; thanks.


[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-20 15:17       ` David Woodhouse
@ 2020-03-23  8:49         ` Paul Durrant
  2020-03-23 10:45           ` David Woodhouse
  2020-03-23  9:34         ` Julien Grall
  1 sibling, 1 reply; 16+ messages in thread
From: Paul Durrant @ 2020-03-23  8:49 UTC (permalink / raw)
  To: 'David Woodhouse', xen-devel
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'

> -----Original Message-----
> > > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> > > index 62507ca651..5f0581c072 100644
> > > --- a/xen/arch/x86/mm.c
> > > +++ b/xen/arch/x86/mm.c
> > > @@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
> > >
> > >      page_set_owner(page, d);
> > >      smp_wmb(); /* install valid domain ptr before updating refcnt. */
> > > -    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
> > > +    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
> > > +           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);
> >
> > Could the page state perhaps be bumped to inuse in this case? It
> > seems odd to leave state uninitialized yet succeed in sharing with a
> > guest.
> 
> No, that doesn't really work.
> 
> You can't just *declare* that the page was properly initialised,
> because it isn't true. There's a pathological case where the zone
> hasn't been initialised at all, because *all* the pages in that zone
> got handed out by the boot allocator or used for initrd etc.
> 
> The first pages 'freed' in that zone end up being used (in
> init_heap_pages) to create the zone structures.
> 
> Likewise, it could include a page which init_heap_pages() doesn't
> actually *put* into the buddy allocator, to work around the cross-zone
> merge problem. It's fine to use that page and share it with a guest,
> but it can't ever be freed into the buddy allocator.
> 

Ok, so it is safe to defer the call to free_heap_pages() (and consequently init_heap_pages()) until the guest is torn down? (Or is this only safe if the page is being assigned to the initial domain?)

  Paul


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-20 15:17       ` David Woodhouse
  2020-03-23  8:49         ` Paul Durrant
@ 2020-03-23  9:34         ` Julien Grall
  2020-03-23 10:55           ` David Woodhouse
  1 sibling, 1 reply; 16+ messages in thread
From: Julien Grall @ 2020-03-23  9:34 UTC (permalink / raw)
  To: David Woodhouse, paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'

Hi David,

On 20/03/2020 15:17, David Woodhouse wrote:
> On Fri, 2020-03-20 at 13:33 +0000, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse
>>> Sent: 19 March 2020 21:22
>>> To: xen-devel@lists.xenproject.org
>>> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall <julien@xen.org>; Wei Liu <wl@xen.org>;
>>> Andrew Cooper <andrew.cooper3@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; George Dunlap
>>> <george.dunlap@citrix.com>; hongyxia@amazon.com; Jan Beulich <jbeulich@suse.com>; Volodymyr Babchuk
>>> <Volodymyr_Babchuk@epam.com>; Roger Pau Monné <roger.pau@citrix.com>
>>> Subject: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
>>>
>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>
>>> It is possible for pages to enter general circulation without ever
>>> being processed by init_heap_pages().
>>>
>>> For example, pages of the multiboot module containing the initramfs may
>>> be assigned via assign_pages() to dom0 as it is created. And some code
>>> including map_pages_to_xen() has checks on 'system_state' to determine
>>> whether to use the boot or the heap allocator, but it seems impossible
>>> to prove that pages allocated by the boot allocator are not subsequently
>>> freed with free_heap_pages().
>>>
>>> This actually works fine in the majority of cases; there are only a few
>>> esoteric corner cases which init_heap_pages() handles before handing the
>>> page range off to free_heap_pages():
>>>   • Excluding MFN #0 to avoid inappropriate cross-zone merging.
>>>   • Ensuring that the node information structures exist, when the first
>>>     page(s) of a given node are handled.
>>>   • High order allocations crossing from one node to another.
>>>
>>> To handle this case, shift PG_state_inuse from its current value of
>>> zero, to another value. Use zero, which is the initial state of the
>>> entire frame table, as PG_state_uninitialised.
>>>
>>> Fix a couple of assertions which were assuming that PG_state_inuse is
>>> zero, and make them cope with the PG_state_uninitialised case too where
>>> appropriate.
>>>
>>> Finally, make free_heap_pages() call through to init_heap_pages() when
>>> given a page range which has not been initialised. This cannot keep
>>> recursing because init_heap_pages() will set each page state to
>>> PGC_state_inuse before passing it back to free_heap_pages() for the
>>> second time.
>>>
>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>> ---
>>>   xen/arch/x86/mm.c        |  3 ++-
>>>   xen/common/page_alloc.c  | 44 +++++++++++++++++++++++++++++-----------
>>>   xen/include/asm-arm/mm.h |  3 ++-
>>>   xen/include/asm-x86/mm.h |  3 ++-
>>>   4 files changed, 38 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>> index 62507ca651..5f0581c072 100644
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
>>>
>>>       page_set_owner(page, d);
>>>       smp_wmb(); /* install valid domain ptr before updating refcnt. */
>>> -    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
>>> +    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
>>> +           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);
>>
>> Could the page state perhaps be bumped to inuse in this case? It
>> seems odd to leave state uninitialized yet succeed in sharing with a
>> guest.
> 
> No, that doesn't really work.
> 
> You can't just *declare* that the page was properly initialised,
> because it isn't true. There's a pathological case where the zone
> hasn't been initialised at all, because *all* the pages in that zone
> got handed out by the boot allocator or used for initrd etc.
> 
> The first pages 'freed' in that zone end up being used (in
> init_heap_pages) to create the zone structures.
> 
> Likewise, it could include a page which init_heap_pages() doesn't
> actually *put* into the buddy allocator, to work around the cross-zone
> merge problem. It's fine to use that page and share it with a guest,
> but it can't ever be freed into the buddy allocator.

For liveupdate, we will need a way to initialize a page but mark it as 
already inuse (i.e in the same state as they would be if allocated 
normally).

It feels to me that this is also what we want in this case. The page 
would first be initialised and then we could use it normally, including 
freeing it later on.

Would it make sense to introduce a helper for this purpose?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-23  8:49         ` Paul Durrant
@ 2020-03-23 10:45           ` David Woodhouse
  0 siblings, 0 replies; 16+ messages in thread
From: David Woodhouse @ 2020-03-23 10:45 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'


[-- Attachment #1.1: Type: text/plain, Size: 763 bytes --]

On Mon, 2020-03-23 at 08:49 +0000, Paul Durrant wrote:
> Ok, so it is safe to defer the call to free_heap_pages() (and
> consequently init_heap_pages()) until the guest is torn down? (Or is
> this only safe if the page is being assigned to the initial domain?)

It's intended to be safe in all cases, including pages which are
allocated from the boot allocator to be used as page tables (cf. the
early-vmap patches), etc.

We kind of have to assume that it's safe to use the page for whatever
purpose it was allocated for, for the lifetime of that usage. If *that*
isn't true, we have bigger problems.

The PGC_state_uninitialised thing is only about recycling it into the
heap *later*, once the lifetime of that initial usage has ended.

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-23  9:34         ` Julien Grall
@ 2020-03-23 10:55           ` David Woodhouse
  2020-03-24 10:08             ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: David Woodhouse @ 2020-03-23 10:55 UTC (permalink / raw)
  To: Julien Grall, paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'


[-- Attachment #1.1: Type: text/plain, Size: 1726 bytes --]

On Mon, 2020-03-23 at 09:34 +0000, Julien Grall wrote:
> For liveupdate, we will need a way to initialize a page but mark it as 
> already inuse (i.e in the same state as they would be if allocated 
> normally).

I am unconvinced of the veracity of this claim.

We don't want to turn specific details of the current Xen buddy
allocator into part of the implicit ABI of live update. That goes for
the power-of-two zone boundaries, amongst other things.

What if Xen receives LU state in which *all* pages in a given zone are
marked as already in use? That's one of the cases in which we *really*
want to pass through init_heap_pages() instead of just
free_heap_pages(), in order to allocate the zone data structures for
the first pages that get freed into that zone.

What if Xen starts to exclude more pages, like the exclusion at zero?

What if new Xen wants to exclude an additional page due to a hardware
erratum? It can't take it away from existing domains (especially if
there are assigned PCI devices) but it could be part of the vetting in
init_heap_pages(), for example.

My intent for PGC_state_uninitialised was to mark pages that haven't
been through init_heap_pages(), whatever init_heap_pages() does in the
current version of Xen.

The pages which are "already in use" because they're inherited through
LU state should be in PGC_state_uninitialised, shouldn't they?

Perhaps if there's a need for a helper, it could be a companion
function to init_heap_pages() which would return a boolean saying,
"nah, I didn't want to do anything to this page anyway", which could
short-circuit it into the PGC_state_inuse state. But I'm not sure I see
the need for such an optimisation. 
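
Very roughly, and with names invented purely for illustration (this
isn't existing code):

    /*
     * Hypothetical companion to init_heap_pages(): true if
     * init_heap_pages() would have nothing special to do for this page
     * and would simply hand it straight to free_heap_pages().
     */
    static bool init_heap_page_is_trivial(const struct page_info *pg)
    {
        /* MFN #0 is kept out of the allocator to avoid cross-zone merging. */
        if ( mfn_eq(page_to_mfn(pg), _mfn(0)) )
            return false;

        /* Node bookkeeping must already exist (illustrative check only). */
        if ( !node_heap_initialised(phys_to_nid(page_to_maddr(pg))) )
            return false;

        return true;
    }

The LU restore path could then flip such pages straight to
PGC_state_inuse and leave only the awkward ones in
PGC_state_uninitialised. But as above, I'm not convinced that buys us
anything.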


[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-23 10:55           ` David Woodhouse
@ 2020-03-24 10:08             ` Julien Grall
  2020-03-24 17:55               ` David Woodhouse
  0 siblings, 1 reply; 16+ messages in thread
From: Julien Grall @ 2020-03-24 10:08 UTC (permalink / raw)
  To: David Woodhouse, paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'

Hi David,

On 23/03/2020 10:55, David Woodhouse wrote:
> On Mon, 2020-03-23 at 09:34 +0000, Julien Grall wrote:
>> For liveupdate, we will need a way to initialize a page but mark it as
>> already inuse (i.e in the same state as they would be if allocated
>> normally).
> 
> I am unconvinced of the veracity of this claim.
> 
> We don't want to turn specific details of the current Xen buddy
> allocator into part of the implicit ABI of live update. That goes for
> the power-of-two zone boundaries, amongst other things.

Why would you do that? Marking the page as already used is no different 
to "PGC_state_uninitialised", except that the "struct page_info" and the 
internals of the buddy allocator would be properly set up from the start 
rather than at free time.

> 
> What if Xen receives LU state in which *all* pages in a given zone are
> marked as already in use? That's one of the cases in which we *really*
> want to pass through init_heap_pages() instead of just
> free_heap_pages(), in order to allocate the zone data structures for
> the first pages that get freed into that zone.
> 
> What if Xen starts to exclude more pages, like the exclusion at zero?
> 
> What if new Xen wants to exclude an additional page due to a hardware
> erratum? It can't take it away from existing domains (especially if
> there are assigned PCI devices) but it could be part of the vetting in
> init_heap_pages(), for example.

I don't think it would be safe to continue to run a guest using pages 
that were excluded for an HW erratum. It would be safer to not restart 
the domain (or replace the page) in the target Xen if that's happening.

> 
> My intent for PGC_state_uninitialised was to mark pages that haven't
> been through init_heap_pages(), whatever init_heap_pages() does in the
> current version of Xen.
> 
> The pages which are "already in use" because they're inherited through
> LU state should be in PGC_state_uninitialised, shouldn't they?

I think using "PGC_state_uninitialised" for preserved pages is an abuse. 
I understand this already exists in other parts of Xen (particularly on 
x86), but I would rather not try to add more.

PGC_state_uninitialised may work for the current allocator because most 
of the fields are not going to be used after allocation. But that may 
not hold for any new allocator (I know the embedded folks are working on 
a new one).

> 
> Perhaps if there's a need for a helper, it could be a companion
> function to init_heap_pages() which would return a boolean saying,
> "nah, I didn't want to do anything to this page anyway", which could
> short-circuit it into the PGC_state_inuse state. But I'm not sure I see
> the need for such an optimisation.

I don't view it as an optimisation but as a way to avoid spreading the 
current misbehavior.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-24 10:08             ` Julien Grall
@ 2020-03-24 17:55               ` David Woodhouse
  2020-03-24 18:34                 ` Julien Grall
  0 siblings, 1 reply; 16+ messages in thread
From: David Woodhouse @ 2020-03-24 17:55 UTC (permalink / raw)
  To: Julien Grall, paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'

[-- Attachment #1: Type: text/plain, Size: 2038 bytes --]

On Tue, 2020-03-24 at 10:08 +0000, Julien Grall wrote:
> Hi David,
> 
> On 23/03/2020 10:55, David Woodhouse wrote:
> > On Mon, 2020-03-23 at 09:34 +0000, Julien Grall wrote:
> > > For liveupdate, we will need a way to initialize a page but mark it as
> > > already inuse (i.e in the same state as they would be if allocated
> > > normally).
> > 
> > I am unconvinced of the veracity of this claim.
> > 
> > We don't want to turn specific details of the current Xen buddy
> > allocator into part of the implicit ABI of live update. That goes for
> > the power-of-two zone boundaries, amongst other things.
> 
> Why would you do that? Marking the page as already used is no different 
> to "PGC_state_uninitialised", except that the "struct page_info" and the 
> internals of the buddy allocator would be properly set up from the start 
> rather than at free time.

The internals of the buddy allocator *cannot* be set up properly for a
page which it would not actually give out — like the zero page.

We *could* do some tricks to allocate the zone structures for zones
which *do* exist but contain only "pre-allocated" pages so the buddy
allocator has never seen those zones... yet.


> I think using "PGC_state_uninitialised" for preserved pages is an abuse. 
> I understand this already exists in other parts of Xen (particularly on 
> x86), but I would rather not try to add more.


I am perfectly happy to try avoiding PGC_state_uninitialised for pages
which are "in use" at boot time due to live update.

All I insist on is that you explicitly describe the ABI constraints
that it imposes, if any.

For example, that hack which stops the buddy allocator from giving out
page zero: Could we have live updated from a Xen without that hack, to
a Xen which has it? With page zero already given out to a domain?

With PGC_state_uninitialised and passing the page through
init_heap_pages() if/when it ever gets freed, my answer would be 'yes'.

What's yours? How would we cope with a situation like that, over LU?


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-24 17:55               ` David Woodhouse
@ 2020-03-24 18:34                 ` Julien Grall
  0 siblings, 0 replies; 16+ messages in thread
From: Julien Grall @ 2020-03-24 18:34 UTC (permalink / raw)
  To: David Woodhouse, paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', hongyxia, 'Jan Beulich',
	'Volodymyr Babchuk', 'Roger Pau Monné'



Hi David,

On 24/03/2020 17:55, David Woodhouse wrote:
> On Tue, 2020-03-24 at 10:08 +0000, Julien Grall wrote:
>> Hi David,
>>
>> On 23/03/2020 10:55, David Woodhouse wrote:
>>> On Mon, 2020-03-23 at 09:34 +0000, Julien Grall wrote:
>>>> For liveupdate, we will need a way to initialize a page but mark it as
>>>> already inuse (i.e in the same state as they would be if allocated
>>>> normally).
>>>
>>> I am unconvinced of the veracity of this claim.
>>>
>>> We don't want to turn specific details of the current Xen buddy
>>> allocator into part of the implicit ABI of live update. That goes for
>>> the power-of-two zone boundaries, amongst other things.
>>
>> Why would you do that? Marking the page as already used is no different
>> to "PGC_state_uninitialised", except that the "struct page_info" and the
>> internals of the buddy allocator would be properly set up from the start
>> rather than at free time.
> 
> The internals of the buddy allocator *cannot* be set up properly for a
> page which it would not actually give out — like the zero page.
> 
> We *could* do some tricks to allocate the zone structures for zones
> which *do* exist but contain only "pre-allocated" pages so the buddy
> allocator has never seen those zones... yet.
> 
> 
>> I think using "PGC_state_uninitialised" for preserved pages is an abuse.
>> I understand this already exists in other parts of Xen (particularly on
>> x86), but I would rather not try to add more.
> 
> 
> I am perfectly happy to try avoiding PGC_state_uninitialised for pages
> which are "in use" at boot time due to live update.
> 
> All I insist on is that you explicitly describe the ABI constraints
> that it imposes, if any.

Agreed.

> 
> For example, that hack which stops the buddy allocator from giving out
> page zero: Could we have live updated from a Xen without that hack, to
> a Xen which has it? With page zero already given out to a domain?

The buddy allocator could never have given out page 0 on x86 because it 
is part of the first MB of RAM (see arch_init_memory() in 
arch/x86/mm.c). AFAIK, the first MB cannot be freed.

The change in the buddy allocator was to address the Arm side and also 
to make clear that this is a weakness of the allocator.

> What's yours? How would we cope with a situation like that, over LU?

When do you expect the pages to be carved out from the buddy allocator?

I can see only two situations:
	1) Working around a bug in the allocator.
	2) A CPU erratum requiring certain memory not to be used.

In the case of 1), it is still safe for a domain to run with that page. 
However, we don't want to give it back to the page allocator. A solution 
is to mark such pages as "offlining/broken". Alternatively, you could 
swap the page (see more below).

In the case of 2), it is not safe for a domain to run with that page. So 
it is probably best to swap the page with a new one. For HVM domains 
(including the P2M), it should be easy. For PV domains, I am not entirely 
sure that's feasible.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits
  2020-03-19 21:21 ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits David Woodhouse
  2020-03-19 21:21   ` [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised David Woodhouse
  2020-03-20 13:17   ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits Paul Durrant
@ 2020-03-31 12:00   ` Jan Beulich
  2 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2020-03-31 12:00 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, hongyxia, xen-devel,
	Volodymyr Babchuk, Roger Pau Monné

On 19.03.2020 22:21, David Woodhouse wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -422,7 +422,7 @@ long arch_do_domctl(
>                  if ( page->u.inuse.type_info & PGT_pinned )
>                      type |= XEN_DOMCTL_PFINFO_LPINTAB;
>  
> -                if ( page->count_info & PGC_broken )
> +                if ( page_is_broken(page) )
>                      type = XEN_DOMCTL_PFINFO_BROKEN;

Coming back to an earlier request of mine: There are no locks being
held here afaics, so with

#define page_is_broken(pg)         (pgc_is_broken((pg)->count_info))

and

#define pgc_is_broken(pgc)         (pgc_is(pgc, broken) || \
                                    pgc_is(pgc, broken_offlining))

there's a chance that the page gets transitioned from
broken_offlining to broken (by another CPU) between these two
checks, resulting in wrong returned state. Either the latter macro
uses an intermediate variable and ACCESS_ONCE() or, as suggested
before, enumerators get arranged such that the check can be done
(e.g. using binary masking operations) with a single evaluation of
pgc.

This may or may not also be an issue for the other two pgc_is_*(),
but I think at least for symmetry they should then follow suit. At
the very least all three macros need to be immune to uses like
page_is_offlined(pg++) or similar argument expressions with side
effects.
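
For the ACCESS_ONCE() variant, I mean something like (sketch only,
untested):

#define pgc_is_broken(pgc) ({                                    \
    unsigned long pgc_ = ACCESS_ONCE(pgc);                       \
    pgc_is(pgc_, broken) || pgc_is(pgc_, broken_offlining);      \
})

Alternatively, if broken and broken_offlining were the only two states
with a particular bit of the PGC_state field set, the whole thing would
collapse to a single masked comparison of one read of count_info.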

> @@ -1699,19 +1706,18 @@ unsigned int online_page(mfn_t mfn, uint32_t *status)
>      do {
>          ret = *status = 0;
>  
> -        if ( y & PGC_broken )
> +        if ( pgc_is_broken(y) )
>          {
>              ret = -EINVAL;
> -            *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN;
> +            *status = PG_ONLINE_FAILED | PG_ONLINE_BROKEN;
>              break;
>          }
> -
> -        if ( (y & PGC_state) == PGC_state_offlined )
> +        else if ( pgc_is(y, offlined) )

At the risk of getting flamed again: Even if it was a matter of
taste in new code whether to use "else" in a case like this one,
this isn't new code, and it is in no way necessary to change what
we have for the purpose of this patch. I.e. without even having
to resort to the question of whether personal taste decisions are
to be accepted, this simply falls under "no unrelated /
unnecessary changes please". (FAOD this includes the deletion of
the blank line then as well.)

> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -67,16 +67,32 @@
>   /* 3-bit PAT/PCD/PWT cache-attribute hint. */
>  #define PGC_cacheattr_base PG_shift(6)
>  #define PGC_cacheattr_mask PG_mask(7, 6)
> - /* Page is broken? */
> -#define _PGC_broken       PG_shift(7)
> -#define PGC_broken        PG_mask(1, 7)
> - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */
> -#define PGC_state         PG_mask(3, 9)
> -#define PGC_state_inuse   PG_mask(0, 9)
> -#define PGC_state_offlining PG_mask(1, 9)
> -#define PGC_state_offlined PG_mask(2, 9)
> -#define PGC_state_free    PG_mask(3, 9)
> -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
> + /*
> +  * Mutually-exclusive page states:
> +  * { inuse, offlining, offlined, free, broken_offlining, broken }
> +  */
> +#define PGC_state                  PG_mask(7, 9)
> +#define PGC_state_inuse            PG_mask(0, 9)
> +#define PGC_state_offlining        PG_mask(1, 9)
> +#define PGC_state_offlined         PG_mask(2, 9)
> +#define PGC_state_free             PG_mask(3, 9)
> +#define PGC_state_broken_offlining PG_mask(4, 9) /* Broken and offlining */
> +#define PGC_state_broken           PG_mask(5, 9) /* Broken and offlined */
> +
> +#define pgc_is(pgc, st)            (((pgc) & PGC_state) == PGC_state_##st)
> +#define page_state_is(pg, st)       pgc_is((pg)->count_info, st)
> +
> +#define pgc_is_broken(pgc)         (pgc_is(pgc, broken) || \
> +                                    pgc_is(pgc, broken_offlining))
> +#define pgc_is_offlined(pgc)       (pgc_is(pgc, offlined) || \
> +                                    pgc_is(pgc, broken))
> +#define pgc_is_offlining(pgc)      (pgc_is(pgc, offlining) || \
> +                                    pgc_is(pgc, broken_offlining))
> +
> +#define page_is_broken(pg)         (pgc_is_broken((pg)->count_info))
> +#define page_is_offlined(pg)       (pgc_is_broken((pg)->count_info))
> +#define page_is_offlining(pg)      (pgc_is_broken((pg)->count_info))

Copy-and-paste mistake (rhs is the same for all three; same for Arm)?
Also there's no need here for the outer pairs of parentheses.

Also, for the next version, may I ask that you put the version number in
the subject's tag and that you provide a brief description of changes
from the previous version (if any, but there ought to be some, or there
is little point in sending the series out again)? Thanks.

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised
  2020-03-19 21:21   ` [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised David Woodhouse
  2020-03-20 13:33     ` Paul Durrant
@ 2020-03-31 12:10     ` Jan Beulich
  1 sibling, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2020-03-31 12:10 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, hongyxia, xen-devel,
	Volodymyr Babchuk, Roger Pau Monné

On 19.03.2020 22:21, David Woodhouse wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -491,7 +491,8 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,
>  
>      page_set_owner(page, d);
>      smp_wmb(); /* install valid domain ptr before updating refcnt. */
> -    ASSERT((page->count_info & ~PGC_xen_heap) == 0);
> +    ASSERT((page->count_info & ~PGC_xen_heap) == PGC_state_inuse ||
> +           (page->count_info & ~PGC_xen_heap) == PGC_state_uninitialised);

Like for patch 1, there's a risk of the page transitioning from
uninitialised to inuse between these two comparisons, making the
ASSERT() trigger when it shouldn't. As you've got two more
similar constructs further down in the patch, maybe this also
warrants a helper function/macro?
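
E.g. something along the lines of (name and shape purely illustrative):

#define page_state_is_one_of(pg, mask, st1, st2) ({                  \
    unsigned long ci_ = ACCESS_ONCE((pg)->count_info) & ~(mask);     \
    ci_ == PGC_state_##st1 || ci_ == PGC_state_##st2;                \
})

    ASSERT(page_state_is_one_of(page, PGC_xen_heap, inuse, uninitialised));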

> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -252,6 +252,8 @@ struct bootmem_region {
>  static struct bootmem_region __initdata
>      bootmem_region_list[PAGE_SIZE / sizeof(struct bootmem_region)];
>  static unsigned int __initdata nr_bootmem_regions;
> +static void init_heap_pages(struct page_info *pg, unsigned long nr_pages,
> +                            bool scrub);
>  
>  struct scrub_region {
>      unsigned long offset;
> @@ -1390,6 +1392,17 @@ static void free_heap_pages(
>      ASSERT(order <= MAX_ORDER);
>      ASSERT(node >= 0);
>  
> +    if ( page_state_is(pg, uninitialised) )
> +    {
> +        init_heap_pages(pg, 1 << order, need_scrub);

While, at least for now, it shouldn't matter in practice, and
while code overall is very inconsistent in this regard, with
the respective parameter type being "unsigned long" may I
suggest to use 1UL here?
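
I.e. simply:

        init_heap_pages(pg, 1UL << order, need_scrub);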

Jan


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-03-31 12:11 UTC | newest]

Thread overview: 16+ messages
2020-03-19 21:17 [Xen-devel] [PATCH 0/2] Handle David Woodhouse
2020-03-19 21:21 ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits David Woodhouse
2020-03-19 21:21   ` [Xen-devel] [PATCH 2/2] xen/mm: Introduce PGC_state_uninitialised David Woodhouse
2020-03-20 13:33     ` Paul Durrant
2020-03-20 13:53       ` Jan Beulich
2020-03-20 15:17       ` David Woodhouse
2020-03-23  8:49         ` Paul Durrant
2020-03-23 10:45           ` David Woodhouse
2020-03-23  9:34         ` Julien Grall
2020-03-23 10:55           ` David Woodhouse
2020-03-24 10:08             ` Julien Grall
2020-03-24 17:55               ` David Woodhouse
2020-03-24 18:34                 ` Julien Grall
2020-03-31 12:10     ` Jan Beulich
2020-03-20 13:17   ` [Xen-devel] [PATCH 1/2] xen/mm: fold PGC_broken into PGC_state bits Paul Durrant
2020-03-31 12:00   ` Jan Beulich
