* [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
@ 2012-11-13 22:23 Dan Magenheimer
From: Dan Magenheimer @ 2012-11-13 22:23 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser, Tim Deegan; +Cc: Zhigang Wang, Konrad Wilk, xen-devel

This is a first cut of the hypervisor patch for the proposed
XENMEM_claim_pages hypercall/subop.  The goal of this hypercall
is to determine, atomically and very quickly, whether
sufficient pages are available in the system and, if so, to
"set aside" that quantity of pages for future allocations
by that domain.  Unlike existing hypercalls such as
increase_reservation or populate_physmap, it does not assign
specific physical pageframes to the domain, because that
cannot be done sufficiently quickly (especially for very large
allocations in an arbitrarily fragmented system), and so the
existing mechanisms are subject to classic time-of-check-
time-of-use (TOCTOU) races.  One can think of claiming as a
"lazy" allocation: subsequent hypercalls are still required
to do the actual physical pageframe allocation.

This patch is (so far) only compile-tested.

I don't have a patch for the toolstack side, but I envision
a "xl create --claim" option to maximize backwards
compatibility while minimizing impact on existing toolstacks.
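
To illustrate, the toolstack side might look roughly like the
sketch below.  (xc_domain_claim_pages() is a hypothetical wrapper,
not an existing libxc call, and I'm assuming the usual
do_memory_op() plumbing used by the other XENMEM subops.)

    /* Hypothetical libxc wrapper for the new subop; reuses
     * struct xen_memory_reservation just as the RFC patch does. */
    int xc_domain_claim_pages(xc_interface *xch, uint32_t domid,
                              unsigned long nr_pages, unsigned int flags)
    {
        struct xen_memory_reservation reservation = {
            .nr_extents   = nr_pages,
            .extent_order = 0,            /* claims are order==0 only */
            .mem_flags    = flags,
            .domid        = domid,
        };

        return do_memory_op(xch, XENMEM_claim_pages, &reservation,
                            sizeof(reservation));
    }

    /* In "xl create --claim": stake the claim up front and fail fast,
     * rather than discovering mid-build that memory has run out. */
    if ( xc_domain_claim_pages(xch, domid, target_pages,
                               XENMEM_CLAIMF_free_only) != 0 )
        goto err_out;  /* insufficient memory; abort creation cleanly */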

Code overview:

Though the hypercall simply does arithmetic within locks,
some of the semantics in the code are a bit subtle,
so let me explain.

The key variables (d->unclaimed_pages and total_unclaimed_pages)
start at zero if no claim has yet been staked for any domain.
(Perhaps a better name is "claimed_but_not_yet_possessed" but that's
a bit unwieldy.)  If no claim hypercalls are executed, there
should be no impact on existing usage.

When a claim is successfully staked by a domain, it acts like
a watermark; no record is kept of the original size of the
claim.  Instead, d->unclaimed_pages is set to the difference
between the claim and d->tot_pages (d->tot_pages will normally
be zero, but see Note 1).  For example, if d->tot_pages is 100
when a claim of 1000 pages is staked, d->unclaimed_pages becomes
900.  When d->tot_pages increases or decreases, d->unclaimed_pages
atomically decreases or increases by the same amount.  Once
d->unclaimed_pages reaches zero, the claim is satisfied and
d->unclaimed_pages stays at zero -- unless a new claim is
subsequently staked.  See Note 2.

The systemwide variable total_unclaimed_pages is always the sum
of d->unclaimed_pages across all domains.  A non-domain-specific
heap allocation will fail if satisfying it would leave fewer free
(plus, on tmem-enabled systems, freeable) pages than
total_unclaimed_pages.

Claim semantics may be modified by flags.  The initial
implementation has one flag, which controls whether the
caller would like tmem freeable pages to be considered
in determining whether the claim can be successfully
staked.  Future flags may, for example, specify that the
caller would like the claim to succeed only if sufficient
pages are available on a single node (per Dario's
suggestion).

Note 1: Tim: I'm thinking this may resolve your concern that
the claim mechanism must be more complicated to handle
restricted memory allocations and order>0 allocations.
The proposed claim mechanism only guarantees a quantity of
order==0 pages; if restricted allocations are required, these
are done first by the toolstack, and followed by the claim.
Order>0 allocations still work if memory is not fragmented,
but the claim mechanism doesn't guarantee anything but
a quantity of order==0 pages.

Note 2: Tim: This arithmetic also indirectly implements the
"claim auto-expire" discussed earlier.  We certainly don't
want a negative claim.  It's possible that the toolstack need
never call "unclaim" (and so such a hypercall/subop need
not even exist) as long as domain_reset_unclaimed_pages()
is called when a domain dies.  This draft doesn't even
provide unclaim... though if a reason for its existence
is determined, it should be easy to add.
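
For anyone who wants to see the accounting in action, here is a
tiny standalone simulation of the arithmetic (illustration only,
obviously not hypervisor code; one domain, no locking):

    #include <stdio.h>

    static long tot_pages, unclaimed_pages, total_unclaimed_pages;

    static void claim(long pages)               /* stake a claim */
    {
        unclaimed_pages = pages - tot_pages;
        total_unclaimed_pages += unclaimed_pages;
    }

    static void increase_tot_pages(long pages)  /* domain allocates */
    {
        tot_pages += pages;
        if ( unclaimed_pages )
        {
            long before = unclaimed_pages, after = before - pages;
            long claimed = (before > 0 && after < 0) ? 0 : after;
            total_unclaimed_pages -= before - claimed;
            unclaimed_pages = claimed;          /* clamps at zero */
        }
    }

    int main(void)
    {
        claim(1000);                 /* unclaimed_pages = 1000 */
        increase_tot_pages(400);     /* unclaimed_pages = 600  */
        increase_tot_pages(700);     /* overshoot: clamps to 0 */
        printf("unclaimed=%ld total=%ld\n",
               unclaimed_pages, total_unclaimed_pages);  /* 0 0 */
        return 0;
    }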

Thanks for any feedback!

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

Diffstat:
 arch/x86/mm/mem_sharing.c |    4 -
 common/grant_table.c      |    2 
 common/memory.c           |   15 ++++++-
 common/page_alloc.c       |   93 +++++++++++++++++++++++++++++++++++++++++++++-
 include/public/memory.h   |   19 +++++++++
 include/xen/mm.h          |    7 +++
 include/xen/sched.h       |    1 
 7 files changed, 135 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 5103285..943a3b5 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -639,7 +639,7 @@ static int page_make_sharable(struct domain *d,
     }
 
     page_set_owner(page, dom_cow);
-    d->tot_pages--;
+    domain_decrease_tot_pages(d, 1);
     drop_dom_ref = (d->tot_pages == 0);
     page_list_del(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
@@ -680,7 +680,7 @@ static int page_make_private(struct domain *d, struct page_info *page)
     ASSERT(page_get_owner(page) == dom_cow);
     page_set_owner(page, d);
 
-    if ( d->tot_pages++ == 0 )
+    if ( domain_increase_tot_pages(d, 1) == 0 )
         get_domain(d);
     page_list_add_tail(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 7912769..10ce78f 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1656,7 +1656,7 @@ gnttab_transfer(
         }
 
         /* Okay, add the page to 'e'. */
-        if ( unlikely(e->tot_pages++ == 0) )
+        if ( unlikely(domain_increase_tot_pages(e, 1) == 0) )
             get_knownalive_domain(e);
         page_list_add_tail(page, &e->page_list);
         page_set_owner(page, e);
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 83e2666..fc7b24c 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -454,7 +454,7 @@ static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
                              (j * (1UL << exch.out.extent_order)));
 
                 spin_lock(&d->page_alloc_lock);
-                d->tot_pages -= dec_count;
+                domain_decrease_tot_pages(d, dec_count);
                 drop_dom_ref = (dec_count && !d->tot_pages);
                 spin_unlock(&d->page_alloc_lock);
 
@@ -685,6 +685,19 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case XENMEM_claim_pages:
+
+        rc = rcu_lock_target_domain_by_id(domid, &d);
+        if ( rc )
+            return rc;
+
+        rc = domain_set_unclaimed_pages(d, reservation.nr_extents,
+                                        reservation.mem_flags);
+
+        rcu_unlock_domain(d);
+
+        break;
+
     default:
         rc = arch_memory_op(op, arg);
         break;
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 15ebc66..ddda9a1 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -238,6 +238,90 @@ static long midsize_alloc_zone_pages;
 #define MIDSIZE_ALLOC_FRAC 128
 
 static DEFINE_SPINLOCK(heap_lock);
+static long total_unclaimed_pages; /* total outstanding claims by all domains */
+
+unsigned long domain_increase_tot_pages(struct domain *d, unsigned long pages)
+{
+    long dom_before, dom_after, dom_claimed, sys_before, sys_after;
+
+    ASSERT(spin_is_locked(&d->page_alloc_lock));
+    ASSERT(!spin_is_locked(&heap_lock));
+    spin_lock(&heap_lock);
+    d->tot_pages += pages;
+    if ( d->unclaimed_pages )
+    {
+        dom_before = d->unclaimed_pages;
+        dom_after = dom_before - pages;
+        if ( (dom_before > 0) && (dom_after < 0) )
+            dom_claimed = 0;
+        else
+            dom_claimed = dom_after;
+	sys_before = total_unclaimed_pages;
+	sys_after = sys_before - (dom_before - dom_claimed);
+	BUG_ON( (sys_before > 0) && (sys_after < 0) );
+        total_unclaimed_pages = sys_after;
+        d->unclaimed_pages = dom_claimed;
+    }
+    spin_unlock(&heap_lock);
+    return d->tot_pages;
+}
+
+unsigned long domain_decrease_tot_pages(struct domain *d, unsigned long pages)
+{
+    ASSERT(spin_is_locked(&d->page_alloc_lock));
+    ASSERT(!spin_is_locked(&heap_lock));
+    spin_lock(&heap_lock);
+    d->tot_pages -= pages;
+    if ( d->unclaimed_pages )
+    {
+        d->unclaimed_pages += pages;
+	total_unclaimed_pages += pages;
+    }
+    spin_unlock(&heap_lock);
+    return d->tot_pages;
+}
+
+int domain_set_unclaimed_pages(struct domain *d, unsigned long pages,
+                                unsigned long flags)
+{
+    int ret = -ENOMEM;
+    unsigned long avail_pages;
+
+    ASSERT(!spin_is_locked(&d->page_alloc_lock));
+    ASSERT(!spin_is_locked(&heap_lock));
+    spin_lock(&d->page_alloc_lock);
+    spin_lock(&heap_lock);
+    /* only one active claim per domain please */
+    if ( d->unclaimed_pages )
+	goto out;
+    avail_pages = total_avail_pages;
+    if ( !(flags & XENMEM_CLAIMF_free_only) )
+        avail_pages += tmem_freeable_pages();
+    avail_pages -= total_unclaimed_pages;
+    if ( pages < avail_pages )
+    {
+        /* ok if domain has allocated some pages before making a claim */
+        d->unclaimed_pages = pages - d->tot_pages;
+        total_unclaimed_pages += total_unclaimed_pages;
+        ret = 0;
+    }
+out:
+    spin_unlock(&heap_lock);
+    spin_unlock(&d->page_alloc_lock);
+    return ret;
+}
+
+void domain_reset_unclaimed_pages(struct domain *d)
+{
+    ASSERT(!spin_is_locked(&d->page_alloc_lock));
+    ASSERT(!spin_is_locked(&heap_lock));
+    spin_lock(&d->page_alloc_lock);
+    spin_lock(&heap_lock);
+    total_unclaimed_pages -= d->unclaimed_pages;
+    d->unclaimed_pages = 0;
+    spin_unlock(&heap_lock);
+    spin_unlock(&d->page_alloc_lock);
+}
 
 static unsigned long init_node_heap(int node, unsigned long mfn,
                                     unsigned long nr, bool_t *use_tail)
@@ -442,6 +526,11 @@ static struct page_info *alloc_heap_pages(
 
     spin_lock(&heap_lock);
 
+    /* Claimed memory is considered not available */
+    if ( total_unclaimed_pages + request >
+         total_avail_pages + tmem_freeable_pages() )
+        goto not_found;
+
     /*
      * TMEM: When available memory is scarce due to tmem absorbing it, allow
      * only mid-size allocations to avoid worst of fragmentation issues.
@@ -1291,7 +1380,7 @@ int assign_pages(
         if ( unlikely(d->tot_pages == 0) )
             get_knownalive_domain(d);
 
-        d->tot_pages += 1 << order;
+        domain_increase_tot_pages(d, 1 << order);
     }
 
     for ( i = 0; i < (1 << order); i++ )
@@ -1375,7 +1464,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
             page_list_del2(&pg[i], &d->page_list, &d->arch.relmem_list);
         }
 
-        d->tot_pages -= 1 << order;
+        domain_decrease_tot_pages(d, 1 << order);
         drop_dom_ref = (d->tot_pages == 0);
 
         spin_unlock_recursive(&d->page_alloc_lock);
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index f1ddbc0..d8c8b4c 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -421,6 +421,25 @@ struct xen_mem_sharing_op {
 typedef struct xen_mem_sharing_op xen_mem_sharing_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_mem_sharing_op_t);
 
+/*
+ * Attempt to stake a claim for a domain on a quantity of pages
+ * of system RAM, but _not_ assign specific pageframes.  Only
+ * arithmetic is performed so the hypercall is very fast and need
+ * not be preemptible, thus sidestepping time-of-check-time-of-use
+ * races for memory allocation.  Returns 0 if the hypervisor page
+ * allocator has atomically and successfully claimed the requested
+ * number of pages, else -ENOMEM.
+ */
+#define XENMEM_claim_pages                  24
+/*
+ * XENMEM_claim_pages flags:
+ *  free_only: claim is successful only if sufficient free pages
+ *    are available.  If not set and tmem is enabled, hypervisor
+ *    may also consider tmem "freeable" pages to satisfy the claim.
+ */
+#define _XENMEM_CLAIMF_free_only            0
+#define XENMEM_CLAIMF_free_only             (1U<<_XENMEM_CLAIMF_free_only)
+
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 64a0cc1..120b93f 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -48,6 +48,13 @@ void free_xenheap_pages(void *v, unsigned int order);
 #define alloc_xenheap_page() (alloc_xenheap_pages(0,0))
 #define free_xenheap_page(v) (free_xenheap_pages(v,0))
 
+/* Claim handling */
+unsigned long domain_increase_tot_pages(struct domain *d, unsigned long pages);
+unsigned long domain_decrease_tot_pages(struct domain *d, unsigned long pages);
+int domain_set_unclaimed_pages(
+    struct domain *d, unsigned long pages, unsigned long flags);
+void domain_reset_unclaimed_pages(struct domain *d);
+
 /* Domain suballocator. These functions are *not* interrupt-safe.*/
 void init_domheap_pages(paddr_t ps, paddr_t pe);
 struct page_info *alloc_domheap_pages(
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 6c55039..480ef39 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -242,6 +242,7 @@ struct domain
     struct page_list_head page_list;  /* linked list */
     struct page_list_head xenpage_list; /* linked list (size xenheap_pages) */
     unsigned int     tot_pages;       /* number of pages currently possesed */
+    unsigned int     unclaimed_pages; /* pages claimed but not possessed    */
     unsigned int     max_pages;       /* maximum value for tot_pages        */
     atomic_t         shr_pages;       /* number of shared pages             */
     atomic_t         paged_pages;     /* number of paged-out pages          */

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Keir Fraser @ 2012-11-14  7:23 UTC (permalink / raw)
  To: Dan Magenheimer, Jan Beulich, Tim Deegan
  Cc: Zhigang Wang, Konrad Rzeszutek Wilk, xen-devel

On 13/11/2012 22:23, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> The key variables (d->unclaimed_pages and total_unclaimed_pages)
> start at zero if no claim has yet been staked for any domain.
> (Perhaps a better name is "claimed_but_not_yet_possessed" but that's
> a bit unwieldy.)  If no claim hypercalls are executed, there
> should be no impact on existing usage.

It would be nice if d->tot_pages adjustments didn't take the global
heap_lock in this case. There's probably some way to bail out of those new
update functions before doing the locked work, in that case.

 -- Keir

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Jan Beulich @ 2012-11-14 10:33 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Keir Fraser, Tim Deegan, Konrad Wilk, Zhigang Wang, xen-devel

>>> On 13.11.12 at 23:23, Dan Magenheimer <dan.magenheimer@oracle.com> wrote:
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -454,7 +454,7 @@ static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
>                               (j * (1UL << exch.out.extent_order)));
>  
>                  spin_lock(&d->page_alloc_lock);
> -                d->tot_pages -= dec_count;
> +                domain_decrease_tot_pages(d, dec_count);
>                  drop_dom_ref = (dec_count && !d->tot_pages);
>                  spin_unlock(&d->page_alloc_lock);
>  
> @@ -685,6 +685,19 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case XENMEM_claim_pages:
> +
> +        rc = rcu_lock_target_domain_by_id(domid, &d);

Where does "domid" come from?

> +        if ( rc )
> +            return rc;
> +
> +        rc = domain_set_unclaimed_pages(d, reservation.nr_extents,
> +                                        reservation.mem_flags);

Where do you initialize "reservation"?

And if you're indeed intending to re-use struct
xen_memory_reservation for this sub-function, then you either
ought to honor the other fields, or bail on them not being zero.

> +
> +        rcu_unlock_domain(d);
> +
> +        break;
> +
>      default:
>          rc = arch_memory_op(op, arg);
>          break;
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -238,6 +238,90 @@ static long midsize_alloc_zone_pages;
>  #define MIDSIZE_ALLOC_FRAC 128
>  
>  static DEFINE_SPINLOCK(heap_lock);
> +static long total_unclaimed_pages; /* total outstanding claims by all domains */
> +
> +unsigned long domain_increase_tot_pages(struct domain *d, unsigned long pages)
> +{
> +    long dom_before, dom_after, dom_claimed, sys_before, sys_after;
> +
> +    ASSERT(spin_is_locked(&d->page_alloc_lock));
> +    ASSERT(!spin_is_locked(&heap_lock));
> +    spin_lock(&heap_lock);
> +    d->tot_pages += pages;
> +    if ( d->unclaimed_pages )
> +    {
> +        dom_before = d->unclaimed_pages;
> +        dom_after = dom_before - pages;
> +        if ( (dom_before > 0) && (dom_after < 0) )
> +            dom_claimed = 0;
> +        else
> +            dom_claimed = dom_after;
> +	sys_before = total_unclaimed_pages;
> +	sys_after = sys_before - (dom_before - dom_claimed);
> +	BUG_ON( (sys_before > 0) && (sys_after < 0) );
> +        total_unclaimed_pages = sys_after;
> +        d->unclaimed_pages = dom_claimed;
> +    }
> +    spin_unlock(&heap_lock);
> +    return d->tot_pages;
> +}
> +
> +unsigned long domain_decrease_tot_pages(struct domain *d, unsigned long pages)
> +{
> +    ASSERT(spin_is_locked(&d->page_alloc_lock));
> +    ASSERT(!spin_is_locked(&heap_lock));
> +    spin_lock(&heap_lock);
> +    d->tot_pages -= pages;
> +    if ( d->unclaimed_pages )
> +    {
> +        d->unclaimed_pages += pages;
> +	total_unclaimed_pages += pages;
> +    }
> +    spin_unlock(&heap_lock);
> +    return d->tot_pages;
> +}
> +
> +int domain_set_unclaimed_pages(struct domain *d, unsigned long pages,
> +                                unsigned long flags)
> +{
> +    int ret = -ENOMEM;
> +    unsigned long avail_pages;
> +
> +    ASSERT(!spin_is_locked(&d->page_alloc_lock));
> +    ASSERT(!spin_is_locked(&heap_lock));
> +    spin_lock(&d->page_alloc_lock);
> +    spin_lock(&heap_lock);
> +    /* only one active claim per domain please */
> +    if ( d->unclaimed_pages )
> +	goto out;
> +    avail_pages = total_avail_pages;
> +    if ( !(flags & XENMEM_CLAIMF_free_only) )
> +        avail_pages += tmem_freeable_pages();
> +    avail_pages -= total_unclaimed_pages;
> +    if ( pages < avail_pages )
> +    {
> +        /* ok if domain has allocated some pages before making a claim */
> +        d->unclaimed_pages = pages - d->tot_pages;
> +        total_unclaimed_pages += total_unclaimed_pages;

        total_unclaimed_pages += d->unclaimed_pages;

> +        ret = 0;
> +    }
> +out:
> +    spin_unlock(&heap_lock);
> +    spin_unlock(&d->page_alloc_lock);
> +    return ret;
> +}
> +
> +void domain_reset_unclaimed_pages(struct domain *d)
> +{
> +    ASSERT(!spin_is_locked(&d->page_alloc_lock));
> +    ASSERT(!spin_is_locked(&heap_lock));
> +    spin_lock(&d->page_alloc_lock);
> +    spin_lock(&heap_lock);
> +    total_unclaimed_pages -= d->unclaimed_pages;
> +    d->unclaimed_pages = 0;
> +    spin_unlock(&heap_lock);
> +    spin_unlock(&d->page_alloc_lock);

What is the point of acquiring d->page_alloc_lock here if
d->unclaimed_pages is always manipulated with heap_lock
held anyway?

Also, you mention the need to use this function at least
during domain shutdown, but there is no user of it throughout
the patch.

> +}
>  
>  static unsigned long init_node_heap(int node, unsigned long mfn,
>                                      unsigned long nr, bool_t *use_tail)
> @@ -442,6 +526,11 @@ static struct page_info *alloc_heap_pages(
>  
>      spin_lock(&heap_lock);
>  
> +    /* Claimed memory is considered not available */
> +    if ( total_unclaimed_pages + request >
> +         total_avail_pages + tmem_freeable_pages() )

So how would a domain having an active claim get to allocate
its memory if total_unclaimed_pages == total_avail_pages +
tmem_freeable_pages()?

> +        goto not_found;
> +
>      /*
>       * TMEM: When available memory is scarce due to tmem absorbing it, allow
>       * only mid-size allocations to avoid worst of fragmentation issues.

There are also a few uses of tabs for indentation that would
need cleaning up.

Finally, assuming we really want/need this, shouldn't there be
a way for claims to expire (e.g. in case of toolstack problems),
so that they don't indefinitely prevent (perhaps large) portions
of memory from being used for other purposes (e.g. when a domain
can't get fully cleaned up)?

Jan

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Dan Magenheimer @ 2012-11-14 17:42 UTC (permalink / raw)
  To: Keir Fraser, Jan Beulich, Tim Deegan; +Cc: Zhigang Wang, Konrad Wilk, xen-devel

> From: Keir Fraser [mailto:keir.xen@gmail.com]
> Sent: Wednesday, November 14, 2012 12:24 AM
> To: Dan Magenheimer; Jan Beulich; Tim Deegan
> Cc: xen-devel@lists.xen.org; Zhigang Wang; Konrad Rzeszutek Wilk
> Subject: Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
> 
> On 13/11/2012 22:23, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
> 
> > The key variables (d->unclaimed_pages and total_unclaimed_pages)
> > start at zero if no claim has yet been staked for any domain.
> > (Perhaps a better name is "claimed_but_not_yet_possessed" but that's
> > a bit unwieldy.)  If no claim hypercalls are executed, there
> > should be no impact on existing usage.

Hi Keir --

Thanks for the feedback!

> It would be nice if d->tot_pages adjustments didn't take the global
> heap_lock in this case. There's probably some way to bail out of those new
> update functions before doing the locked work, in that case.

I agree.  I played with atomics for a bit but couldn't
see a way it would work without races.  The generic
pseudo-code is

	if (A) {
		atomically {
			modify A
			modify d->B
		}
	} else {
		modify d->B
	}

Any ideas?

Or possibly if your suggestion from a couple weeks ago (attached
below) is implemented, the heap_lock will always be held so
briefly in alloc/free_heap_pages that taking heap_lock briefly
in domain_de/increase_tot_pages will no longer be an issue?

Thanks,
Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir@xen.org]
> Sent: Friday, November 02, 2012 3:30 AM
> To: Jan Beulich; Dan Magenheimer
> Cc: Olaf Hering; Ian Campbell; George Dunlap; Ian Jackson; George Shuklin; Dario Faggioli; xen-
> devel@lists.xen.org; Konrad Rzeszutek Wilk; Kurt Hackel; Mukesh Rathor; Zhigang Wang; Tim Deegan
> Subject: Re: Proposed new "memory capacity claim" hypercall/feature
> 
> On 02/11/2012 09:01, "Jan Beulich" <JBeulich@suse.com> wrote:
> 
> > Plus, if necessary, that loop could be broken up so that only the
> > initial part of it gets run with the lock held (see c/s
> > 22135:69e8bb164683 for why the unlock was moved past the
> > loop). That would make for a shorter lock hold time, but for a
> > higher allocation latency on large oder allocations (due to worse
> > cache locality).
> 
> In fact I believe only the first page needs to have its count_info set to !=
> PGC_state_free, while the lock is held. That is sufficient to defeat the
> buddy merging in free_heap_pages(). Similarly, we could hoist most of the
> first loop in free_heap_pages() outside the lock. There's a lot of scope for
> optimisation here.
> 
>  -- Keir

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Dan Magenheimer @ 2012-11-14 18:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir Fraser, Tim Deegan, Konrad Wilk, Zhigang Wang, xen-devel

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Subject: Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall

Hi Jan --

Excellent feedback, thanks.  You probably saved me hours
of debugging... always nice to have another pair of eyes
to catch stupid mistakes that I could stare at for a long
time without noticing the insanity. :-)

I'll start working on v2 incorporating your suggestions
(and get some runtime testing completed).

One thing:

> Finally, assuming we really want/need this, shouldn't there be
> a way for claims to expire (e.g. in case of toolstack problems),
> so that they don't indefinitely prevent (perhaps large) portions
> of memory from being used for other purposes (e.g. when a domain
> can't get fully cleaned up)?

I was envisioning (and thanks for catching that I had neglected
it in the patch) that the reset_claim call would be done very
early in the domain death sequence.

But maybe that's a good reason to add back in the "manual"
unclaim hypercall/subop, so a toolstack can force the
claim to zero.  I realized I can easily add that by
allowing an existing claim to be "overridden" by a
claim of zero pages, so another subop isn't even needed.
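
I.e., roughly this at the top of domain_set_unclaimed_pages(),
just after both locks are taken (untested sketch):

    /* a claim of zero pages cancels ("unclaims") any outstanding claim */
    if ( pages == 0 )
    {
        total_unclaimed_pages -= d->unclaimed_pages;
        d->unclaimed_pages = 0;
        ret = 0;
        goto out;
    }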

Dan

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Keir Fraser @ 2012-11-14 18:58 UTC (permalink / raw)
  To: Dan Magenheimer, Jan Beulich, Tim Deegan
  Cc: Zhigang Wang, Konrad Rzeszutek Wilk, xen-devel

On 14/11/2012 17:42, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

>> It would be nice if d->tot_pages adjustments didn't take the global
>> heap_lock in this case. There's probably some way to bail out of those new
>> update functions before doing the locked work, in that case.
> 
> I agree.  I played with atomics for a bit but couldn't
> see a way it would work without races.  The generic
> pseudo-code is
> 
> if (A) {
> atomically {
> modify A
> modify d->B
> }
> } else {
> modify d->B
> }
> 
> Any ideas?

E.g.,
    /* Fast path (new). */
    if ( !d->unclaimed_pages )
        return d->tot_pages += pages;
    /* Slower path (as in your patch). */
    spin_lock(&heap_lock);
    d->tot_pages += pages;
    if ( d->unclaimed_pages )
        ....

The only danger is around any necessary interlocking with d->unclaimed_pages
becoming non-zero. But the fast-path test is still under d->page_alloc_lock,
and you set up d->unclaimed_pages under that lock, so it looks okay to me.

 -- Keir

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Dan Magenheimer @ 2012-11-14 19:32 UTC (permalink / raw)
  To: Keir Fraser, Jan Beulich, Tim Deegan; +Cc: Zhigang Wang, xen-devel, Konrad Wilk

> From: Keir Fraser [mailto:keir.xen@gmail.com]
> Subject: Re: [Xen-devel] [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
> 
> On 14/11/2012 17:42, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
> 
> >> It would be nice if d->tot_pages adjustments didn't take the global
> >> heap_lock in this case. There's probably some way to bail out of those new
> >> update functions before doing the locked work, in that case.
> >
> > I agree.  I played with atomics for a bit but couldn't
> > see a way it would work without races.  The generic
> > pseudo-code is
> >
> > if (A) {
> > atomically {
> > modify A
> > modify d->B
> > }
> > } else {
> > modify d->B
> > }
> >
> > Any ideas?
> 
> E.g.,
>     /* Fast path (new). */
>     if ( !d->unclaimed_pages )
>         return d->tot_pages += pages;
>     /* Slower path (as in your patch). */
>     spin_lock(&heap_lock);
>     d->tot_pages += pages;
>     if ( d->unclaimed_pages )
>         ....
> 
> The only danger is around any necessary interlocking with d->unclaimed_pages
> becoming non-zero. But the fast-path test is still under d->page_alloc_lock,
> and you set up d->unclaimed_pages under that lock, so it looks okay to me.

Hmmm, nice, I think you are right.  I had just removed the
d->page_alloc_lock in the "set" code at Jan's suggestion,
but this is a good reason to keep it there.  I'll also
add a comment to document this subtle lock interdependency.
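
Concretely, I'm thinking the result will look something like this
(untested sketch for v2):

    unsigned long domain_increase_tot_pages(struct domain *d,
                                            unsigned long pages)
    {
        long dom_before, dom_after, dom_claimed, sys_before, sys_after;

        ASSERT(spin_is_locked(&d->page_alloc_lock));

        /* Fast path: d->unclaimed_pages only becomes non-zero with
         * d->page_alloc_lock held (in domain_set_unclaimed_pages),
         * and our caller holds it, so this test is race-free. */
        if ( !d->unclaimed_pages )
            return d->tot_pages += pages;

        /* Slow path: claim accounting requires the heap_lock. */
        spin_lock(&heap_lock);
        d->tot_pages += pages;
        dom_before = d->unclaimed_pages;
        dom_after = dom_before - pages;
        dom_claimed = ((dom_before > 0) && (dom_after < 0)) ? 0 : dom_after;
        sys_before = total_unclaimed_pages;
        sys_after = sys_before - (dom_before - dom_claimed);
        BUG_ON( (sys_before > 0) && (sys_after < 0) );
        total_unclaimed_pages = sys_after;
        d->unclaimed_pages = dom_claimed;
        spin_unlock(&heap_lock);
        return d->tot_pages;
    }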

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Ian Campbell @ 2012-11-15 10:47 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Keir (Xen.org), Konrad Wilk, Tim (Xen.org),
	xen-devel, Jan Beulich, Zhigang Wang

On Tue, 2012-11-13 at 22:23 +0000, Dan Magenheimer wrote:
> This is a first cut of the hypervisor patch of the proposed
> XENMEM_claim_pages hypercall/subop.

Who is expected to be able to call this? I think a XENMEM subop implies
that the guest itself, plus perhaps any controlling stubdom (e.g. qemu),
could call it, which I don't think is desirable.

Should this be limited to the toolstack only and therefore be a subop of
domctl or some other hypercall?

Ian.

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Jan Beulich @ 2012-11-15 11:23 UTC (permalink / raw)
  To: Ian Campbell, Dan Magenheimer
  Cc: Keir (Xen.org), Tim (Xen.org), Konrad Wilk, Zhigang Wang, xen-devel

>>> On 15.11.12 at 11:47, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2012-11-13 at 22:23 +0000, Dan Magenheimer wrote:
>> This is a first cut of the hypervisor patch of the proposed
>> XENMEM_claim_pages hypercall/subop.
> 
> Who is expected to be able to call this? I think a XENMEM subop implies
> that the guest itself, plus perhaps any controlling stubdom (e.g. qemu) 
> could call
> it, which I don't think is desirable.
> 
> Should this be limited to the toolstack only and therefore be a subop of
> domctl or some other hypercall?

Oh, yes, absolutely. Whether making it a domctl (where it doesn't
belong except for that aspect) I'm not that sure, though.

Jan

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Keir Fraser @ 2012-11-15 12:12 UTC (permalink / raw)
  To: Jan Beulich, Ian Campbell, Dan Magenheimer
  Cc: Tim (Xen.org), Konrad Rzeszutek Wilk, Zhigang Wang, xen-devel

On 15/11/2012 11:23, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 15.11.12 at 11:47, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> On Tue, 2012-11-13 at 22:23 +0000, Dan Magenheimer wrote:
>>> This is a first cut of the hypervisor patch of the proposed
>>> XENMEM_claim_pages hypercall/subop.
>> 
>> Who is expected to be able to call this? I think a XENMEM subop implies
>> that the guest itself, plus perhaps any controlling stubdom (e.g. qemu)
>> could call
>> it, which I don't think is desirable.
>> 
>> Should this be limited to the toolstack only and therefore be a subop of
>> domctl or some other hypercall?
> 
> Oh, yes, absolutely. Whether making it a domctl (where it doesn't
> belong except for that aspect) I'm not that sure, though.

It can stay as a XENMEM_ subop, but it should be in the
__XEN__||__XEN_TOOLS__ section of public/memory.h indicating that it is
toolstack only (regardless of whether we actually enforce that), and that
the interface is not 100% guaranteed to remain compatible (since it is not
part of the general guest ABI).

 -- Keir

> Jan
> 

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Ian Campbell @ 2012-11-15 12:17 UTC (permalink / raw)
  To: Keir Fraser
  Cc: Dan Magenheimer, Konrad Rzeszutek Wilk, Tim (Xen.org),
	xen-devel, Jan Beulich, Zhigang Wang

On Thu, 2012-11-15 at 12:12 +0000, Keir Fraser wrote:
> On 15/11/2012 11:23, "Jan Beulich" <JBeulich@suse.com> wrote:
> 
> >>>> On 15.11.12 at 11:47, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> On Tue, 2012-11-13 at 22:23 +0000, Dan Magenheimer wrote:
> >>> This is a first cut of the hypervisor patch of the proposed
> >>> XENMEM_claim_pages hypercall/subop.
> >> 
> >> Who is expected to be able to call this? I think a XENMEM subop implies
> >> that the guest itself, plus perhaps any controlling stubdom (e.g. qemu)
> >> could call
> >> it, which I don't think is desirable.
> >> 
> >> Should this be limited to the toolstack only and therefore be a subop of
> >> domctl or some other hypercall?
> > 
> > Oh, yes, absolutely. Whether making it a domctl (where it doesn't
> > belong except for that aspect) I'm not that sure, though.
> 
> It can stay as a XENMEM_ subop, but it should be in the
> __XEN__||__XEN_TOOLS__ section of public/memory.h indicating that it is
> toolstack only (regardless of whether we actually enforce that), and that
> the interface is not 100% guaranteed to remain compatible (since it is not
> part of the general guest ABI).

We need to be mindful of what a guest who calls this anyway can achieve.
Is it limited by d->max_pages, IOW similar in principle to
XENMEM_populate_physmap? If so then it is probably acceptable.

Ian.

* Re: [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
From: Dan Magenheimer @ 2012-11-15 19:02 UTC (permalink / raw)
  To: Jan Beulich, Ian Campbell
  Cc: Keir (Xen.org), Tim (Xen.org), Konrad Wilk, Zhigang Wang, xen-devel

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, November 15, 2012 4:24 AM
> To: Ian Campbell; Dan Magenheimer
> Cc: xen-devel@lists.xen.org; Konrad Wilk; Zhigang Wang; Keir (Xen.org); Tim (Xen.org)
> Subject: Re: [Xen-devel] [RFC/PATCH] XENMEM_claim_pages (subop of an existing) hypercall
> 
> >>> On 15.11.12 at 11:47, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2012-11-13 at 22:23 +0000, Dan Magenheimer wrote:
> >> This is a first cut of the hypervisor patch of the proposed
> >> XENMEM_claim_pages hypercall/subop.
> >
> > Who is expected to be able to call this? I think a XENMEM subop implies
> > that the guest itself, plus perhaps any controlling stubdom (e.g. qemu)
> > could call
> > it, which I don't think is desirable.
> >
> > Should this be limited to the toolstack only and therefore be a subop of
> > domctl or some other hypercall?
> 
> Oh, yes, absolutely. Whether making it a domctl (where it doesn't
> belong except for that aspect) I'm not that sure, though.

I've added an IS_PRIV(current->domain) check in v3.  If this is
incorrect or insufficient, please let me know!
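
For concreteness, the v3 check looks something like this (sketch;
the -EPERM return value is my assumption, not settled):

    case XENMEM_claim_pages:

        /* restrict this subop to privileged (toolstack) callers */
        if ( !IS_PRIV(current->domain) )
            return -EPERM;

        rc = rcu_lock_target_domain_by_id(domid, &d);
        if ( rc )
            return rc;

        rc = domain_set_unclaimed_pages(d, reservation.nr_extents,
                                        reservation.mem_flags);

        rcu_unlock_domain(d);

        break;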

Thanks,
Dan
