* [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
@ 2015-10-08 20:57 Tamas K Lengyel
  2015-10-08 20:57 ` [PATCH v2 2/2] tests/mem-sharing: Add bulk option to memshrtool Tamas K Lengyel
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Tamas K Lengyel @ 2015-10-08 20:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tamas K Lengyel, Wei Liu, Ian Campbell, Stefano Stabellini,
	George Dunlap, Andrew Cooper, Ian Jackson, Jan Beulich,
	Tamas K Lengyel, Keir Fraser

Currently mem-sharing can be performed on a page-by-page basis from the control
domain. However, completely deduplicating (cloning) a VM this way requires
at least 3 hypercalls per page. As the user has to loop through all pages up
to max_gpfn, this process is very slow and wasteful.

This patch introduces a new mem_sharing memop for bulk deduplication, where
the user doesn't have to separately nominate each page in both the source and
destination domain; the looping over all pages happens in the hypervisor.
This significantly reduces the overhead of completely deduplicating entire
domains.
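
For reference, the per-page flow this memop replaces can be sketched as below.
The functions are stand-in stubs that merely count modelled hypercalls (they
are not the real libxc calls); the point is only the 3-hypercalls-per-page
cost of the old loop versus a single memop:

```c
/* Stand-in stubs, not the real libxc API: each call models one
 * hypercall into Xen and is counted; nothing is actually shared. */
static unsigned long hypercalls;

static int nominate_gfn_stub(int domid, unsigned long gfn)
{
    (void)domid; (void)gfn;
    ++hypercalls;                 /* models XENMEM_sharing_op_nominate_gfn */
    return 0;
}

static int share_gfns_stub(int sdom, unsigned long sgfn,
                           int cdom, unsigned long cgfn)
{
    (void)sdom; (void)sgfn; (void)cdom; (void)cgfn;
    ++hypercalls;                 /* models XENMEM_sharing_op_share */
    return 0;
}

/* Old flow: the control domain loops over every gfn itself. */
unsigned long dedup_per_page_cost(unsigned long max_gpfn)
{
    hypercalls = 0;
    for (unsigned long gfn = 0; gfn <= max_gpfn; ++gfn)
    {
        nominate_gfn_stub(1, gfn);        /* nominate in the source */
        nominate_gfn_stub(2, gfn);        /* nominate in the client */
        share_gfns_stub(1, gfn, 2, gfn);  /* share the two pages    */
    }
    return hypercalls;
}

/* New flow: one memop; the same loop runs inside the hypervisor
 * (ignoring continuations). */
unsigned long dedup_bulk_cost(void)
{
    hypercalls = 0;
    ++hypercalls;                 /* models XENMEM_sharing_op_bulk_share */
    return hypercalls;
}
```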

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
v2: Stash hypercall continuation start point in xen_mem_sharing_op_t
    Return number of successfully shared pages in xen_mem_sharing_op_t
---
 tools/libxc/include/xenctrl.h | 13 +++++++
 tools/libxc/xc_memshr.c       | 19 ++++++++++
 xen/arch/x86/mm/mem_sharing.c | 84 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/memory.h   |  9 ++++-
 4 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 3bfa00b..2476ebc 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2594,6 +2594,19 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
                     domid_t client_domain,
                     unsigned long client_gfn);
 
+/* Deduplicate the entire memory of a client domain in bulk. Using this
+ * function is equivalent to calling xc_memshr_nominate_gfn for each gfn
+ * in the two domains followed by xc_memshr_share_gfns. If successful,
+ * returns the number of shared pages in 'shared'.
+ *
+ * May fail with -EINVAL if the source and client domain have different
+ * memory size or if memory sharing is not enabled on either of the domains.
+ */
+int xc_memshr_bulk_share(xc_interface *xch,
+                         domid_t source_domain,
+                         domid_t client_domain,
+                         uint64_t *shared);
+
 /* Debug calls: return the number of pages referencing the shared frame backing
  * the input argument. Should be one or greater. 
  *
diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
index deb0aa4..71350d2 100644
--- a/tools/libxc/xc_memshr.c
+++ b/tools/libxc/xc_memshr.c
@@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
     return xc_memshr_memop(xch, source_domain, &mso);
 }
 
+int xc_memshr_bulk_share(xc_interface *xch,
+                         domid_t source_domain,
+                         domid_t client_domain,
+                         uint64_t *shared)
+{
+    int rc;
+    xen_mem_sharing_op_t mso;
+
+    memset(&mso, 0, sizeof(mso));
+
+    mso.op = XENMEM_sharing_op_bulk_share;
+    mso.u.bulk.client_domain = client_domain;
+
+    rc = xc_memshr_memop(xch, source_domain, &mso);
+    if ( !rc && shared ) *shared = mso.u.bulk.shared;
+
+    return rc;
+}
+
 int xc_memshr_domain_resume(xc_interface *xch,
                             domid_t domid)
 {
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index a95e105..4cdddb1 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
     return rc;
 }
 
+static int bulk_share(struct domain *d, struct domain *cd, unsigned long max,
+                      struct mem_sharing_op_bulk_share *bulk)
+{
+    int rc = 0;
+    shr_handle_t sh, ch;
+
+    while( bulk->start <= max )
+    {
+        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )
+            goto next;
+
+        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0 )
+            goto next;
+
+        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd, bulk->start, ch) )
+            ++(bulk->shared);
+
+next:
+        ++(bulk->start);
+
+        /* Check for continuation if it's not the last iteration. */
+        if ( bulk->start < max && hypercall_preempt_check() )
+        {
+            rc = 1;
+            break;
+        }
+    }
+
+    return rc;
+}
+
 int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
 {
     int rc;
@@ -1467,6 +1498,59 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
         }
         break;
 
+        case XENMEM_sharing_op_bulk_share:
+        {
+            unsigned long max_sgfn, max_cgfn;
+            struct domain *cd;
+
+            rc = -EINVAL;
+            if ( !mem_sharing_enabled(d) )
+                goto out;
+
+            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
+                                                   &cd);
+            if ( rc )
+                goto out;
+
+            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
+            if ( rc )
+            {
+                rcu_unlock_domain(cd);
+                goto out;
+            }
+
+            if ( !mem_sharing_enabled(cd) )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
+
+            max_sgfn = domain_get_maximum_gpfn(d);
+            max_cgfn = domain_get_maximum_gpfn(cd);
+
+            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
+
+            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
+            if ( rc )
+            {
+                if ( __copy_to_guest(arg, &mso, 1) )
+                    rc = -EFAULT;
+                else
+                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
+                                                       "lh", XENMEM_sharing_op,
+                                                       arg);
+            }
+
+            rcu_unlock_domain(cd);
+        }
+        break;
+
         case XENMEM_sharing_op_debug_gfn:
         {
             unsigned long gfn = mso.u.debug.u.gfn;
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 320de91..0299746 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -447,6 +447,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
 #define XENMEM_sharing_op_debug_gref        5
 #define XENMEM_sharing_op_add_physmap       6
 #define XENMEM_sharing_op_audit             7
+#define XENMEM_sharing_op_bulk_share        8
 
 #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
 #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
@@ -482,7 +483,13 @@ struct xen_mem_sharing_op {
             uint64_aligned_t client_gfn;    /* IN: the client gfn */
             uint64_aligned_t client_handle; /* IN: handle to the client page */
             domid_t  client_domain; /* IN: the client domain id */
-        } share; 
+        } share;
+        struct mem_sharing_op_bulk_share {   /* OP_BULK_SHARE */
+            domid_t client_domain;           /* IN: the client domain id */
+            uint64_aligned_t start;          /* IN: start gfn (normally 0) */
+            uint64_aligned_t shared;         /* OUT: the number of
+                                                successfully shared pages */
+        } bulk;
         struct mem_sharing_op_debug {     /* OP_DEBUG_xxx */
             union {
                 uint64_aligned_t gfn;      /* IN: gfn to debug          */
-- 
2.1.4


* [PATCH v2 2/2] tests/mem-sharing: Add bulk option to memshrtool
  2015-10-08 20:57 [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Tamas K Lengyel
@ 2015-10-08 20:57 ` Tamas K Lengyel
  2015-10-09  7:51 ` [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Jan Beulich
  2015-10-09 13:26 ` Andrew Cooper
  2 siblings, 0 replies; 11+ messages in thread
From: Tamas K Lengyel @ 2015-10-08 20:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tamas K Lengyel, Wei Liu, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Tamas K Lengyel

Add the bulk option to the test tool to perform complete deduplication
between domains.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/tests/mem-sharing/memshrtool.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
index 6454bc3..c5e5cb8 100644
--- a/tools/tests/mem-sharing/memshrtool.c
+++ b/tools/tests/mem-sharing/memshrtool.c
@@ -23,6 +23,8 @@ static int usage(const char* prog)
     printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
     printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
     printf("                          - Share two pages.\n");
+    printf("  bulk <source-domid> <destination-domid>\n");
+    printf("                          - Share all pages between domains.\n");
     printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
     printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
     printf("                          - Populate a page in a domain with a shared page.\n");
@@ -179,6 +181,26 @@ int main(int argc, const char** argv)
         }
         printf("Audit returned %d errors.\n", rc);
     }
+    else if( !strcasecmp(cmd, "bulk") )
+    {
+        domid_t sdomid, cdomid;
+        int rc;
+        uint64_t shared;
+
+        if( argc != 4 )
+            return usage(argv[0]);
+
+        sdomid = strtol(argv[2], NULL, 0);
+        cdomid = strtol(argv[3], NULL, 0);
 
+        rc = xc_memshr_bulk_share(xch, sdomid, cdomid, &shared);
+        if ( rc < 0 )
+        {
+            printf("error executing xc_memshr_bulk_share: %s\n", strerror(errno));
+            return rc;
+        }
+
+    printf("Successfully shared %lu pages between the domains\n", (unsigned long)shared);
+    }
     return 0;
 }
-- 
2.1.4


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-08 20:57 [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Tamas K Lengyel
  2015-10-08 20:57 ` [PATCH v2 2/2] tests/mem-sharing: Add bulk option to memshrtool Tamas K Lengyel
@ 2015-10-09  7:51 ` Jan Beulich
  2015-10-09 17:55   ` Tamas K Lengyel
  2015-10-09 13:26 ` Andrew Cooper
  2 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2015-10-09  7:51 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Tamas K Lengyel, xen-devel,
	Keir Fraser

>>> On 08.10.15 at 22:57, <tamas@tklengyel.com> wrote:
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
>      return rc;
>  }
>  
> +static int bulk_share(struct domain *d, struct domain *cd, unsigned long max,
> +                      struct mem_sharing_op_bulk_share *bulk)
> +{
> +    int rc = 0;
> +    shr_handle_t sh, ch;
> +
> +    while( bulk->start <= max )
> +    {
> +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )
> +            goto next;
> +
> +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0 )
> +            goto next;
> +
> +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd, bulk->start, ch) )
> +            ++(bulk->shared);

Pointless parentheses.

> +next:

Labels indented by at least one space please.

> +        ++(bulk->start);
> +
> +        /* Check for continuation if it's not the last iteration. */
> +        if ( bulk->start < max && hypercall_preempt_check() )
> +        {
> +            rc = 1;
> +            break;

This could simply be a return statement, avoiding the need for a
local variable (the type of which would need to be changed, see
below).

> +        }
> +    }
> +
> +    return rc;
> +}

So effectively the function right now returns a boolean. If that's
intended, its return type should reflect that (but I then wonder
whether the sense of the values shouldn't be inverted, to have
"true" mean "done").
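
The suggested shape would be something like the following sketch (the actual
sharing work and the preemption check are stand-ins; a fixed per-invocation
budget models hypercall_preempt_check()):

```c
#include <stdbool.h>

/* true means "done", false means "preempted, continue later".
 * 'budget' stands in for hypercall_preempt_check() firing. */
static bool bulk_share_done(unsigned long *start, unsigned long max,
                            unsigned long budget)
{
    while (*start <= max)
    {
        /* ... nominate/share the page at *start here ... */
        ++*start;
        if (*start <= max && --budget == 0)
            return false;    /* not the last iteration: preempt */
    }
    return true;
}
```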

> @@ -1467,6 +1498,59 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>          }
>          break;
>  
> +        case XENMEM_sharing_op_bulk_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;
> +
> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )

Are both domains required to be paused for this operation? If so,
shouldn't you enforce this? If not, by the time you reach the if()
the values are stale.

> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
> +            if ( rc )

With the boolean nature the use of "rc" here is rather confusing;
I'd suggest avoiding the use of an intermediate variable in this case.

> +            {
> +                if ( __copy_to_guest(arg, &mso, 1) )
> +                    rc = -EFAULT;
> +                else
> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
> +                                                       "lh", XENMEM_sharing_op,
> +                                                       arg);
> +            }
> +
> +            rcu_unlock_domain(cd);
> +        }
> +        break;
> +
>          case XENMEM_sharing_op_debug_gfn:
>          {
>              unsigned long gfn = mso.u.debug.u.gfn;
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 320de91..0299746 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -447,6 +447,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>  #define XENMEM_sharing_op_debug_gref        5
>  #define XENMEM_sharing_op_add_physmap       6
>  #define XENMEM_sharing_op_audit             7
> +#define XENMEM_sharing_op_bulk_share        8
>  
>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
> @@ -482,7 +483,13 @@ struct xen_mem_sharing_op {
>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>              domid_t  client_domain; /* IN: the client domain id */
> -        } share; 
> +        } share;
> +        struct mem_sharing_op_bulk_share {   /* OP_BULK_SHARE */

Is the _share suffix really useful here? Even more, if you dropped
it and renamed "shared" below to "done", the same structure could
be used for a hypothetical bulk-unshare operation.

> +            domid_t client_domain;           /* IN: the client domain id */
> +            uint64_aligned_t start;          /* IN: start gfn (normally 0) */

Marking this as just IN implies that the value doesn't change across
the operation.

> +            uint64_aligned_t shared;         /* OUT: the number of
> +                                                successfully shared pages */

For this to be correct, the value is required to be zero initialized by
the caller at present. Either this needs to be documented, or you
should zero the field on any non-continuation invocation.
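
A sketch of the second option, using a stand-in struct rather than the real
xen_mem_sharing_op_t layout: the field is cleared whenever start == 0, i.e.
on any non-continuation invocation, so a caller-provided garbage value cannot
leak into the result:

```c
#include <stdint.h>

/* Stand-in for the bulk sub-struct, not the real public interface. */
struct bulk_sketch {
    uint64_t start;   /* IN/OUT: doubles as the continuation point */
    uint64_t shared;  /* OUT: zeroed below on the first invocation */
};

/* Shares up to 'budget' pages per invocation (the budget stands in
 * for hypercall_preempt_check()); returns 1 if a continuation is
 * needed, 0 when done. */
int bulk_op_sketch(struct bulk_sketch *bulk, uint64_t max, uint64_t budget)
{
    if (bulk->start == 0)     /* not a continuation: don't trust caller */
        bulk->shared = 0;

    while (bulk->start <= max)
    {
        ++bulk->shared;       /* stand-in for a successful share */
        ++bulk->start;
        if (bulk->start <= max && --budget == 0)
            return 1;         /* preempted: resume later from ->start */
    }
    return 0;
}
```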

Jan


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-08 20:57 [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Tamas K Lengyel
  2015-10-08 20:57 ` [PATCH v2 2/2] tests/mem-sharing: Add bulk option to memshrtool Tamas K Lengyel
  2015-10-09  7:51 ` [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Jan Beulich
@ 2015-10-09 13:26 ` Andrew Cooper
  2015-10-09 18:13   ` Tamas K Lengyel
  2 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2015-10-09 13:26 UTC (permalink / raw)
  To: Tamas K Lengyel, xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tamas K Lengyel, Keir Fraser

On 08/10/15 21:57, Tamas K Lengyel wrote:
> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> index a95e105..4cdddb1 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
>      return rc;
>  }
>  
> +static int bulk_share(struct domain *d, struct domain *cd, unsigned long max,
> +                      struct mem_sharing_op_bulk_share *bulk)
> +{
> +    int rc = 0;
> +    shr_handle_t sh, ch;
> +
> +    while( bulk->start <= max )
> +    {
> +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )

This swallows the error from mem_sharing_nominate_page().  Some errors
might be safe to ignore in this context, but ones like ENOMEM most
certainly are not.

You should record the error into rc and switch ( rc ) to ignore/process
the error, passing hard errors straight up.
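
Something along these lines (a self-contained stand-in, with nominate stubbed
out and -EINVAL standing in for a benign per-page failure):

```c
#include <errno.h>

/* Stub standing in for mem_sharing_nominate_page(): odd gfns fail
 * benignly, gfn 7 simulates a hard allocation failure. */
static int nominate_stub(unsigned long gfn)
{
    if (gfn == 7)
        return -ENOMEM;   /* hard error: must not be swallowed */
    if (gfn % 2)
        return -EINVAL;   /* benign: page simply not shareable */
    return 0;
}

/* Returns 0 when the range is processed, or the first hard error.
 * '*shared' counts the successes. */
int bulk_share_sketch(unsigned long max, unsigned long *shared)
{
    *shared = 0;
    for (unsigned long gfn = 0; gfn <= max; ++gfn)
    {
        int rc = nominate_stub(gfn);
        switch ( rc )
        {
        case 0:
            ++*shared;
            break;
        case -EINVAL:     /* safe to ignore: skip this gfn */
            continue;
        default:          /* e.g. -ENOMEM: pass straight up */
            return rc;
        }
    }
    return 0;
}
```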

> +            goto next;
> +
> +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0 )
> +            goto next;
> +
> +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd, bulk->start, ch) )
> +            ++(bulk->shared);
> +
> +next:
> +        ++(bulk->start);
> +
> +        /* Check for continuation if it's not the last iteration. */
> +        if ( bulk->start < max && hypercall_preempt_check() )
> +        {
> +            rc = 1;

Using -ERESTART here allows...

> +            break;
> +        }
> +    }
> +
> +    return rc;
> +}
> +
>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>  {
>      int rc;
> @@ -1467,6 +1498,59 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>          }
>          break;
>  
> +        case XENMEM_sharing_op_bulk_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;
> +
> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
> +            if ( rc )

... this check to be selective.

> +            {
> +                if ( __copy_to_guest(arg, &mso, 1) )

This __copy_to_guest() needs to happen unconditionally before creating
the continuation, as it contains the continuation information.

It also needs to happen on the success path, so .shared is correct.
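
The resulting ordering can be sketched like this (all Xen calls replaced by
recording stubs): the guest copy happens on every exit path, and only
afterwards is a continuation created if one is needed:

```c
#include <errno.h>

static int copied;     /* was mso copied back to the guest?     */
static int continued;  /* was a hypercall continuation created? */

static int copy_to_guest_stub(void)        { copied = 1;    return 0; }
static int create_continuation_stub(void)  { continued = 1; return 0; }

/* preempted: nonzero if the bulk loop stopped early.
 * The copy-back runs on *every* path, before any continuation, so
 * both 'shared' and the restart point in 'start' reach the guest. */
int finish_bulk_sketch(int preempted)
{
    copied = continued = 0;

    if (copy_to_guest_stub())
        return -EFAULT;

    if (preempted)
        return create_continuation_stub();

    return 0;
}
```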

~Andrew


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-09  7:51 ` [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Jan Beulich
@ 2015-10-09 17:55   ` Tamas K Lengyel
  2015-10-12  6:42     ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Tamas K Lengyel @ 2015-10-09 17:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Tamas K Lengyel, Xen-devel,
	Keir Fraser



On Fri, Oct 9, 2015 at 1:51 AM, Jan Beulich <JBeulich@suse.com> wrote:

> >>> On 08.10.15 at 22:57, <tamas@tklengyel.com> wrote:
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
> >      return rc;
> >  }
> >
> > +static int bulk_share(struct domain *d, struct domain *cd, unsigned
> long max,
> > +                      struct mem_sharing_op_bulk_share *bulk)
> > +{
> > +    int rc = 0;
> > +    shr_handle_t sh, ch;
> > +
> > +    while( bulk->start <= max )
> > +    {
> > +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )
> > +            goto next;
> > +
> > +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0 )
> > +            goto next;
> > +
> > +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd,
> bulk->start, ch) )
> > +            ++(bulk->shared);
>
> Pointless parentheses.
>
>
Pointless but harmless and I like this style better.


> > +next:
>
> Labels indented by at least one space please.
>

Ack.


>
> > +        ++(bulk->start);
> > +
> > +        /* Check for continuation if it's not the last iteration. */
> > +        if ( bulk->start < max && hypercall_preempt_check() )
> > +        {
> > +            rc = 1;
> > +            break;
>
> This could simple be a return statement, avoiding the need for a
> local variable (the type of which would need to be changed, see
> below).
>

It's reused in the caller to indicate where the mso copyback happens and rc
is of type int in the caller.


>
> > +        }
> > +    }
> > +
> > +    return rc;
> > +}
>
> So effectively the function right now returns a boolean. If that's
> intended, its return type should reflect that (but I then wonder
> whether the sense of the values shouldn't be inverted, to have
> "true" mean "done").
>

It does, but its return value is assigned to rc, which is used to decide
where the copyback happens.


>
> > @@ -1467,6 +1498,59 @@ int
> mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
> >          }
> >          break;
> >
> > +        case XENMEM_sharing_op_bulk_share:
> > +        {
> > +            unsigned long max_sgfn, max_cgfn;
> > +            struct domain *cd;
> > +
> > +            rc = -EINVAL;
> > +            if ( !mem_sharing_enabled(d) )
> > +                goto out;
> > +
> > +            rc =
> rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> > +                                                   &cd);
> > +            if ( rc )
> > +                goto out;
> > +
> > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> > +            if ( rc )
> > +            {
> > +                rcu_unlock_domain(cd);
> > +                goto out;
> > +            }
> > +
> > +            if ( !mem_sharing_enabled(cd) )
> > +            {
> > +                rcu_unlock_domain(cd);
> > +                rc = -EINVAL;
> > +                goto out;
> > +            }
> > +
> > +            max_sgfn = domain_get_maximum_gpfn(d);
> > +            max_cgfn = domain_get_maximum_gpfn(cd);
> > +
> > +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
>
> Are both domains required to be paused for this operation? If so,
> shouldn't you enforce this? If not, by the time you reach the if()
> the values are stale.
>

It's expected that the user has exclusive tool-side lock on the domains
before issuing this hypercall and that the domains are paused already.


>
> > +            {
> > +                rcu_unlock_domain(cd);
> > +                rc = -EINVAL;
> > +                goto out;
> > +            }
> > +
> > +            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
> > +            if ( rc )
>
> With the boolean nature the use of "rc" here is rather confusing;
> I'd suggest avoiding the use of in intermediate variable in this case.
>

I don't see where the confusion is - rc indicates there is work left to do
and a hypercall continuation needs to be set up. I could move this block into
bulk_share itself.


>
> > +            {
> > +                if ( __copy_to_guest(arg, &mso, 1) )
> > +                    rc = -EFAULT;
> > +                else
> > +                    rc =
> hypercall_create_continuation(__HYPERVISOR_memory_op,
> > +                                                       "lh",
> XENMEM_sharing_op,
> > +                                                       arg);
> > +            }
> > +
> > +            rcu_unlock_domain(cd);
> > +        }
> > +        break;
> > +
> >          case XENMEM_sharing_op_debug_gfn:
> >          {
> >              unsigned long gfn = mso.u.debug.u.gfn;
> > diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> > index 320de91..0299746 100644
> > --- a/xen/include/public/memory.h
> > +++ b/xen/include/public/memory.h
> > @@ -447,6 +447,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
> >  #define XENMEM_sharing_op_debug_gref        5
> >  #define XENMEM_sharing_op_add_physmap       6
> >  #define XENMEM_sharing_op_audit             7
> > +#define XENMEM_sharing_op_bulk_share        8
> >
> >  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
> >  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
> > @@ -482,7 +483,13 @@ struct xen_mem_sharing_op {
> >              uint64_aligned_t client_gfn;    /* IN: the client gfn */
> >              uint64_aligned_t client_handle; /* IN: handle to the client
> page */
> >              domid_t  client_domain; /* IN: the client domain id */
> > -        } share;
> > +        } share;
> > +        struct mem_sharing_op_bulk_share {   /* OP_BULK_SHARE */
>
> Is the _share suffix really useful here? Even more, if you dropped
> it and renamed "shared" below to "done", the same structure could
> be used for a hypothetical bulk-unshare operation.
>

I don't really have a use-case for that at the moment and having it simply
as "bulk" is not specific enough IMHO.


>
> > +            domid_t client_domain;           /* IN: the client domain
> id */
> > +            uint64_aligned_t start;          /* IN: start gfn (normally
> 0) */
>
> Marking this as just IN implies that the value doesn't change across
> the operation.
>

In my book IN means it's used strictly only to pass input, and its value
may or may not be the same afterwards.


>
> > +            uint64_aligned_t shared;         /* OUT: the number of
> > +                                                successfully shared
> pages */
>
> For this to be correct, the value is required to be zero initialized by
> the caller at present. Either this needs to be documented, or you
> should zero the field on any non-continuation invocation.
>

Sure, I'll add a comment to that effect - the toolstack already memsets
the mso struct to zero, but I agree it's better to state that explicitly.

Tamas



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-09 13:26 ` Andrew Cooper
@ 2015-10-09 18:13   ` Tamas K Lengyel
  2015-10-12 10:22     ` Andrew Cooper
  0 siblings, 1 reply; 11+ messages in thread
From: Tamas K Lengyel @ 2015-10-09 18:13 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, Jan Beulich, Tamas K Lengyel, Xen-devel,
	Keir Fraser



On Fri, Oct 9, 2015 at 7:26 AM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 08/10/15 21:57, Tamas K Lengyel wrote:
> > diff --git a/xen/arch/x86/mm/mem_sharing.c
> b/xen/arch/x86/mm/mem_sharing.c
> > index a95e105..4cdddb1 100644
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
> >      return rc;
> >  }
> >
> > +static int bulk_share(struct domain *d, struct domain *cd, unsigned
> long max,
> > +                      struct mem_sharing_op_bulk_share *bulk)
> > +{
> > +    int rc = 0;
> > +    shr_handle_t sh, ch;
> > +
> > +    while( bulk->start <= max )
> > +    {
> > +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )
>
> This swallows the error from mem_sharing_nominate_page().  Some errors
> might be safe to ignore in this context, but ones like ENOMEM most
> certainly are not.
>
> You should record the error into rc and switch ( rc ) to ignore/process
> the error, passing hard errors straight up.
>

In my experience all errors here are safe to ignore. Yes, if an ENOMEM
condition presents itself the sharing will be incomplete (or even 0). There
isn't much the user can do about that other than killing another domain to
free up memory, which will happen anyway automatically when a clone domain
first hits an unshare operation. These conditions are better handled with
the memsharing event listener.


>
> > +            goto next;
> > +
> > +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0 )
> > +            goto next;
> > +
> > +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd,
> bulk->start, ch) )
> > +            ++(bulk->shared);
> > +
> > +next:
> > +        ++(bulk->start);
> > +
> > +        /* Check for continuation if it's not the last iteration. */
> > +        if ( bulk->start < max && hypercall_preempt_check() )
> > +        {
> > +            rc = 1;
>
> Using -ERESTART here allows...
>
> > +            break;
> > +        }
> > +    }
> > +
> > +    return rc;
> > +}
> > +
> >  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
> >  {
> >      int rc;
> > @@ -1467,6 +1498,59 @@ int
> mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
> >          }
> >          break;
> >
> > +        case XENMEM_sharing_op_bulk_share:
> > +        {
> > +            unsigned long max_sgfn, max_cgfn;
> > +            struct domain *cd;
> > +
> > +            rc = -EINVAL;
> > +            if ( !mem_sharing_enabled(d) )
> > +                goto out;
> > +
> > +            rc =
> rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> > +                                                   &cd);
> > +            if ( rc )
> > +                goto out;
> > +
> > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> > +            if ( rc )
> > +            {
> > +                rcu_unlock_domain(cd);
> > +                goto out;
> > +            }
> > +
> > +            if ( !mem_sharing_enabled(cd) )
> > +            {
> > +                rcu_unlock_domain(cd);
> > +                rc = -EINVAL;
> > +                goto out;
> > +            }
> > +
> > +            max_sgfn = domain_get_maximum_gpfn(d);
> > +            max_cgfn = domain_get_maximum_gpfn(cd);
> > +
> > +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
> > +            {
> > +                rcu_unlock_domain(cd);
> > +                rc = -EINVAL;
> > +                goto out;
> > +            }
> > +
> > +            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
> > +            if ( rc )
>
> ... this check to be selective.
>

Sure, but I don't have a use case for returning the error code to the user
from the nominate/sharing op. The reason is that returning the error
condition by itself is not very useful. Furthermore, the gfn at which
the operation failed would have to be passed back to allow for debugging. But
for debugging the user could just do this loop on his side, where he
would readily have this information. So I would rather keep this op as
simple as possible.


>
> > +            {
> > +                if ( __copy_to_guest(arg, &mso, 1) )
>
> This __copy_to_guest() needs to happen unconditionally before creating
> the continuation, as it contains the continuation information.
>
>
It does happen before setting up the continuation, and the continuation only
gets set up if this succeeds.


> It also needs to happen on the success path, so .shared is correct.
>

It does happen on the success path below for !rc for all sharing ops.

Tamas


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-09 17:55   ` Tamas K Lengyel
@ 2015-10-12  6:42     ` Jan Beulich
  2015-10-15 16:09       ` Tamas K Lengyel
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2015-10-12  6:42 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Tamas K Lengyel, Xen-devel,
	Keir Fraser

>>> On 09.10.15 at 19:55, <tamas@tklengyel.com> wrote:
> On Fri, Oct 9, 2015 at 1:51 AM, Jan Beulich <JBeulich@suse.com> wrote:
> 
>> >>> On 08.10.15 at 22:57, <tamas@tklengyel.com> wrote:
>> > --- a/xen/arch/x86/mm/mem_sharing.c
>> > +++ b/xen/arch/x86/mm/mem_sharing.c
>> > @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
>> >      return rc;
>> >  }
>> >
>> > +static int bulk_share(struct domain *d, struct domain *cd, unsigned
>> long max,
>> > +                      struct mem_sharing_op_bulk_share *bulk)
>> > +{
>> > +    int rc = 0;
>> > +    shr_handle_t sh, ch;
>> > +
>> > +    while( bulk->start <= max )
>> > +    {
>> > +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )
>> > +            goto next;
>> > +
>> > +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0 )
>> > +            goto next;
>> > +
>> > +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd,
>> bulk->start, ch) )
>> > +            ++(bulk->shared);
>>
>> Pointless parentheses.
>>
>>
> Pointless but harmless and I like this style better.

Well, it's code you're the maintainer for, so you know, but generally
I think parentheses should be used to clarify things, not to make
code harder to read (the effect is minimal here, but extended to a
more complex expression this may well matter).

>> > +        ++(bulk->start);
>> > +
>> > +        /* Check for continuation if it's not the last iteration. */
>> > +        if ( bulk->start < max && hypercall_preempt_check() )
>> > +        {
>> > +            rc = 1;
>> > +            break;
>>
>> This could simple be a return statement, avoiding the need for a
>> local variable (the type of which would need to be changed, see
>> below).
> 
> It's reused in the caller to indicate where the mso copyback happens and rc
> is of type int in the caller.

But you're actually abusing the int nature of rc in the caller to store
a boolean value. I'd really like to see this be done consistently -
either use another variable (or, as suggested, no intermediate
variable in the caller at all), or (also taking into consideration Andrew's
comments) propagate an actual -E value from here (along with not
losing errors).

>> > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
>> > +            if ( rc )
>> > +            {
>> > +                rcu_unlock_domain(cd);
>> > +                goto out;
>> > +            }
>> > +
>> > +            if ( !mem_sharing_enabled(cd) )
>> > +            {
>> > +                rcu_unlock_domain(cd);
>> > +                rc = -EINVAL;
>> > +                goto out;
>> > +            }
>> > +
>> > +            max_sgfn = domain_get_maximum_gpfn(d);
>> > +            max_cgfn = domain_get_maximum_gpfn(cd);
>> > +
>> > +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
>>
>> Are both domains required to be paused for this operation? If so,
>> shouldn't you enforce this? If not, by the time you reach the if()
>> the values are stale.
> 
> It's expected that the user has exclusive tool-side lock on the domains
> before issuing this hypercall and that the domains are paused already.

As said -  in that case, please enforce this (by bailing if not so).

>> > +            {
>> > +                rcu_unlock_domain(cd);
>> > +                rc = -EINVAL;
>> > +                goto out;
>> > +            }
>> > +
>> > +            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
>> > +            if ( rc )
>>
>> With the boolean nature the use of "rc" here is rather confusing;
>> I'd suggest avoiding the use of in intermediate variable in this case.
>>
> 
> I don't see where the confusion is - rc indicates there is work left to do
> and hypercall continuation needs to be setup. I could move this block into
> bulk_share itself.

Not sure that would help. The confusion, just to clarify, results from
storing two different kinds of values (-E error indicator vs boolean) in
the same variable.

>> > --- a/xen/include/public/memory.h
>> > +++ b/xen/include/public/memory.h
>> > @@ -447,6 +447,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>> >  #define XENMEM_sharing_op_debug_gref        5
>> >  #define XENMEM_sharing_op_add_physmap       6
>> >  #define XENMEM_sharing_op_audit             7
>> > +#define XENMEM_sharing_op_bulk_share        8
>> >
>> >  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>> >  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
>> > @@ -482,7 +483,13 @@ struct xen_mem_sharing_op {
>> >              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>> >              uint64_aligned_t client_handle; /* IN: handle to the client
>> page */
>> >              domid_t  client_domain; /* IN: the client domain id */
>> > -        } share;
>> > +        } share;
>> > +        struct mem_sharing_op_bulk_share {   /* OP_BULK_SHARE */
>>
>> Is the _share suffix really useful here? Even more, if you dropped
>> it and renamed "shared" below to "done", the same structure could
>> be used for a hypothetical bulk-unshare operation.
> 
> I don't really have a use-case for that at the moment and having it simply
> as "bulk" is not specific enough IMHO.

I tend to disagree; together with OP_BULK_SHARE the structure
name doesn't need to be mode-specific - as said, it can easily serve
for all kinds of bulk operations.

>> > +            domid_t client_domain;           /* IN: the client domain
>> id */
>> > +            uint64_aligned_t start;          /* IN: start gfn (normally
>> 0) */
>>
>> Marking this as just IN implies that the value doesn't change across
>> the operation.
> 
> In my book IN means it's used strictly only to pass input and its value
> may or may not be the same afterwards.

I think our annotations are pretty consistent about marking
modified fields as IN/OUT. (Otoh we don't normally modify fields
when their value is of no relevance to the caller.)

Jan


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-09 18:13   ` Tamas K Lengyel
@ 2015-10-12 10:22     ` Andrew Cooper
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2015-10-12 10:22 UTC (permalink / raw)
  To: xen-devel



On 09/10/15 19:13, Tamas K Lengyel wrote:
>
>
> On Fri, Oct 9, 2015 at 7:26 AM, Andrew Cooper
> <andrew.cooper3@citrix.com <mailto:andrew.cooper3@citrix.com>> wrote:
>
>     On 08/10/15 21:57, Tamas K Lengyel wrote:
>     > diff --git a/xen/arch/x86/mm/mem_sharing.c
>     b/xen/arch/x86/mm/mem_sharing.c
>     > index a95e105..4cdddb1 100644
>     > --- a/xen/arch/x86/mm/mem_sharing.c
>     > +++ b/xen/arch/x86/mm/mem_sharing.c
>     > @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
>     >      return rc;
>     >  }
>     >
>     > +static int bulk_share(struct domain *d, struct domain *cd,
>     unsigned long max,
>     > +                      struct mem_sharing_op_bulk_share *bulk)
>     > +{
>     > +    int rc = 0;
>     > +    shr_handle_t sh, ch;
>     > +
>     > +    while( bulk->start <= max )
>     > +    {
>     > +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh)
>     != 0 )
>
>     This swallows the error from mem_sharing_nominate_page().  Some errors
>     might be safe to ignore in this context, but ones like ENOMEM most
>     certainly are not.
>
>     You should record the error into rc and switch ( rc ) to
>     ignore/process
>     the error, passing hard errors straight up.
>
>
> In my experience all errors here are safe to ignore. Yes, if an ENOMEM
> condition presents itself the sharing will be incomplete (or even 0).
> There isn't much the user can do about that other than killing another
> domain to free up memory..

There is plenty which can be done, including ballooning down other
domains to make more memory available (which is an option XenServer will
take if the admin chooses).  The point is that it puts the decision in
the hands of the toolstack, but that can only happen when errors are
correctly propagated back to userspace.

> which will happen anyway automatically when a clone domain first hits
> an unshare operation. These conditions are better handled with the
> memsharing event listener.
>  
>
>
>     > +            goto next;
>     > +
>     > +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch)
>     != 0 )
>     > +            goto next;
>     > +
>     > +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd,
>     bulk->start, ch) )
>     > +            ++(bulk->shared);
>     > +
>     > +next:
>     > +        ++(bulk->start);
>     > +
>     > +        /* Check for continuation if it's not the last
>     iteration. */
>     > +        if ( bulk->start < max && hypercall_preempt_check() )
>     > +        {
>     > +            rc = 1;
>
>     Using -ERESTART here allows...
>
>     > +            break;
>     > +        }
>     > +    }
>     > +
>     > +    return rc;
>     > +}
>     > +
>     >  int
>     mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>     >  {
>     >      int rc;
>     > @@ -1467,6 +1498,59 @@ int
>     mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>     >          }
>     >          break;
>     >
>     > +        case XENMEM_sharing_op_bulk_share:
>     > +        {
>     > +            unsigned long max_sgfn, max_cgfn;
>     > +            struct domain *cd;
>     > +
>     > +            rc = -EINVAL;
>     > +            if ( !mem_sharing_enabled(d) )
>     > +                goto out;
>     > +
>     > +            rc =
>     rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
>     > +                                                   &cd);
>     > +            if ( rc )
>     > +                goto out;
>     > +
>     > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
>     > +            if ( rc )
>     > +            {
>     > +                rcu_unlock_domain(cd);
>     > +                goto out;
>     > +            }
>     > +
>     > +            if ( !mem_sharing_enabled(cd) )
>     > +            {
>     > +                rcu_unlock_domain(cd);
>     > +                rc = -EINVAL;
>     > +                goto out;
>     > +            }
>     > +
>     > +            max_sgfn = domain_get_maximum_gpfn(d);
>     > +            max_cgfn = domain_get_maximum_gpfn(cd);
>     > +
>     > +            if ( max_sgfn != max_cgfn || max_sgfn <
>     mso.u.bulk.start )
>     > +            {
>     > +                rcu_unlock_domain(cd);
>     > +                rc = -EINVAL;
>     > +                goto out;
>     > +            }
>     > +
>     > +            rc = bulk_share(d, cd, max_sgfn, &mso.u.bulk);
>     > +            if ( rc )
>
>     ... this check to be selective.
>
>
> Sure, but I don't have a use case for returning the error code to the
> user from the nominate/sharing op. The reason is that returning
> the error condition by itself is not very useful. Furthermore,
> the gfn at which the operation failed would have to be passed back to
> allow for debugging. But for debugging the user could just do this loop
> on his side, where he would readily have this information. So I would
> rather keep this op as simple as possible.
>  
>
>
>     > +            {
>     > +                if ( __copy_to_guest(arg, &mso, 1) )
>
>     This __copy_to_guest() needs to happen unconditionally before creating
>     the continuation, as it contains the continuation information.
>
>
> It does happen before setting up the continuation and the continuation
> only gets setup if this succeeds.
>  
>
>     It also needs to happen on the success path, so .shared is correct.
>
>
> It does happen on the success path below for !rc for all sharing ops.

Ah - so it does.  Sorry for the noise.

~Andrew


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-12  6:42     ` Jan Beulich
@ 2015-10-15 16:09       ` Tamas K Lengyel
  2015-10-15 16:54         ` Tamas K Lengyel
  0 siblings, 1 reply; 11+ messages in thread
From: Tamas K Lengyel @ 2015-10-15 16:09 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Tamas K Lengyel, Xen-devel,
	Keir Fraser


On Mon, Oct 12, 2015 at 12:42 AM, Jan Beulich <JBeulich@suse.com> wrote:

> >>> On 09.10.15 at 19:55, <tamas@tklengyel.com> wrote:
> > On Fri, Oct 9, 2015 at 1:51 AM, Jan Beulich <JBeulich@suse.com> wrote:
> >
> >> >>> On 08.10.15 at 22:57, <tamas@tklengyel.com> wrote:
> >> > --- a/xen/arch/x86/mm/mem_sharing.c
> >> > +++ b/xen/arch/x86/mm/mem_sharing.c
> >> > @@ -1293,6 +1293,37 @@ int relinquish_shared_pages(struct domain *d)
> >> >      return rc;
> >> >  }
> >> >
> >> > +static int bulk_share(struct domain *d, struct domain *cd, unsigned
> >> long max,
> >> > +                      struct mem_sharing_op_bulk_share *bulk)
> >> > +{
> >> > +    int rc = 0;
> >> > +    shr_handle_t sh, ch;
> >> > +
> >> > +    while( bulk->start <= max )
> >> > +    {
> >> > +        if ( mem_sharing_nominate_page(d, bulk->start, 0, &sh) != 0 )
> >> > +            goto next;
> >> > +
> >> > +        if ( mem_sharing_nominate_page(cd, bulk->start, 0, &ch) != 0
> )
> >> > +            goto next;
> >> > +
> >> > +        if ( !mem_sharing_share_pages(d, bulk->start, sh, cd,
> >> bulk->start, ch) )
> >> > +            ++(bulk->shared);
> >>
> >> Pointless parentheses.
> >>
> >>
> > Pointless but harmless and I like this style better.
>
> Well, it's code you're the maintainer for, so you know, but generally
> I think parentheses should be used to clarify things, not to make
> code harder to read (the effect is minimal here, but extended to a
> more complex expression this may well matter).
>

It might just be me, but that's exactly what the parentheses do here. I tend
to read operations left-to-right, and the ++ on the left is actually the
last operation performed, after the pointer dereference. The
parentheses make that explicit.


>
> >> > +        ++(bulk->start);
> >> > +
> >> > +        /* Check for continuation if it's not the last iteration. */
> >> > +        if ( bulk->start < max && hypercall_preempt_check() )
> >> > +        {
> >> > +            rc = 1;
> >> > +            break;
> >>
> >> This could simple be a return statement, avoiding the need for a
> >> local variable (the type of which would need to be changed, see
> >> below).
> >
> > It's reused in the caller to indicate where the mso copyback happens and
> rc
> > is of type int in the caller.
>
> But you're actually abusing the int nature of rc in the caller to store
> a boolean value. I'd really like to see this be done consistently -
> either use another variable (or, as suggested, no intermediate
> variable in the caller at all), or (also taking into consideration Andrew's
> comments) propagate an actual -E value from here (along with not
> losing errors).
>

So the way we return >0 values is copied from the hypercall continuation
handler of mem_access - there it returns the gfn value, which was the same
approach I used in the first version of this patch. Now that we store the
gfn value in the mso struct itself, this just returns an indicator that
there is more work to be done. Otherwise the logic and setup are the same. I'd
rather not diverge from mem_access as long as it does it similarly (returning a
positive value indicating more work is to be done).


>
> >> > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> >> > +            if ( rc )
> >> > +            {
> >> > +                rcu_unlock_domain(cd);
> >> > +                goto out;
> >> > +            }
> >> > +
> >> > +            if ( !mem_sharing_enabled(cd) )
> >> > +            {
> >> > +                rcu_unlock_domain(cd);
> >> > +                rc = -EINVAL;
> >> > +                goto out;
> >> > +            }
> >> > +
> >> > +            max_sgfn = domain_get_maximum_gpfn(d);
> >> > +            max_cgfn = domain_get_maximum_gpfn(cd);
> >> > +
> >> > +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start
> )
> >>
> >> Are both domains required to be paused for this operation? If so,
> >> shouldn't you enforce this? If not, by the time you reach the if()
> >> the values are stale.
> >
> > It's expected that the user has exclusive tool-side lock on the domains
> > before issuing this hypercall and that the domains are paused already.
>
> As said -  in that case, please enforce this (by bailing if not so).
>

Ack.


> >> > --- a/xen/include/public/memory.h
> >> > +++ b/xen/include/public/memory.h
> >> > @@ -447,6 +447,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
> >> >  #define XENMEM_sharing_op_debug_gref        5
> >> >  #define XENMEM_sharing_op_add_physmap       6
> >> >  #define XENMEM_sharing_op_audit             7
> >> > +#define XENMEM_sharing_op_bulk_share        8
> >> >
> >> >  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
> >> >  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
> >> > @@ -482,7 +483,13 @@ struct xen_mem_sharing_op {
> >> >              uint64_aligned_t client_gfn;    /* IN: the client gfn */
> >> >              uint64_aligned_t client_handle; /* IN: handle to the
> client
> >> page */
> >> >              domid_t  client_domain; /* IN: the client domain id */
> >> > -        } share;
> >> > +        } share;
> >> > +        struct mem_sharing_op_bulk_share {   /* OP_BULK_SHARE */
> >>
> >> Is the _share suffix really useful here? Even more, if you dropped
> >> it and renamed "shared" below to "done", the same structure could
> >> be used for a hypothetical bulk-unshare operation.
> >
> > I don't really have a use-case for that at the moment and having it
> simply
> > as "bulk" is not specific enough IMHO.
>
> I tend to disagree; together with OP_BULK_SHARE the structure
> name doesn't need to be mode-specific - as said, it can easily serve
> for all kinds of bulk operations.
>

Sure, I'll change it.


>
> >> > +            domid_t client_domain;           /* IN: the client domain
> >> id */
> >> > +            uint64_aligned_t start;          /* IN: start gfn
> (normally
> >> 0) */
> >>
> >> Marking this as just IN implies that the value doesn't change across
> >> the operation.
> >
> > In my book IN means it's used strictly only to pass input and its value
> > may or may not be the same afterwards.
>
> I think our annotations are pretty consistent about marking
> modified fields as IN/OUT. (Otoh we don't normally modify fields
> when their value is of no relevance to the caller.)
>

Making it IN/OUT would suggest it has a value that the user could need,
which is not the case. I can note in the header that the field is
used internally, that its value may not be the same after the hypercall
finishes, and that the value it changed to has no particular importance for the
caller.

Tamas


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-15 16:09       ` Tamas K Lengyel
@ 2015-10-15 16:54         ` Tamas K Lengyel
  2015-10-16  6:34           ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Tamas K Lengyel @ 2015-10-15 16:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Tamas K Lengyel, Xen-devel,
	Keir Fraser


>
>
> >> > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> >> > +            if ( rc )
> >> > +            {
> >> > +                rcu_unlock_domain(cd);
> >> > +                goto out;
> >> > +            }
> >> > +
> >> > +            if ( !mem_sharing_enabled(cd) )
> >> > +            {
> >> > +                rcu_unlock_domain(cd);
> >> > +                rc = -EINVAL;
> >> > +                goto out;
> >> > +            }
> >> > +
> >> > +            max_sgfn = domain_get_maximum_gpfn(d);
> >> > +            max_cgfn = domain_get_maximum_gpfn(cd);
> >> > +
> >> > +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start
> )
> >>
> >> Are both domains required to be paused for this operation? If so,
> >> shouldn't you enforce this? If not, by the time you reach the if()
> >> the values are stale.
> >
> > It's expected that the user has exclusive tool-side lock on the domains
> > before issuing this hypercall and that the domains are paused already.
>
> As said -  in that case, please enforce this (by bailing if not so).
>

Is there a convenient way to check if a domain is already paused? I can't
seem to find anything to that effect.

Thanks,
Tamas


* Re: [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
  2015-10-15 16:54         ` Tamas K Lengyel
@ 2015-10-16  6:34           ` Jan Beulich
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2015-10-16  6:34 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Tamas K Lengyel, Xen-devel,
	Keir Fraser

>>> On 15.10.15 at 18:54, <tamas@tklengyel.com> wrote:
>> 
>>
>> >> > +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
>> >> > +            if ( rc )
>> >> > +            {
>> >> > +                rcu_unlock_domain(cd);
>> >> > +                goto out;
>> >> > +            }
>> >> > +
>> >> > +            if ( !mem_sharing_enabled(cd) )
>> >> > +            {
>> >> > +                rcu_unlock_domain(cd);
>> >> > +                rc = -EINVAL;
>> >> > +                goto out;
>> >> > +            }
>> >> > +
>> >> > +            max_sgfn = domain_get_maximum_gpfn(d);
>> >> > +            max_cgfn = domain_get_maximum_gpfn(cd);
>> >> > +
>> >> > +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
>> >>
>> >> Are both domains required to be paused for this operation? If so,
>> >> shouldn't you enforce this? If not, by the time you reach the if()
>> >> the values are stale.
>> >
>> > It's expected that the user has exclusive tool-side lock on the domains
>> > before issuing this hypercall and that the domains are paused already.
>>
>> As said -  in that case, please enforce this (by bailing if not so).
> 
> Is there a convenient way to check if a domain is already paused? I can't
> seem to find anything of that effect.

I notice that you already sent out v3, but you check ->pause_count
there. You really want to look at controller_pause_count, as Xen-internal
pausing is something that you have to expect to change
behind your back (controller-based pausing could change too, of course,
but that would be a tool stack bug then).

Jan


end of thread, other threads:[~2015-10-16  6:34 UTC | newest]

Thread overview: 11+ messages
-- links below jump to the message on this page --
2015-10-08 20:57 [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Tamas K Lengyel
2015-10-08 20:57 ` [PATCH v2 2/2] tests/mem-sharing: Add bulk option to memshrtool Tamas K Lengyel
2015-10-09  7:51 ` [PATCH v2 1/2] x86/mem-sharing: Bulk mem-sharing entire domains Jan Beulich
2015-10-09 17:55   ` Tamas K Lengyel
2015-10-12  6:42     ` Jan Beulich
2015-10-15 16:09       ` Tamas K Lengyel
2015-10-15 16:54         ` Tamas K Lengyel
2015-10-16  6:34           ` Jan Beulich
2015-10-09 13:26 ` Andrew Cooper
2015-10-09 18:13   ` Tamas K Lengyel
2015-10-12 10:22     ` Andrew Cooper
