xen-devel.lists.xenproject.org archive mirror
* [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
@ 2016-07-20 18:01 Tamas K Lengyel
  2016-07-26  9:12 ` George Dunlap
  2016-08-01 11:12 ` Jan Beulich
  0 siblings, 2 replies; 8+ messages in thread
From: Tamas K Lengyel @ 2016-07-20 18:01 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Tamas K Lengyel, Ian Jackson, Jan Beulich, Andrew Cooper

Currently mem-sharing can be performed on a page-by-page basis from the control
domain. However, this process is quite wasteful when a range of pages has to
be deduplicated.

This patch introduces a new mem_sharing memop for range sharing where
the user doesn't have to separately nominate each page in both the source and
destination domain, and the looping over all pages happens in the hypervisor.
This significantly reduces the overhead of sharing a range of memory.
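
For illustration only (not part of this patch), a minimal caller of the new
libxc wrapper might look roughly like the sketch below. The dedup_range()
helper, the domids and the gfn range are made-up placeholders; enabling
sharing and pausing both domains mirrors the checks the hypervisor side of
this patch performs before walking the range:

    #include <stdio.h>
    #include <xenctrl.h>

    /* Sketch: deduplicate gfns [start, end] of two already-built domains. */
    static int dedup_range(domid_t src, domid_t cli,
                           uint64_t start, uint64_t end)
    {
        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        int rc;

        if ( !xch )
            return -1;

        /* Sharing must be enabled and both domains paused, or the memop
         * fails with -EINVAL. */
        xc_memshr_control(xch, src, 1);
        xc_memshr_control(xch, cli, 1);
        xc_domain_pause(xch, src);
        xc_domain_pause(xch, cli);

        rc = xc_memshr_range_share(xch, src, cli, start, end);
        if ( rc < 0 )
            perror("xc_memshr_range_share");

        xc_domain_unpause(xch, cli);
        xc_domain_unpause(xch, src);
        xc_interface_close(xch);
        return rc;
    }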

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

v8: style fixes and minor adjustments
---
 tools/libxc/include/xenctrl.h        |  15 ++++
 tools/libxc/xc_memshr.c              |  19 +++++
 tools/tests/mem-sharing/memshrtool.c |  22 ++++++
 xen/arch/x86/mm/mem_sharing.c        | 140 +++++++++++++++++++++++++++++++++++
 xen/include/public/memory.h          |  10 ++-
 5 files changed, 205 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index e904bd5..3782eff 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
                     domid_t client_domain,
                     unsigned long client_gfn);
 
+/* Allows deduplicating a range of memory of a client domain. Using this
+ * function is equivalent to calling xc_memshr_nominate_gfn for each gfn
+ * in the two domains followed by xc_memshr_share_gfns.
+ *
+ * May fail with -EINVAL if the source and client domains have different
+ * memory sizes or if memory sharing is not enabled on either of the domains.
+ * May also fail with -ENOMEM if there isn't enough memory available to store
+ * the sharing metadata before deduplication can happen.
+ */
+int xc_memshr_range_share(xc_interface *xch,
+                          domid_t source_domain,
+                          domid_t client_domain,
+                          uint64_t start,
+                          uint64_t end);
+
 /* Debug calls: return the number of pages referencing the shared frame backing
  * the input argument. Should be one or greater. 
  *
diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
index deb0aa4..2b871c7 100644
--- a/tools/libxc/xc_memshr.c
+++ b/tools/libxc/xc_memshr.c
@@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
     return xc_memshr_memop(xch, source_domain, &mso);
 }
 
+int xc_memshr_range_share(xc_interface *xch,
+                          domid_t source_domain,
+                          domid_t client_domain,
+                          uint64_t start,
+                          uint64_t end)
+{
+    xen_mem_sharing_op_t mso;
+
+    memset(&mso, 0, sizeof(mso));
+
+    mso.op = XENMEM_sharing_op_range_share;
+
+    mso.u.range.client_domain = client_domain;
+    mso.u.range.start = start;
+    mso.u.range.end = end;
+
+    return xc_memshr_memop(xch, source_domain, &mso);
+}
+
 int xc_memshr_domain_resume(xc_interface *xch,
                             domid_t domid)
 {
diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
index 437c7c9..2af6a9e 100644
--- a/tools/tests/mem-sharing/memshrtool.c
+++ b/tools/tests/mem-sharing/memshrtool.c
@@ -24,6 +24,8 @@ static int usage(const char* prog)
     printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
     printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
     printf("                          - Share two pages.\n");
+    printf("  range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
+    printf("                          - Share pages between domains in range.\n");
     printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
     printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
     printf("                          - Populate a page in a domain with a shared page.\n");
@@ -180,6 +182,26 @@ int main(int argc, const char** argv)
         }
         printf("Audit returned %d errors.\n", rc);
     }
+    else if( !strcasecmp(cmd, "range") )
+    {
+        domid_t sdomid, cdomid;
+        int rc;
+        uint64_t start, end;
+
+        if ( argc != 6 )
+            return usage(argv[0]);
 
+        sdomid = strtol(argv[2], NULL, 0);
+        cdomid = strtol(argv[3], NULL, 0);
+        start = strtoul(argv[4], NULL, 0);
+        end = strtoul(argv[5], NULL, 0);
+
+        rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
+        if ( rc < 0 )
+        {
+            printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
+            return rc;
+        }
+    }
     return 0;
 }
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index a522423..329fbd9 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
     return rc;
 }
 
+static int range_share(struct domain *d, struct domain *cd,
+                       struct mem_sharing_op_range *range)
+{
+    int rc = 0;
+    shr_handle_t sh, ch;
+    unsigned long start = range->_scratchspace ?: range->start;
+
+    while( range->end >= start )
+    {
+        /*
+         * We only break out if we run out of memory as individual pages may
+         * legitimately be unsharable and we just want to skip over those.
+         */
+        rc = mem_sharing_nominate_page(d, start, 0, &sh);
+        if ( rc == -ENOMEM )
+            break;
+
+        if ( !rc )
+        {
+            rc = mem_sharing_nominate_page(cd, start, 0, &ch);
+            if ( rc == -ENOMEM )
+                break;
+
+            if ( !rc )
+            {
+                /* If we get here this should be guaranteed to succeed. */
+                rc = mem_sharing_share_pages(d, start, sh,
+                                             cd, start, ch);
+                ASSERT(!rc);
+            }
+        }
+
+        /* Check for continuation if it's not the last iteration. */
+        if ( range->end >= ++start && hypercall_preempt_check() )
+        {
+            rc = 1;
+            break;
+        }
+    }
+
+    range->_scratchspace = start;
+
+    /*
+     * The last page may fail with -EINVAL, and for range sharing we don't
+     * care about that.
+     */
+    if ( range->end < start && rc == -EINVAL )
+        rc = 0;
+
+    return rc;
+}
+
 int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
 {
     int rc;
@@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
         }
         break;
 
+        case XENMEM_sharing_op_range_share:
+        {
+            unsigned long max_sgfn, max_cgfn;
+            struct domain *cd;
+
+            rc = -EINVAL;
+            if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
+                 mso.u.range._pad[2] )
+                 goto out;
+
+            /*
+             * We use _scratchspace for the hypercall continuation value.
+             * Ideally the user sets this to 0 in the beginning but
+             * there is no good way of enforcing that here, so we just check
+             * that it's at least in range.
+             */
+            if ( mso.u.range._scratchspace &&
+                 (mso.u.range._scratchspace < mso.u.range.start ||
+                  mso.u.range._scratchspace > mso.u.range.end) )
+                goto out;
+
+            if ( !mem_sharing_enabled(d) )
+                goto out;
+
+            rc = rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain,
+                                                   &cd);
+            if ( rc )
+                goto out;
+
+            /*
+             * We reuse XENMEM_sharing_op_share XSM check here as this is
+             * essentially the same concept repeated over multiple pages.
+             */
+            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd,
+                                    XENMEM_sharing_op_share);
+            if ( rc )
+            {
+                rcu_unlock_domain(cd);
+                goto out;
+            }
+
+            if ( !mem_sharing_enabled(cd) )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
+
+            /*
+             * Sanity check only, the client should keep the domains paused for
+             * the duration of this op.
+             */
+            if ( !atomic_read(&d->pause_count) ||
+                 !atomic_read(&cd->pause_count) )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
+
+            max_sgfn = domain_get_maximum_gpfn(d);
+            max_cgfn = domain_get_maximum_gpfn(cd);
+
+            if ( max_sgfn < mso.u.range.start || max_sgfn < mso.u.range.end ||
+                 max_cgfn < mso.u.range.start || max_cgfn < mso.u.range.end )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
+
+            rc = range_share(d, cd, &mso.u.range);
+            rcu_unlock_domain(cd);
+
+            if ( rc > 0 )
+            {
+                if ( __copy_to_guest(arg, &mso, 1) )
+                    rc = -EFAULT;
+                else
+                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
+                                                       "lh", XENMEM_sharing_op,
+                                                       arg);
+            }
+            else
+                mso.u.range._scratchspace = 0;
+        }
+        break;
+
         case XENMEM_sharing_op_debug_gfn:
         {
             unsigned long gfn = mso.u.debug.u.gfn;
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 29ec571..e0bc018 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
 #define XENMEM_sharing_op_debug_gref        5
 #define XENMEM_sharing_op_add_physmap       6
 #define XENMEM_sharing_op_audit             7
+#define XENMEM_sharing_op_range_share       8
 
 #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
 #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
@@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
             uint64_aligned_t client_gfn;    /* IN: the client gfn */
             uint64_aligned_t client_handle; /* IN: handle to the client page */
             domid_t  client_domain; /* IN: the client domain id */
-        } share; 
+        } share;
+        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
+            uint64_aligned_t start;          /* IN: start gfn. */
+            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
+            uint64_aligned_t _scratchspace;  /* Must be set to 0 */
+            domid_t client_domain;           /* IN: the client domain id */
+            uint16_t _pad[3];                /* Must be set to 0 */
+        } range;
         struct mem_sharing_op_debug {     /* OP_DEBUG_xxx */
             union {
                 uint64_aligned_t gfn;      /* IN: gfn to debug          */
-- 
2.8.1



* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-20 18:01 [PATCH v8] x86/mem-sharing: mem-sharing a range of memory Tamas K Lengyel
@ 2016-07-26  9:12 ` George Dunlap
  2016-07-26 15:22   ` Tamas K Lengyel
  2016-08-01 11:12 ` Jan Beulich
  1 sibling, 1 reply; 8+ messages in thread
From: George Dunlap @ 2016-07-26  9:12 UTC (permalink / raw)
  To: Tamas K Lengyel; +Cc: xen-devel, Ian Jackson, Jan Beulich, Andrew Cooper

On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
<tamas.lengyel@zentific.com> wrote:
> Currently mem-sharing can be performed on a page-by-page basis from the control
> domain. However, this process is quite wasteful when a range of pages have to
> be deduplicated.
>
> This patch introduces a new mem_sharing memop for range sharing where
> the user doesn't have to separately nominate each page in both the source and
> destination domain, and the looping over all pages happen in the hypervisor.
> This significantly reduces the overhead of sharing a range of memory.
>
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>
> v8: style fixes and minor adjustments
> ---
>  tools/libxc/include/xenctrl.h        |  15 ++++
>  tools/libxc/xc_memshr.c              |  19 +++++
>  tools/tests/mem-sharing/memshrtool.c |  22 ++++++
>  xen/arch/x86/mm/mem_sharing.c        | 140 +++++++++++++++++++++++++++++++++++
>  xen/include/public/memory.h          |  10 ++-
>  5 files changed, 205 insertions(+), 1 deletion(-)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index e904bd5..3782eff 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>                      domid_t client_domain,
>                      unsigned long client_gfn);
>
> +/* Allows to deduplicate a range of memory of a client domain. Using
> + * this function is equivalent of calling xc_memshr_nominate_gfn for each gfn
> + * in the two domains followed by xc_memshr_share_gfns.
> + *
> + * May fail with -EINVAL if the source and client domain have different
> + * memory size or if memory sharing is not enabled on either of the domains.
> + * May also fail with -ENOMEM if there isn't enough memory available to store
> + * the sharing metadata before deduplication can happen.
> + */
> +int xc_memshr_range_share(xc_interface *xch,
> +                          domid_t source_domain,
> +                          domid_t client_domain,
> +                          uint64_t start,
> +                          uint64_t end);
> +
>  /* Debug calls: return the number of pages referencing the shared frame backing
>   * the input argument. Should be one or greater.
>   *
> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
> index deb0aa4..2b871c7 100644
> --- a/tools/libxc/xc_memshr.c
> +++ b/tools/libxc/xc_memshr.c
> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>      return xc_memshr_memop(xch, source_domain, &mso);
>  }
>
> +int xc_memshr_range_share(xc_interface *xch,
> +                          domid_t source_domain,
> +                          domid_t client_domain,
> +                          uint64_t start,
> +                          uint64_t end)
> +{
> +    xen_mem_sharing_op_t mso;
> +
> +    memset(&mso, 0, sizeof(mso));
> +
> +    mso.op = XENMEM_sharing_op_range_share;
> +
> +    mso.u.range.client_domain = client_domain;
> +    mso.u.range.start = start;
> +    mso.u.range.end = end;
> +
> +    return xc_memshr_memop(xch, source_domain, &mso);
> +}
> +
>  int xc_memshr_domain_resume(xc_interface *xch,
>                              domid_t domid)
>  {
> diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
> index 437c7c9..2af6a9e 100644
> --- a/tools/tests/mem-sharing/memshrtool.c
> +++ b/tools/tests/mem-sharing/memshrtool.c
> @@ -24,6 +24,8 @@ static int usage(const char* prog)
>      printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
>      printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
>      printf("                          - Share two pages.\n");
> +    printf("  range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
> +    printf("                          - Share pages between domains in range.\n");
>      printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
>      printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
>      printf("                          - Populate a page in a domain with a shared page.\n");
> @@ -180,6 +182,26 @@ int main(int argc, const char** argv)
>          }
>          printf("Audit returned %d errors.\n", rc);
>      }
> +    else if( !strcasecmp(cmd, "range") )
> +    {
> +        domid_t sdomid, cdomid;
> +        int rc;
> +        uint64_t start, end;
> +
> +        if ( argc != 6 )
> +            return usage(argv[0]);
>
> +        sdomid = strtol(argv[2], NULL, 0);
> +        cdomid = strtol(argv[3], NULL, 0);
> +        start = strtoul(argv[4], NULL, 0);
> +        end = strtoul(argv[5], NULL, 0);
> +
> +        rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
> +        if ( rc < 0 )
> +        {
> +            printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
> +            return rc;
> +        }
> +    }
>      return 0;
>  }
> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> index a522423..329fbd9 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
>      return rc;
>  }
>
> +static int range_share(struct domain *d, struct domain *cd,
> +                       struct mem_sharing_op_range *range)
> +{
> +    int rc = 0;
> +    shr_handle_t sh, ch;
> +    unsigned long start = range->_scratchspace ?: range->start;
> +
> +    while( range->end >= start )
> +    {
> +        /*
> +         * We only break out if we run out of memory as individual pages may
> +         * legitimately be unsharable and we just want to skip over those.
> +         */
> +        rc = mem_sharing_nominate_page(d, start, 0, &sh);
> +        if ( rc == -ENOMEM )
> +            break;
> +
> +        if ( !rc )
> +        {
> +            rc = mem_sharing_nominate_page(cd, start, 0, &ch);
> +            if ( rc == -ENOMEM )
> +                break;
> +
> +            if ( !rc )
> +            {
> +                /* If we get here this should be guaranteed to succeed. */
> +                rc = mem_sharing_share_pages(d, start, sh,
> +                                             cd, start, ch);
> +                ASSERT(!rc);
> +            }
> +        }
> +
> +        /* Check for continuation if it's not the last iteration. */
> +        if ( range->end >= ++start && hypercall_preempt_check() )
> +        {
> +            rc = 1;
> +            break;
> +        }
> +    }
> +
> +    range->_scratchspace = start;
> +
> +    /*
> +     * The last page may fail with -EINVAL, and for range sharing we don't
> +     * care about that.
> +     */
> +    if ( range->end < start && rc == -EINVAL )
> +        rc = 0;
> +
> +    return rc;
> +}
> +
>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>  {
>      int rc;
> @@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>          }
>          break;
>
> +        case XENMEM_sharing_op_range_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
> +                 mso.u.range._pad[2] )
> +                 goto out;
> +
> +            /*
> +             * We use _scratchscape for the hypercall continuation value.
> +             * Ideally the user sets this to 0 in the beginning but
> +             * there is no good way of enforcing that here, so we just check
> +             * that it's at least in range.
> +             */
> +            if ( mso.u.range._scratchspace &&
> +                 (mso.u.range._scratchspace < mso.u.range.start ||
> +                  mso.u.range._scratchspace > mso.u.range.end) )
> +                goto out;
> +
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;
> +
> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            /*
> +             * We reuse XENMEM_sharing_op_share XSM check here as this is
> +             * essentially the same concept repeated over multiple pages.
> +             */
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd,
> +                                    XENMEM_sharing_op_share);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            /*
> +             * Sanity check only, the client should keep the domains paused for
> +             * the duration of this op.
> +             */
> +            if ( !atomic_read(&d->pause_count) ||
> +                 !atomic_read(&cd->pause_count) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn < mso.u.range.start || max_sgfn < mso.u.range.end ||
> +                 max_cgfn < mso.u.range.start || max_cgfn < mso.u.range.end )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            rc = range_share(d, cd, &mso.u.range);
> +            rcu_unlock_domain(cd);
> +
> +            if ( rc > 0 )
> +            {
> +                if ( __copy_to_guest(arg, &mso, 1) )
> +                    rc = -EFAULT;
> +                else
> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
> +                                                       "lh", XENMEM_sharing_op,
> +                                                       arg);
> +            }
> +            else
> +                mso.u.range._scratchspace = 0;
> +        }
> +        break;
> +
>          case XENMEM_sharing_op_debug_gfn:
>          {
>              unsigned long gfn = mso.u.debug.u.gfn;
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 29ec571..e0bc018 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>  #define XENMEM_sharing_op_debug_gref        5
>  #define XENMEM_sharing_op_add_physmap       6
>  #define XENMEM_sharing_op_audit             7
> +#define XENMEM_sharing_op_range_share       8
>
>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>              domid_t  client_domain; /* IN: the client domain id */
> -        } share;
> +        } share;
> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
> +            uint64_aligned_t start;          /* IN: start gfn. */
> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */

Tamas,

Why include this "scratchspace" that's not used in the interface,
rather than just doing what the memory operations in
xen/common/memory.c do, and storing it in EAX by shifting it over by
MEMOP_EXTENT_SHIFT?  I looked through the history and I see that v1
did something very much like that, but it was changed to using the
scratch space without any explanation.

Having this in the public interface is ugly; and what's worse, it
exposes some of the internal mechanism to the guest.

 -George
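
For reference, the xen/common/memory.c continuation style George describes
packs the progress counter into the hypercall's cmd argument rather than into
a guest-visible field. A rough sketch (purely illustrative, and not what the
patch above does; start_gfn is a placeholder name) would be:

    /* On preemption: fold the resume gfn into the memop cmd value. */
    rc = hypercall_create_continuation(
             __HYPERVISOR_memory_op, "lh",
             XENMEM_sharing_op | (start_gfn << MEMOP_EXTENT_SHIFT), arg);

    /* On re-entry (in do_memory_op()): recover it from cmd, with no
     * guest-visible scratch field needed. */
    unsigned long start_gfn = cmd >> MEMOP_EXTENT_SHIFT;
    unsigned long op        = cmd & ((1UL << MEMOP_EXTENT_SHIFT) - 1);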


* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-26  9:12 ` George Dunlap
@ 2016-07-26 15:22   ` Tamas K Lengyel
  2016-07-26 15:49     ` George Dunlap
  0 siblings, 1 reply; 8+ messages in thread
From: Tamas K Lengyel @ 2016-07-26 15:22 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Ian Jackson, Jan Beulich, Andrew Cooper

On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@citrix.com> wrote:
> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
> <tamas.lengyel@zentific.com> wrote:
>> Currently mem-sharing can be performed on a page-by-page basis from the control
>> domain. However, this process is quite wasteful when a range of pages have to
>> be deduplicated.
>>
>> This patch introduces a new mem_sharing memop for range sharing where
>> the user doesn't have to separately nominate each page in both the source and
>> destination domain, and the looping over all pages happen in the hypervisor.
>> This significantly reduces the overhead of sharing a range of memory.
>>
>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> v8: style fixes and minor adjustments
>> ---
>>  tools/libxc/include/xenctrl.h        |  15 ++++
>>  tools/libxc/xc_memshr.c              |  19 +++++
>>  tools/tests/mem-sharing/memshrtool.c |  22 ++++++
>>  xen/arch/x86/mm/mem_sharing.c        | 140 +++++++++++++++++++++++++++++++++++
>>  xen/include/public/memory.h          |  10 ++-
>>  5 files changed, 205 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>> index e904bd5..3782eff 100644
>> --- a/tools/libxc/include/xenctrl.h
>> +++ b/tools/libxc/include/xenctrl.h
>> @@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>                      domid_t client_domain,
>>                      unsigned long client_gfn);
>>
>> +/* Allows to deduplicate a range of memory of a client domain. Using
>> + * this function is equivalent of calling xc_memshr_nominate_gfn for each gfn
>> + * in the two domains followed by xc_memshr_share_gfns.
>> + *
>> + * May fail with -EINVAL if the source and client domain have different
>> + * memory size or if memory sharing is not enabled on either of the domains.
>> + * May also fail with -ENOMEM if there isn't enough memory available to store
>> + * the sharing metadata before deduplication can happen.
>> + */
>> +int xc_memshr_range_share(xc_interface *xch,
>> +                          domid_t source_domain,
>> +                          domid_t client_domain,
>> +                          uint64_t start,
>> +                          uint64_t end);
>> +
>>  /* Debug calls: return the number of pages referencing the shared frame backing
>>   * the input argument. Should be one or greater.
>>   *
>> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
>> index deb0aa4..2b871c7 100644
>> --- a/tools/libxc/xc_memshr.c
>> +++ b/tools/libxc/xc_memshr.c
>> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>      return xc_memshr_memop(xch, source_domain, &mso);
>>  }
>>
>> +int xc_memshr_range_share(xc_interface *xch,
>> +                          domid_t source_domain,
>> +                          domid_t client_domain,
>> +                          uint64_t start,
>> +                          uint64_t end)
>> +{
>> +    xen_mem_sharing_op_t mso;
>> +
>> +    memset(&mso, 0, sizeof(mso));
>> +
>> +    mso.op = XENMEM_sharing_op_range_share;
>> +
>> +    mso.u.range.client_domain = client_domain;
>> +    mso.u.range.start = start;
>> +    mso.u.range.end = end;
>> +
>> +    return xc_memshr_memop(xch, source_domain, &mso);
>> +}
>> +
>>  int xc_memshr_domain_resume(xc_interface *xch,
>>                              domid_t domid)
>>  {
>> diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
>> index 437c7c9..2af6a9e 100644
>> --- a/tools/tests/mem-sharing/memshrtool.c
>> +++ b/tools/tests/mem-sharing/memshrtool.c
>> @@ -24,6 +24,8 @@ static int usage(const char* prog)
>>      printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
>>      printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
>>      printf("                          - Share two pages.\n");
>> +    printf("  range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
>> +    printf("                          - Share pages between domains in range.\n");
>>      printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
>>      printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
>>      printf("                          - Populate a page in a domain with a shared page.\n");
>> @@ -180,6 +182,26 @@ int main(int argc, const char** argv)
>>          }
>>          printf("Audit returned %d errors.\n", rc);
>>      }
>> +    else if( !strcasecmp(cmd, "range") )
>> +    {
>> +        domid_t sdomid, cdomid;
>> +        int rc;
>> +        uint64_t start, end;
>> +
>> +        if ( argc != 6 )
>> +            return usage(argv[0]);
>>
>> +        sdomid = strtol(argv[2], NULL, 0);
>> +        cdomid = strtol(argv[3], NULL, 0);
>> +        start = strtoul(argv[4], NULL, 0);
>> +        end = strtoul(argv[5], NULL, 0);
>> +
>> +        rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
>> +        if ( rc < 0 )
>> +        {
>> +            printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
>> +            return rc;
>> +        }
>> +    }
>>      return 0;
>>  }
>> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
>> index a522423..329fbd9 100644
>> --- a/xen/arch/x86/mm/mem_sharing.c
>> +++ b/xen/arch/x86/mm/mem_sharing.c
>> @@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
>>      return rc;
>>  }
>>
>> +static int range_share(struct domain *d, struct domain *cd,
>> +                       struct mem_sharing_op_range *range)
>> +{
>> +    int rc = 0;
>> +    shr_handle_t sh, ch;
>> +    unsigned long start = range->_scratchspace ?: range->start;
>> +
>> +    while( range->end >= start )
>> +    {
>> +        /*
>> +         * We only break out if we run out of memory as individual pages may
>> +         * legitimately be unsharable and we just want to skip over those.
>> +         */
>> +        rc = mem_sharing_nominate_page(d, start, 0, &sh);
>> +        if ( rc == -ENOMEM )
>> +            break;
>> +
>> +        if ( !rc )
>> +        {
>> +            rc = mem_sharing_nominate_page(cd, start, 0, &ch);
>> +            if ( rc == -ENOMEM )
>> +                break;
>> +
>> +            if ( !rc )
>> +            {
>> +                /* If we get here this should be guaranteed to succeed. */
>> +                rc = mem_sharing_share_pages(d, start, sh,
>> +                                             cd, start, ch);
>> +                ASSERT(!rc);
>> +            }
>> +        }
>> +
>> +        /* Check for continuation if it's not the last iteration. */
>> +        if ( range->end >= ++start && hypercall_preempt_check() )
>> +        {
>> +            rc = 1;
>> +            break;
>> +        }
>> +    }
>> +
>> +    range->_scratchspace = start;
>> +
>> +    /*
>> +     * The last page may fail with -EINVAL, and for range sharing we don't
>> +     * care about that.
>> +     */
>> +    if ( range->end < start && rc == -EINVAL )
>> +        rc = 0;
>> +
>> +    return rc;
>> +}
>> +
>>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>  {
>>      int rc;
>> @@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>          }
>>          break;
>>
>> +        case XENMEM_sharing_op_range_share:
>> +        {
>> +            unsigned long max_sgfn, max_cgfn;
>> +            struct domain *cd;
>> +
>> +            rc = -EINVAL;
>> +            if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
>> +                 mso.u.range._pad[2] )
>> +                 goto out;
>> +
>> +            /*
>> +             * We use _scratchscape for the hypercall continuation value.
>> +             * Ideally the user sets this to 0 in the beginning but
>> +             * there is no good way of enforcing that here, so we just check
>> +             * that it's at least in range.
>> +             */
>> +            if ( mso.u.range._scratchspace &&
>> +                 (mso.u.range._scratchspace < mso.u.range.start ||
>> +                  mso.u.range._scratchspace > mso.u.range.end) )
>> +                goto out;
>> +
>> +            if ( !mem_sharing_enabled(d) )
>> +                goto out;
>> +
>> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain,
>> +                                                   &cd);
>> +            if ( rc )
>> +                goto out;
>> +
>> +            /*
>> +             * We reuse XENMEM_sharing_op_share XSM check here as this is
>> +             * essentially the same concept repeated over multiple pages.
>> +             */
>> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd,
>> +                                    XENMEM_sharing_op_share);
>> +            if ( rc )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                goto out;
>> +            }
>> +
>> +            if ( !mem_sharing_enabled(cd) )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                rc = -EINVAL;
>> +                goto out;
>> +            }
>> +
>> +            /*
>> +             * Sanity check only, the client should keep the domains paused for
>> +             * the duration of this op.
>> +             */
>> +            if ( !atomic_read(&d->pause_count) ||
>> +                 !atomic_read(&cd->pause_count) )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                rc = -EINVAL;
>> +                goto out;
>> +            }
>> +
>> +            max_sgfn = domain_get_maximum_gpfn(d);
>> +            max_cgfn = domain_get_maximum_gpfn(cd);
>> +
>> +            if ( max_sgfn < mso.u.range.start || max_sgfn < mso.u.range.end ||
>> +                 max_cgfn < mso.u.range.start || max_cgfn < mso.u.range.end )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                rc = -EINVAL;
>> +                goto out;
>> +            }
>> +
>> +            rc = range_share(d, cd, &mso.u.range);
>> +            rcu_unlock_domain(cd);
>> +
>> +            if ( rc > 0 )
>> +            {
>> +                if ( __copy_to_guest(arg, &mso, 1) )
>> +                    rc = -EFAULT;
>> +                else
>> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
>> +                                                       "lh", XENMEM_sharing_op,
>> +                                                       arg);
>> +            }
>> +            else
>> +                mso.u.range._scratchspace = 0;
>> +        }
>> +        break;
>> +
>>          case XENMEM_sharing_op_debug_gfn:
>>          {
>>              unsigned long gfn = mso.u.debug.u.gfn;
>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
>> index 29ec571..e0bc018 100644
>> --- a/xen/include/public/memory.h
>> +++ b/xen/include/public/memory.h
>> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>>  #define XENMEM_sharing_op_debug_gref        5
>>  #define XENMEM_sharing_op_add_physmap       6
>>  #define XENMEM_sharing_op_audit             7
>> +#define XENMEM_sharing_op_range_share       8
>>
>>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
>> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
>>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>>              domid_t  client_domain; /* IN: the client domain id */
>> -        } share;
>> +        } share;
>> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
>> +            uint64_aligned_t start;          /* IN: start gfn. */
>> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
>> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */
>
> Tamas,
>
> Why include this "scratchspace" that's not used in the interface,
> rather than just doing what the memory operations in
> xen/common/memory.c do, and store it in EAX by shifting it over by
> MEMOP_EXTENT_SHIFT?  I looked through the history and I see that v1
> did something very much like that, but it was changed to using the
> scratch space without any explanation.
>
> Having this in the public interface is ugly; and what's worse, it
> exposes some of the internal mechanism to the guest.
>
>  -George

Well, that's exactly what I did in an earlier version of the patch but
it was requested that I change it to something like this by Andrew
(see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html).
Then over the various iterations it ended up looking like this.

Tamas


* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-26 15:22   ` Tamas K Lengyel
@ 2016-07-26 15:49     ` George Dunlap
  2016-07-26 22:43       ` Andrew Cooper
  0 siblings, 1 reply; 8+ messages in thread
From: George Dunlap @ 2016-07-26 15:49 UTC (permalink / raw)
  To: Tamas K Lengyel; +Cc: xen-devel, Ian Jackson, Jan Beulich, Andrew Cooper

On Tue, Jul 26, 2016 at 4:22 PM, Tamas K Lengyel
<tamas.lengyel@zentific.com> wrote:
> On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@citrix.com> wrote:
>> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
>> <tamas.lengyel@zentific.com> wrote:
>>> Currently mem-sharing can be performed on a page-by-page basis from the control
>>> domain. However, this process is quite wasteful when a range of pages have to
>>> be deduplicated.
>>>
>>> This patch introduces a new mem_sharing memop for range sharing where
>>> the user doesn't have to separately nominate each page in both the source and
>>> destination domain, and the looping over all pages happen in the hypervisor.
>>> This significantly reduces the overhead of sharing a range of memory.
>>>
>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> v8: style fixes and minor adjustments
>>> ---
>>>  tools/libxc/include/xenctrl.h        |  15 ++++
>>>  tools/libxc/xc_memshr.c              |  19 +++++
>>>  tools/tests/mem-sharing/memshrtool.c |  22 ++++++
>>>  xen/arch/x86/mm/mem_sharing.c        | 140 +++++++++++++++++++++++++++++++++++
>>>  xen/include/public/memory.h          |  10 ++-
>>>  5 files changed, 205 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>> index e904bd5..3782eff 100644
>>> --- a/tools/libxc/include/xenctrl.h
>>> +++ b/tools/libxc/include/xenctrl.h
>>> @@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>>                      domid_t client_domain,
>>>                      unsigned long client_gfn);
>>>
>>> +/* Allows to deduplicate a range of memory of a client domain. Using
>>> + * this function is equivalent of calling xc_memshr_nominate_gfn for each gfn
>>> + * in the two domains followed by xc_memshr_share_gfns.
>>> + *
>>> + * May fail with -EINVAL if the source and client domain have different
>>> + * memory size or if memory sharing is not enabled on either of the domains.
>>> + * May also fail with -ENOMEM if there isn't enough memory available to store
>>> + * the sharing metadata before deduplication can happen.
>>> + */
>>> +int xc_memshr_range_share(xc_interface *xch,
>>> +                          domid_t source_domain,
>>> +                          domid_t client_domain,
>>> +                          uint64_t start,
>>> +                          uint64_t end);
>>> +
>>>  /* Debug calls: return the number of pages referencing the shared frame backing
>>>   * the input argument. Should be one or greater.
>>>   *
>>> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
>>> index deb0aa4..2b871c7 100644
>>> --- a/tools/libxc/xc_memshr.c
>>> +++ b/tools/libxc/xc_memshr.c
>>> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>>      return xc_memshr_memop(xch, source_domain, &mso);
>>>  }
>>>
>>> +int xc_memshr_range_share(xc_interface *xch,
>>> +                          domid_t source_domain,
>>> +                          domid_t client_domain,
>>> +                          uint64_t start,
>>> +                          uint64_t end)
>>> +{
>>> +    xen_mem_sharing_op_t mso;
>>> +
>>> +    memset(&mso, 0, sizeof(mso));
>>> +
>>> +    mso.op = XENMEM_sharing_op_range_share;
>>> +
>>> +    mso.u.range.client_domain = client_domain;
>>> +    mso.u.range.start = start;
>>> +    mso.u.range.end = end;
>>> +
>>> +    return xc_memshr_memop(xch, source_domain, &mso);
>>> +}
>>> +
>>>  int xc_memshr_domain_resume(xc_interface *xch,
>>>                              domid_t domid)
>>>  {
>>> diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
>>> index 437c7c9..2af6a9e 100644
>>> --- a/tools/tests/mem-sharing/memshrtool.c
>>> +++ b/tools/tests/mem-sharing/memshrtool.c
>>> @@ -24,6 +24,8 @@ static int usage(const char* prog)
>>>      printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
>>>      printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
>>>      printf("                          - Share two pages.\n");
>>> +    printf("  range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
>>> +    printf("                          - Share pages between domains in range.\n");
>>>      printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
>>>      printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
>>>      printf("                          - Populate a page in a domain with a shared page.\n");
>>> @@ -180,6 +182,26 @@ int main(int argc, const char** argv)
>>>          }
>>>          printf("Audit returned %d errors.\n", rc);
>>>      }
>>> +    else if( !strcasecmp(cmd, "range") )
>>> +    {
>>> +        domid_t sdomid, cdomid;
>>> +        int rc;
>>> +        uint64_t start, end;
>>> +
>>> +        if ( argc != 6 )
>>> +            return usage(argv[0]);
>>>
>>> +        sdomid = strtol(argv[2], NULL, 0);
>>> +        cdomid = strtol(argv[3], NULL, 0);
>>> +        start = strtoul(argv[4], NULL, 0);
>>> +        end = strtoul(argv[5], NULL, 0);
>>> +
>>> +        rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
>>> +        if ( rc < 0 )
>>> +        {
>>> +            printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
>>> +            return rc;
>>> +        }
>>> +    }
>>>      return 0;
>>>  }
>>> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
>>> index a522423..329fbd9 100644
>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>> @@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
>>>      return rc;
>>>  }
>>>
>>> +static int range_share(struct domain *d, struct domain *cd,
>>> +                       struct mem_sharing_op_range *range)
>>> +{
>>> +    int rc = 0;
>>> +    shr_handle_t sh, ch;
>>> +    unsigned long start = range->_scratchspace ?: range->start;
>>> +
>>> +    while( range->end >= start )
>>> +    {
>>> +        /*
>>> +         * We only break out if we run out of memory as individual pages may
>>> +         * legitimately be unsharable and we just want to skip over those.
>>> +         */
>>> +        rc = mem_sharing_nominate_page(d, start, 0, &sh);
>>> +        if ( rc == -ENOMEM )
>>> +            break;
>>> +
>>> +        if ( !rc )
>>> +        {
>>> +            rc = mem_sharing_nominate_page(cd, start, 0, &ch);
>>> +            if ( rc == -ENOMEM )
>>> +                break;
>>> +
>>> +            if ( !rc )
>>> +            {
>>> +                /* If we get here this should be guaranteed to succeed. */
>>> +                rc = mem_sharing_share_pages(d, start, sh,
>>> +                                             cd, start, ch);
>>> +                ASSERT(!rc);
>>> +            }
>>> +        }
>>> +
>>> +        /* Check for continuation if it's not the last iteration. */
>>> +        if ( range->end >= ++start && hypercall_preempt_check() )
>>> +        {
>>> +            rc = 1;
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    range->_scratchspace = start;
>>> +
>>> +    /*
>>> +     * The last page may fail with -EINVAL, and for range sharing we don't
>>> +     * care about that.
>>> +     */
>>> +    if ( range->end < start && rc == -EINVAL )
>>> +        rc = 0;
>>> +
>>> +    return rc;
>>> +}
>>> +
>>>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>>  {
>>>      int rc;
>>> @@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>>          }
>>>          break;
>>>
>>> +        case XENMEM_sharing_op_range_share:
>>> +        {
>>> +            unsigned long max_sgfn, max_cgfn;
>>> +            struct domain *cd;
>>> +
>>> +            rc = -EINVAL;
>>> +            if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
>>> +                 mso.u.range._pad[2] )
>>> +                 goto out;
>>> +
>>> +            /*
>>> +             * We use _scratchscape for the hypercall continuation value.
>>> +             * Ideally the user sets this to 0 in the beginning but
>>> +             * there is no good way of enforcing that here, so we just check
>>> +             * that it's at least in range.
>>> +             */
>>> +            if ( mso.u.range._scratchspace &&
>>> +                 (mso.u.range._scratchspace < mso.u.range.start ||
>>> +                  mso.u.range._scratchspace > mso.u.range.end) )
>>> +                goto out;
>>> +
>>> +            if ( !mem_sharing_enabled(d) )
>>> +                goto out;
>>> +
>>> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain,
>>> +                                                   &cd);
>>> +            if ( rc )
>>> +                goto out;
>>> +
>>> +            /*
>>> +             * We reuse XENMEM_sharing_op_share XSM check here as this is
>>> +             * essentially the same concept repeated over multiple pages.
>>> +             */
>>> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd,
>>> +                                    XENMEM_sharing_op_share);
>>> +            if ( rc )
>>> +            {
>>> +                rcu_unlock_domain(cd);
>>> +                goto out;
>>> +            }
>>> +
>>> +            if ( !mem_sharing_enabled(cd) )
>>> +            {
>>> +                rcu_unlock_domain(cd);
>>> +                rc = -EINVAL;
>>> +                goto out;
>>> +            }
>>> +
>>> +            /*
>>> +             * Sanity check only, the client should keep the domains paused for
>>> +             * the duration of this op.
>>> +             */
>>> +            if ( !atomic_read(&d->pause_count) ||
>>> +                 !atomic_read(&cd->pause_count) )
>>> +            {
>>> +                rcu_unlock_domain(cd);
>>> +                rc = -EINVAL;
>>> +                goto out;
>>> +            }
>>> +
>>> +            max_sgfn = domain_get_maximum_gpfn(d);
>>> +            max_cgfn = domain_get_maximum_gpfn(cd);
>>> +
>>> +            if ( max_sgfn < mso.u.range.start || max_sgfn < mso.u.range.end ||
>>> +                 max_cgfn < mso.u.range.start || max_cgfn < mso.u.range.end )
>>> +            {
>>> +                rcu_unlock_domain(cd);
>>> +                rc = -EINVAL;
>>> +                goto out;
>>> +            }
>>> +
>>> +            rc = range_share(d, cd, &mso.u.range);
>>> +            rcu_unlock_domain(cd);
>>> +
>>> +            if ( rc > 0 )
>>> +            {
>>> +                if ( __copy_to_guest(arg, &mso, 1) )
>>> +                    rc = -EFAULT;
>>> +                else
>>> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
>>> +                                                       "lh", XENMEM_sharing_op,
>>> +                                                       arg);
>>> +            }
>>> +            else
>>> +                mso.u.range._scratchspace = 0;
>>> +        }
>>> +        break;
>>> +
>>>          case XENMEM_sharing_op_debug_gfn:
>>>          {
>>>              unsigned long gfn = mso.u.debug.u.gfn;
>>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
>>> index 29ec571..e0bc018 100644
>>> --- a/xen/include/public/memory.h
>>> +++ b/xen/include/public/memory.h
>>> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>>>  #define XENMEM_sharing_op_debug_gref        5
>>>  #define XENMEM_sharing_op_add_physmap       6
>>>  #define XENMEM_sharing_op_audit             7
>>> +#define XENMEM_sharing_op_range_share       8
>>>
>>>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>>>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
>>> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
>>>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>>>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>>>              domid_t  client_domain; /* IN: the client domain id */
>>> -        } share;
>>> +        } share;
>>> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
>>> +            uint64_aligned_t start;          /* IN: start gfn. */
>>> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
>>> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */
>>
>> Tamas,
>>
>> Why include this "scratchspace" that's not used in the interface,
>> rather than just doing what the memory operations in
>> xen/common/memory.c do, and store it in EAX by shifting it over by
>> MEMOP_EXTENT_SHIFT?  I looked through the history and I see that v1
>> did something very much like that, but it was changed to using the
>> scratch space without any explanation.
>>
>> Having this in the public interface is ugly; and what's worse, it
>> exposes some of the internal mechanism to the guest.
>>
>>  -George
>
> Well, that's exactly what I did in an earlier version of the patch but
> it was requested that I change it to something like this by Andrew
> (see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html).
> Then over the various iterations it ended up like looking like this.

Oh right -- sorry, I did look but somehow I missed that Andrew had requested it.

I would have read his comment to mean to put the _scratchspace
variable in the larger structure.  But it has his R-b, so I'll
consider myself answered.

Thanks,
 -George


* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-26 15:49     ` George Dunlap
@ 2016-07-26 22:43       ` Andrew Cooper
  2016-07-27  9:01         ` George Dunlap
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2016-07-26 22:43 UTC (permalink / raw)
  To: George Dunlap, Tamas K Lengyel; +Cc: xen-devel, Ian Jackson, Jan Beulich

On 26/07/2016 16:49, George Dunlap wrote:
> On Tue, Jul 26, 2016 at 4:22 PM, Tamas K Lengyel
> <tamas.lengyel@zentific.com> wrote:
>> On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@citrix.com> wrote:
>>> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
>>> <tamas.lengyel@zentific.com> wrote:
>>>> Currently mem-sharing can be performed on a page-by-page basis from the control
>>>> domain. However, this process is quite wasteful when a range of pages have to
>>>> be deduplicated.
>>>>
>>>> This patch introduces a new mem_sharing memop for range sharing where
>>>> the user doesn't have to separately nominate each page in both the source and
>>>> destination domain, and the looping over all pages happen in the hypervisor.
>>>> This significantly reduces the overhead of sharing a range of memory.
>>>>
>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
>>>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>>>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> ---
>>>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>>> Cc: George Dunlap <george.dunlap@eu.citrix.com>
>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>
>>>> v8: style fixes and minor adjustments
>>>> ---
>>>>  tools/libxc/include/xenctrl.h        |  15 ++++
>>>>  tools/libxc/xc_memshr.c              |  19 +++++
>>>>  tools/tests/mem-sharing/memshrtool.c |  22 ++++++
>>>>  xen/arch/x86/mm/mem_sharing.c        | 140 +++++++++++++++++++++++++++++++++++
>>>>  xen/include/public/memory.h          |  10 ++-
>>>>  5 files changed, 205 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>>> index e904bd5..3782eff 100644
>>>> --- a/tools/libxc/include/xenctrl.h
>>>> +++ b/tools/libxc/include/xenctrl.h
>>>> @@ -2334,6 +2334,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>>>                      domid_t client_domain,
>>>>                      unsigned long client_gfn);
>>>>
>>>> +/* Allows to deduplicate a range of memory of a client domain. Using
>>>> + * this function is equivalent of calling xc_memshr_nominate_gfn for each gfn
>>>> + * in the two domains followed by xc_memshr_share_gfns.
>>>> + *
>>>> + * May fail with -EINVAL if the source and client domain have different
>>>> + * memory size or if memory sharing is not enabled on either of the domains.
>>>> + * May also fail with -ENOMEM if there isn't enough memory available to store
>>>> + * the sharing metadata before deduplication can happen.
>>>> + */
>>>> +int xc_memshr_range_share(xc_interface *xch,
>>>> +                          domid_t source_domain,
>>>> +                          domid_t client_domain,
>>>> +                          uint64_t start,
>>>> +                          uint64_t end);
>>>> +
>>>>  /* Debug calls: return the number of pages referencing the shared frame backing
>>>>   * the input argument. Should be one or greater.
>>>>   *
>>>> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
>>>> index deb0aa4..2b871c7 100644
>>>> --- a/tools/libxc/xc_memshr.c
>>>> +++ b/tools/libxc/xc_memshr.c
>>>> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>>>>      return xc_memshr_memop(xch, source_domain, &mso);
>>>>  }
>>>>
>>>> +int xc_memshr_range_share(xc_interface *xch,
>>>> +                          domid_t source_domain,
>>>> +                          domid_t client_domain,
>>>> +                          uint64_t start,
>>>> +                          uint64_t end)
>>>> +{
>>>> +    xen_mem_sharing_op_t mso;
>>>> +
>>>> +    memset(&mso, 0, sizeof(mso));
>>>> +
>>>> +    mso.op = XENMEM_sharing_op_range_share;
>>>> +
>>>> +    mso.u.range.client_domain = client_domain;
>>>> +    mso.u.range.start = start;
>>>> +    mso.u.range.end = end;
>>>> +
>>>> +    return xc_memshr_memop(xch, source_domain, &mso);
>>>> +}
>>>> +
>>>>  int xc_memshr_domain_resume(xc_interface *xch,
>>>>                              domid_t domid)
>>>>  {
>>>> diff --git a/tools/tests/mem-sharing/memshrtool.c b/tools/tests/mem-sharing/memshrtool.c
>>>> index 437c7c9..2af6a9e 100644
>>>> --- a/tools/tests/mem-sharing/memshrtool.c
>>>> +++ b/tools/tests/mem-sharing/memshrtool.c
>>>> @@ -24,6 +24,8 @@ static int usage(const char* prog)
>>>>      printf("  nominate <domid> <gfn>  - Nominate a page for sharing.\n");
>>>>      printf("  share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n");
>>>>      printf("                          - Share two pages.\n");
>>>> +    printf("  range <source-domid> <destination-domid> <start-gfn> <end-gfn>\n");
>>>> +    printf("                          - Share pages between domains in range.\n");
>>>>      printf("  unshare <domid> <gfn>   - Unshare a page by grabbing a writable map.\n");
>>>>      printf("  add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n");
>>>>      printf("                          - Populate a page in a domain with a shared page.\n");
>>>> @@ -180,6 +182,26 @@ int main(int argc, const char** argv)
>>>>          }
>>>>          printf("Audit returned %d errors.\n", rc);
>>>>      }
>>>> +    else if( !strcasecmp(cmd, "range") )
>>>> +    {
>>>> +        domid_t sdomid, cdomid;
>>>> +        int rc;
>>>> +        uint64_t start, end;
>>>> +
>>>> +        if ( argc != 6 )
>>>> +            return usage(argv[0]);
>>>>
>>>> +        sdomid = strtol(argv[2], NULL, 0);
>>>> +        cdomid = strtol(argv[3], NULL, 0);
>>>> +        start = strtoul(argv[4], NULL, 0);
>>>> +        end = strtoul(argv[5], NULL, 0);
>>>> +
>>>> +        rc = xc_memshr_range_share(xch, sdomid, cdomid, start, end);
>>>> +        if ( rc < 0 )
>>>> +        {
>>>> +            printf("error executing xc_memshr_range_share: %s\n", strerror(errno));
>>>> +            return rc;
>>>> +        }
>>>> +    }
>>>>      return 0;
>>>>  }
>>>> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
>>>> index a522423..329fbd9 100644
>>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>>> @@ -1294,6 +1294,58 @@ int relinquish_shared_pages(struct domain *d)
>>>>      return rc;
>>>>  }
>>>>
>>>> +static int range_share(struct domain *d, struct domain *cd,
>>>> +                       struct mem_sharing_op_range *range)
>>>> +{
>>>> +    int rc = 0;
>>>> +    shr_handle_t sh, ch;
>>>> +    unsigned long start = range->_scratchspace ?: range->start;
>>>> +
>>>> +    while( range->end >= start )
>>>> +    {
>>>> +        /*
>>>> +         * We only break out if we run out of memory as individual pages may
>>>> +         * legitimately be unsharable and we just want to skip over those.
>>>> +         */
>>>> +        rc = mem_sharing_nominate_page(d, start, 0, &sh);
>>>> +        if ( rc == -ENOMEM )
>>>> +            break;
>>>> +
>>>> +        if ( !rc )
>>>> +        {
>>>> +            rc = mem_sharing_nominate_page(cd, start, 0, &ch);
>>>> +            if ( rc == -ENOMEM )
>>>> +                break;
>>>> +
>>>> +            if ( !rc )
>>>> +            {
>>>> +                /* If we get here this should be guaranteed to succeed. */
>>>> +                rc = mem_sharing_share_pages(d, start, sh,
>>>> +                                             cd, start, ch);
>>>> +                ASSERT(!rc);
>>>> +            }
>>>> +        }
>>>> +
>>>> +        /* Check for continuation if it's not the last iteration. */
>>>> +        if ( range->end >= ++start && hypercall_preempt_check() )
>>>> +        {
>>>> +            rc = 1;
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    range->_scratchspace = start;
>>>> +
>>>> +    /*
>>>> +     * The last page may fail with -EINVAL, and for range sharing we don't
>>>> +     * care about that.
>>>> +     */
>>>> +    if ( range->end < start && rc == -EINVAL )
>>>> +        rc = 0;
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>>>  {
>>>>      int rc;
>>>> @@ -1468,6 +1520,94 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>>>          }
>>>>          break;
>>>>
>>>> +        case XENMEM_sharing_op_range_share:
>>>> +        {
>>>> +            unsigned long max_sgfn, max_cgfn;
>>>> +            struct domain *cd;
>>>> +
>>>> +            rc = -EINVAL;
>>>> +            if ( mso.u.range._pad[0] || mso.u.range._pad[1] ||
>>>> +                 mso.u.range._pad[2] )
>>>> +                 goto out;
>>>> +
>>>> +            /*
>>>> +             * We use _scratchscape for the hypercall continuation value.
>>>> +             * Ideally the user sets this to 0 in the beginning but
>>>> +             * there is no good way of enforcing that here, so we just check
>>>> +             * that it's at least in range.
>>>> +             */
>>>> +            if ( mso.u.range._scratchspace &&
>>>> +                 (mso.u.range._scratchspace < mso.u.range.start ||
>>>> +                  mso.u.range._scratchspace > mso.u.range.end) )
>>>> +                goto out;
>>>> +
>>>> +            if ( !mem_sharing_enabled(d) )
>>>> +                goto out;
>>>> +
>>>> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.range.client_domain,
>>>> +                                                   &cd);
>>>> +            if ( rc )
>>>> +                goto out;
>>>> +
>>>> +            /*
>>>> +             * We reuse XENMEM_sharing_op_share XSM check here as this is
>>>> +             * essentially the same concept repeated over multiple pages.
>>>> +             */
>>>> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd,
>>>> +                                    XENMEM_sharing_op_share);
>>>> +            if ( rc )
>>>> +            {
>>>> +                rcu_unlock_domain(cd);
>>>> +                goto out;
>>>> +            }
>>>> +
>>>> +            if ( !mem_sharing_enabled(cd) )
>>>> +            {
>>>> +                rcu_unlock_domain(cd);
>>>> +                rc = -EINVAL;
>>>> +                goto out;
>>>> +            }
>>>> +
>>>> +            /*
>>>> +             * Sanity check only, the client should keep the domains paused for
>>>> +             * the duration of this op.
>>>> +             */
>>>> +            if ( !atomic_read(&d->pause_count) ||
>>>> +                 !atomic_read(&cd->pause_count) )
>>>> +            {
>>>> +                rcu_unlock_domain(cd);
>>>> +                rc = -EINVAL;
>>>> +                goto out;
>>>> +            }
>>>> +
>>>> +            max_sgfn = domain_get_maximum_gpfn(d);
>>>> +            max_cgfn = domain_get_maximum_gpfn(cd);
>>>> +
>>>> +            if ( max_sgfn < mso.u.range.start || max_sgfn < mso.u.range.end ||
>>>> +                 max_cgfn < mso.u.range.start || max_cgfn < mso.u.range.end )
>>>> +            {
>>>> +                rcu_unlock_domain(cd);
>>>> +                rc = -EINVAL;
>>>> +                goto out;
>>>> +            }
>>>> +
>>>> +            rc = range_share(d, cd, &mso.u.range);
>>>> +            rcu_unlock_domain(cd);
>>>> +
>>>> +            if ( rc > 0 )
>>>> +            {
>>>> +                if ( __copy_to_guest(arg, &mso, 1) )
>>>> +                    rc = -EFAULT;
>>>> +                else
>>>> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
>>>> +                                                       "lh", XENMEM_sharing_op,
>>>> +                                                       arg);
>>>> +            }
>>>> +            else
>>>> +                mso.u.range._scratchspace = 0;
>>>> +        }
>>>> +        break;
>>>> +
>>>>          case XENMEM_sharing_op_debug_gfn:
>>>>          {
>>>>              unsigned long gfn = mso.u.debug.u.gfn;
>>>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
>>>> index 29ec571..e0bc018 100644
>>>> --- a/xen/include/public/memory.h
>>>> +++ b/xen/include/public/memory.h
>>>> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>>>>  #define XENMEM_sharing_op_debug_gref        5
>>>>  #define XENMEM_sharing_op_add_physmap       6
>>>>  #define XENMEM_sharing_op_audit             7
>>>> +#define XENMEM_sharing_op_range_share       8
>>>>
>>>>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>>>>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
>>>> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
>>>>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>>>>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>>>>              domid_t  client_domain; /* IN: the client domain id */
>>>> -        } share;
>>>> +        } share;
>>>> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
>>>> +            uint64_aligned_t start;          /* IN: start gfn. */
>>>> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
>>>> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */
>>> Tamas,
>>>
>>> Why include this "scratchspace" that's not used in the interface,
>>> rather than just doing what the memory operations in
>>> xen/common/memory.c do, and storing it in EAX by shifting it over by
>>> MEMOP_EXTENT_SHIFT?  I looked through the history and I see that v1
>>> did something very much like that, but it was changed to using the
>>> scratch space without any explanation.
>>>
>>> Having this in the public interface is ugly; and what's worse, it
>>> exposes some of the internal mechanism to the guest.
>>>
>>>  -George
>> Well, that's exactly what I did in an earlier version of the patch but
>> it was requested that I change it to something like this by Andrew
>> (see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html).
>> Then over the various iterations it ended up looking like this.
> Oh right -- sorry, I did look but somehow I missed that Andrew had requested it.
>
> I would have read his comment to  mean to put the _scratchspace
> variable in the larger structure.  But it has his R-b, so I'll
> consider myself answered.

I am open to other suggestion wrt naming, but wasn't looking to bikeshed
the issue.

_opaque is a common name used.

~Andrew
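
For reference, the intended calling sequence from a toolstack client looks
roughly like the sketch below (illustration only -- it leans on the existing
libxc helpers xc_memshr_control, xc_domain_pause and xc_domain_unpause, and
the domid/gfn variables are placeholders, not names from the patch):

    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    int rc;

    /* Sharing must be enabled on both domains before nominating pages. */
    xc_memshr_control(xch, source_domid, 1);
    xc_memshr_control(xch, client_domid, 1);

    /* The hypervisor only sanity-checks the pause count, so the caller
     * is expected to keep both domains paused for the whole operation. */
    xc_domain_pause(xch, source_domid);
    xc_domain_pause(xch, client_domid);

    rc = xc_memshr_range_share(xch, source_domid, client_domid,
                               start_gfn, end_gfn);

    xc_domain_unpause(xch, client_domid);
    xc_domain_unpause(xch, source_domid);
    xc_interface_close(xch);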

* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-26 22:43       ` Andrew Cooper
@ 2016-07-27  9:01         ` George Dunlap
  2016-07-27  9:59           ` Andrew Cooper
  0 siblings, 1 reply; 8+ messages in thread
From: George Dunlap @ 2016-07-27  9:01 UTC (permalink / raw)
  To: Andrew Cooper, Tamas K Lengyel
  Cc: xen-devel, Ian Jackson, Andres Lagar-Cavilla, Jan Beulich, Tim Deegan

On 26/07/16 23:43, Andrew Cooper wrote:
> On 26/07/2016 16:49, George Dunlap wrote:
>> On Tue, Jul 26, 2016 at 4:22 PM, Tamas K Lengyel
>> <tamas.lengyel@zentific.com> wrote:
>>> On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@citrix.com> wrote:
>>>> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
>>>> <tamas.lengyel@zentific.com> wrote:
>>>>> [...]
>>>>> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
>>>>> +            uint64_aligned_t start;          /* IN: start gfn. */
>>>>> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
>>>>> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */
>>>> Tamas,
>>>>
>>>> Why include this "scratchspace" that's not used in the interface,
>>>> rather than just doing what the memory operations in
>>>> xen/common/memory.c do, and storing it in EAX by shifting it over by
>>>> MEMOP_EXTENT_SHIFT?  I looked through the history and I see that v1
>>>> did something very much like that, but it was changed to using the
>>>> scratch space without any explanation.
>>>>
>>>> Having this in the public interface is ugly; and what's worse, it
>>>> exposes some of the internal mechanism to the guest.
>>>>
>>>>  -George
>>> Well, that's exactly what I did in an earlier version of the patch but
>>> it was requested that I change it to something like this by Andrew
>>> (see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html).
>>> Then over the various iterations it ended up looking like this.
>> Oh right -- sorry, I did look but somehow I missed that Andrew had requested it.
>>
>> I would have read his comment to  mean to put the _scratchspace
>> variable in the larger structure.  But it has his R-b, so I'll
>> consider myself answered.
> 
> I am open to other suggestion wrt naming, but wasn't looking to bikeshed
> the issue.

My question wasn't about naming so much as about stashing an internal
implementation mechanism in the public interface in the first place,
rather than just doing what all the other memory operations do, which is
to stash it in EAX.

I personally would have preferred not to have what seems to me a bit of
an ugly break in the abstraction in the public interface (regardless of
the name), but I don't feel strongly enough to press the point if that's
what you've asked for (particularly at v8 of this patch).

(Tamas is listed as the maintainer of mem_sharing.c, so this doesn't
need my Ack AFAICT.)

 -George

* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-27  9:01         ` George Dunlap
@ 2016-07-27  9:59           ` Andrew Cooper
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Cooper @ 2016-07-27  9:59 UTC (permalink / raw)
  To: George Dunlap, Tamas K Lengyel
  Cc: xen-devel, Ian Jackson, Andres Lagar-Cavilla, Jan Beulich, Tim Deegan

On 27/07/16 10:01, George Dunlap wrote:
> On 26/07/16 23:43, Andrew Cooper wrote:
>> On 26/07/2016 16:49, George Dunlap wrote:
>>> On Tue, Jul 26, 2016 at 4:22 PM, Tamas K Lengyel
>>> <tamas.lengyel@zentific.com> wrote:
>>>> On Tue, Jul 26, 2016 at 3:12 AM, George Dunlap <george.dunlap@citrix.com> wrote:
>>>>> On Wed, Jul 20, 2016 at 7:01 PM, Tamas K Lengyel
>>>>> <tamas.lengyel@zentific.com> wrote:
>>>>>> [...]
>>>>>> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
>>>>>> +            uint64_aligned_t start;          /* IN: start gfn. */
>>>>>> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */
>>>>>> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */
>>>>> Tamas,
>>>>>
>>>>> Why include this "scratchspace" that's not used in the interface,
>>>>> rather than just doing what the memory operations in
>>>>> xen/common/memory.c do, and storing it in EAX by shifting it over by
>>>>> MEMOP_EXTENT_SHIFT?  I looked through the history and I see that v1
>>>>> did something very much like that, but it was changed to using the
>>>>> scratch space without any explanation.
>>>>>
>>>>> Having this in the public interface is ugly; and what's worse, it
>>>>> exposes some of the internal mechanism to the guest.
>>>>>
>>>>>  -George
>>>> Well, that's exactly what I did in an earlier version of the patch but
>>>> it was requested that I change it to something like this by Andrew
>>>> (see https://lists.xen.org/archives/html/xen-devel/2015-10/msg00434.html).
>>>> Then over the various iterations it ended up looking like this.
>>> Oh right -- sorry, I did look but somehow I missed that Andrew had requested it.
>>>
>>> I would have read his comment to  mean to put the _scratchspace
>>> variable in the larger structure.  But it has his R-b, so I'll
>>> consider myself answered.
>> I am open to other suggestion wrt naming, but wasn't looking to bikeshed
>> the issue.
> My question wasn't about naming so much as about stashing an internal
> implementation mechanism in the public interface in the first place,
> rather than just doing what all the other memory operations do, which is
> to stash it in EAX.

Stashing in eax is a gross hack ^W necessary resort (for a previous XSA
iirc) where we needed to retrofit continuations into an interface which
couldn't support them.

Since then, all new interfaces liable to require continuations have
included explicit room for them, to avoid stashing in eax.
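
To make the contrast concrete (a rough sketch only, not code lifted from
the tree; start_extent and next_gfn are placeholders): the older XENMEM_*
ops fold the restart point into the hypercall command word, whereas this
patch carries it in the op structure that gets copied back to the caller:

    /* Classic memory_op style: restart extent packed into the cmd word
     * (op being e.g. XENMEM_increase_reservation). */
    rc = hypercall_create_continuation(
             __HYPERVISOR_memory_op, "lh",
             op | (start_extent << MEMOP_EXTENT_SHIFT), arg);

    /* This patch: restart gfn lives in a field of the interface struct. */
    mso.u.range._scratchspace = next_gfn;
    __copy_to_guest(arg, &mso, 1);
    rc = hypercall_create_continuation(
             __HYPERVISOR_memory_op, "lh", XENMEM_sharing_op, arg);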

~Andrew

* Re: [PATCH v8] x86/mem-sharing: mem-sharing a range of memory
  2016-07-20 18:01 [PATCH v8] x86/mem-sharing: mem-sharing a range of memory Tamas K Lengyel
  2016-07-26  9:12 ` George Dunlap
@ 2016-08-01 11:12 ` Jan Beulich
  1 sibling, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2016-08-01 11:12 UTC (permalink / raw)
  To: Tamas K Lengyel; +Cc: George Dunlap, Andrew Cooper, Ian Jackson, xen-devel

>>> On 20.07.16 at 20:01, <tamas.lengyel@zentific.com> wrote:
> +static int range_share(struct domain *d, struct domain *cd,
> +                       struct mem_sharing_op_range *range)
> +{
> +    int rc = 0;
> +    shr_handle_t sh, ch;
> +    unsigned long start = range->_scratchspace ?: range->start;
> +
> +    while( range->end >= start )

Coding style.
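
(Presumably the spacing: Xen coding style wants the keyword and condition
spaced out, i.e. something like

    while ( range->end >= start )

rather than "while( ... )".)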

> @@ -500,7 +501,14 @@ struct xen_mem_sharing_op {
>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>              domid_t  client_domain; /* IN: the client domain id */
> -        } share; 
> +        } share;
> +        struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
> +            uint64_aligned_t start;          /* IN: start gfn. */
> +            uint64_aligned_t end;            /* IN: end gfn (inclusive) */

Please include "_gfn" in these names. And afaic using first_gfn and
last_gfn would better represent the inclusiveness of the range.

> +            uint64_aligned_t _scratchspace;  /* Must be set to 0 */

A little long for a name. How about either the suggested "opaque" or
just "scratch"? In any event, no reason for a leading underscore, I
think.
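
Putting those together, the range struct would end up looking roughly
like this (the trailing members are only inferred from the hypervisor
side of the patch, so treat the exact layout as a guess):

    struct mem_sharing_op_range {         /* OP_RANGE_SHARE */
        uint64_aligned_t first_gfn;       /* IN: first gfn of the range */
        uint64_aligned_t last_gfn;        /* IN: last gfn (inclusive) */
        uint64_aligned_t opaque;          /* Must be set to 0 */
        domid_t client_domain;            /* IN: the client domain id */
        uint16_t _pad[3];                 /* Must be set to 0 */
    } range;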

Jan


end of thread, other threads:[~2016-08-01 11:12 UTC | newest]

Thread overview: 8+ messages
2016-07-20 18:01 [PATCH v8] x86/mem-sharing: mem-sharing a range of memory Tamas K Lengyel
2016-07-26  9:12 ` George Dunlap
2016-07-26 15:22   ` Tamas K Lengyel
2016-07-26 15:49     ` George Dunlap
2016-07-26 22:43       ` Andrew Cooper
2016-07-27  9:01         ` George Dunlap
2016-07-27  9:59           ` Andrew Cooper
2016-08-01 11:12 ` Jan Beulich
