From: George Dunlap <dunlapg@umich.edu>
To: Tamas K Lengyel <tamas@tklengyel.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
Date: Wed, 22 Jun 2016 16:38:13 +0100	[thread overview]
Message-ID: <CAFLBxZbWGTqn4dtpJaJW9fKcNqrBgKTMA2ohWAkom4Y+1YXafw@mail.gmail.com> (raw)
In-Reply-To: <1465687481-10095-1-git-send-email-tamas@tklengyel.com>

On Sun, Jun 12, 2016 at 12:24 AM, Tamas K Lengyel <tamas@tklengyel.com> wrote:
> Currently mem-sharing can be performed on a page-by-page basis from the control
> domain. However, when completely deduplicating (cloning) a VM, this requires
> at least 3 hypercalls per page. As the user has to loop through all pages up
> to max_gpfn, this process is very slow and wasteful.
>
> This patch introduces a new mem_sharing memop for bulk deduplication where
> the user doesn't have to separately nominate each page in both the source and
> destination domain, and the looping over all pages happens in the hypervisor.
> This significantly reduces the overhead of completely deduplicating entire
> domains.
>
> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> ---
> Ian Jackson <ian.jackson@eu.citrix.com>
> George Dunlap <george.dunlap@eu.citrix.com>
> Jan Beulich <jbeulich@suse.com>
> Andrew Cooper <andrew.cooper3@citrix.com>

I'm sorry I'm a bit late to this party -- I'm not sure how I managed
to miss the earlier posts of this.  A couple of questions...
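
For context, the per-page flow being replaced looks roughly like the sketch
below (using the libxc memshr wrappers as I remember them --
xc_memshr_nominate_gfn() and xc_memshr_share_gfns() -- with error handling
elided), which is where the "3 hypercalls per page" above comes from:

  /* Rough sketch: deduplicate two already-paused domains page by page from
   * the control domain.  Each gfn costs two nominate hypercalls plus one
   * share hypercall. */
  #include <xenctrl.h>

  static void dedup_per_page(xc_interface *xch, uint32_t src, uint32_t cli,
                             unsigned long max_gpfn)
  {
      unsigned long gfn;

      for ( gfn = 0; gfn <= max_gpfn; gfn++ )
      {
          uint64_t sh, ch;

          if ( xc_memshr_nominate_gfn(xch, src, gfn, &sh) )
              continue;              /* page not shareable, skip it */
          if ( xc_memshr_nominate_gfn(xch, cli, gfn, &ch) )
              continue;
          xc_memshr_share_gfns(xch, src, gfn, sh, cli, gfn, ch);
      }
  }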


> @@ -1468,6 +1516,79 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>          }
>          break;
>
> +        case XENMEM_sharing_op_bulk_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if ( mso.u.bulk._pad[0] || mso.u.bulk._pad[1] || mso.u.bulk._pad[2] )
> +                goto out;
> +
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;
> +
> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            /*
> +             * We reuse XENMEM_sharing_op_share XSM check here as this is essentially
> +             * the same concept repeated over multiple pages.
> +             */
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, XENMEM_sharing_op_share);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            if ( !atomic_read(&d->pause_count) ||
> +                 !atomic_read(&cd->pause_count) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }

I realize that Jan asked for this, but I'm really not sure what good
this is supposed to do, since the guest can be un-paused at any point
halfway through the transaction.

Wouldn't it make more sense to have this function pause and unpause
the domains itself?
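
Something like the sketch below is what I have in mind (just a sketch -- it
assumes neither domain is the caller, and the continuation path would of
course need to keep both domains paused until the final pass):

  /* Sketch: hold both domains paused for the duration of the bulk
   * operation rather than trusting the caller to have paused them. */
  domain_pause(d);
  domain_pause(cd);

  rc = bulk_share(d, cd, max_sgfn + 1, &mso.u.bulk);

  domain_unpause(cd);
  domain_unpause(d);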

> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            rc = bulk_share(d, cd, max_sgfn + 1, &mso.u.bulk);
> +            if ( rc > 0 )
> +            {
> +                if ( __copy_to_guest(arg, &mso, 1) )
> +                    rc = -EFAULT;
> +                else
> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
> +                                                       "lh", XENMEM_sharing_op,
> +                                                       arg);
> +            }
> +            else
> +            {
> +                mso.u.bulk.start = 0;
> +                mso.u.bulk.shared = atomic_read(&cd->shr_pages);
> +            }
> +
> +            rcu_unlock_domain(cd);
> +        }
> +        break;
> +
>          case XENMEM_sharing_op_debug_gfn:
>          {
>              unsigned long gfn = mso.u.debug.u.gfn;
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 29ec571..084f06e 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>  #define XENMEM_sharing_op_debug_gref        5
>  #define XENMEM_sharing_op_add_physmap       6
>  #define XENMEM_sharing_op_audit             7
> +#define XENMEM_sharing_op_bulk_share        8
>
>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
> @@ -500,7 +501,19 @@ struct xen_mem_sharing_op {
>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>              domid_t  client_domain; /* IN: the client domain id */
> -        } share;
> +        } share;
> +        struct mem_sharing_op_bulk {         /* OP_BULK_SHARE */
> +            uint64_aligned_t start;          /* IN: start gfn. Set to 0 for
> +                                                full deduplication. Field is
> +                                                reset to 0 when hypercall
> +                                                completes */

It took me a while to figure out that this value isn't actually
intended to be used as described.  You're actually intended to always
set this to zero, and the hypervisor just uses it for scratch space.

To start with, it seems like having a "bulk share" option which could
handle just a specific range would be useful, in addition to one which
automatically dedupes the entire memory space.

Secondly, structuring the information like the other memory operations
-- for example, "nr_extents" and "nr_done" -- would make it more
consistent, and would also allow you to avoid overwriting one of
the "in" values and having to restore it when you were done.

Other than that, it looks OK.

 -George

