From: Julien Grall <julien@xen.org>
To: Tamas K Lengyel <tamas@tklengyel.com>
Cc: "Stefano Stabellini" <sstabellini@kernel.org>,
	"Tamas K Lengyel" <tamas.lengyel@intel.com>,
	"Wei Liu" <wl@xen.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"George Dunlap" <george.dunlap@eu.citrix.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"Jan Beulich" <jbeulich@suse.com>,
	Xen-devel <xen-devel@lists.xenproject.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v2 19/20] x86/mem_sharing: reset a fork
Date: Thu, 19 Dec 2019 07:45:56 +0000
Message-ID: <18414678-5d0a-a2b4-f6ba-61464a0127c0@xen.org>
In-Reply-To: <CABfawhmzjE6c0msjpPEBTciTnCTVQGd46ovSuB_7qqXdY6BvsQ@mail.gmail.com>

Hi Tamas,

On 19/12/2019 00:15, Tamas K Lengyel wrote:
> On Wed, Dec 18, 2019 at 4:02 PM Julien Grall <julien@xen.org> wrote:
>>
>> Hi,
>>
>> On 18/12/2019 22:33, Tamas K Lengyel wrote:
>>> On Wed, Dec 18, 2019 at 3:00 PM Julien Grall <julien@xen.org> wrote:
>>>>
>>>> Hi Tamas,
>>>>
>>>> On 18/12/2019 19:40, Tamas K Lengyel wrote:
>>>>> Implement a hypercall that allows a fork to shed all memory that got allocated
>>>>> for it during its execution and reload its vCPU context from the parent VM.
>>>>> This allows the forked VM to be reset into the same state the parent VM is in,
>>>>> which is faster than creating a new fork would be. Measurements show about a 2x
>>>>> speedup during normal fuzzing operations. Performance may vary depending on how
>>>>> much memory got allocated for the forked VM. If it has been completely
>>>>> deduplicated from the parent VM then creating a new fork would likely be more
>>>>> performant.
>>>>>
>>>>> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
>>>>> ---
>>>>>     xen/arch/x86/mm/mem_sharing.c | 105 ++++++++++++++++++++++++++++++++++
>>>>>     xen/include/public/memory.h   |   1 +
>>>>>     2 files changed, 106 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
>>>>> index e93ad2ec5a..4735a334b9 100644
>>>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>>>> @@ -1622,6 +1622,87 @@ static int mem_sharing_fork(struct domain *d, struct domain *cd)
>>>>>         return 0;
>>>>>     }
>>>>>
>>>>> +struct gfn_free;
>>>>> +struct gfn_free {
>>>>> +    struct gfn_free *next;
>>>>> +    struct page_info *page;
>>>>> +    gfn_t gfn;
>>>>> +};
>>>>> +
>>>>> +static int mem_sharing_fork_reset(struct domain *d, struct domain *cd)
>>>>> +{
>>>>> +    int rc;
>>>>> +
>>>>> +    struct p2m_domain* p2m = p2m_get_hostp2m(cd);
>>>>> +    struct gfn_free *list = NULL;
>>>>> +    struct page_info *page;
>>>>> +
>>>>> +    page_list_for_each(page, &cd->page_list)
>>>>
>>>> AFAICT, your domain is not paused, so it would be possible to have pages
>>>> added/removed from that list behind your back.
>>>
>>> Well, it's not that it's not paused, it's just that I haven't added a
>>> sanity check to make sure it is. The toolstack can (and should) pause
>>> it, so that sanity check would be warranted.
>> I have only read the hypervisor part, so I didn't know what the
>> toolstack has done.
> 
> I've added the same enforced VM-paused check that is present in
> the fork hypercall handler.
> 
>>
>>>
>>>>
>>>> You also have multiple loops over the page_list in this function. Given that
>>>> the page_list can be quite big, this risks hogging the pCPU and holding an RCU
>>>> lock on the domain for the whole duration of the call.
>>>
>>> There is just one loop over page_list itself; the second loop is over
>>> the internal list being built here, which will be a subset. That list
>>> should in fact be small (in our tests usually <100 pages).
>>
>> For a start, nothing in this function tells me that there will only be
>> 100 pages. In any case, I don't think it is right to implement your
>> hypercall based only on the "normal" scenario. You should also think about
>> the "worst case" scenario.
>>
>> In this case the worst case scenario is to have hundreds of pages in page_list.
> 
> Well, this is only an experimental system that's completely disabled
> by default. I think it is fair to assume that people who make use of it
> will know what they are doing.

I assume that if you submit this new hypercall upstream then there is a 
longer-term plan to have more people use it and potentially make it 
"stable". If not, it raises the question of why this is being pushed upstream...

In any case, all the known assumptions should be documented so they can 
be fixed later, rather than forgotten until they are rediscovered via an XSA.
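
For instance (only a rough sketch, and the exact field to check is an 
assumption on my part), even a short comment plus a sanity check at the 
top of mem_sharing_fork_reset() would capture the expectations:

    /*
     * The fork must be kept paused by the toolstack for the duration of
     * the reset, and its page_list is expected to stay small: the loop
     * below is not preemptible.
     */
    if ( !atomic_read(&cd->pause_count) )
        return -EBUSY;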

> 
>>
>>> Granted, the list can grow larger, but in those cases it's likely better
>>> to just discard the fork and create a new one. So in my opinion adding
>>> a hypercall continuation to this is not needed.
>>
>> How would the caller know? What would happen if the caller ends up
>> calling this with a growing list?
> 
> The caller knows by virtue of knowing how long the VM was executed
> for. In the use case this is targeted at, the VM was executing only for
> a couple of seconds at most, and usually much less than that (we get
> about ~80 resets/s with AFL). During that time it's extremely unlikely
> you get more than ~100 pages deduplicated (that is, written to). But
> even if there are more pages, it just means the hypercall might take a
> bit longer to run for that iteration.

I assume that if you upstream the code then you want more people to use it 
(otherwise what's the point?). In this case, you will likely have people 
who have heard about the feature and want to test it but don't know the internals.

Such users need to know how this can be called safely without reading the 
implementation. In other words, some documentation for your hypercall is 
needed.
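
Something along these lines next to the new op in the public header (just 
a sketch; the exact wording is up to you) would already go a long way:

    /*
     * Reset a fork back to the state of its parent:
     *  - the fork must be paused by the caller for the duration of the op;
     *  - the op walks the fork's entire page list without preemption, so
     *    it should only be used on short-lived forks with a small memory
     *    footprint.
     */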

> I don't see any issue with not
> breaking up this hypercall with a continuation even under the worst-case
> situation though.

Xen only supports voluntary preemption, which means that a hypercall can 
only be preempted if there is code for it. Otherwise, preemption will 
mostly only happen when returning to the guest.

In other words, the vCPU executing the hypercall may go past its 
timeslice and prevent other vCPUs from running.

There are other problems with long-running hypercalls. Anyway, in short, 
if you ever want to get your code supported then you will need the 
hypercall to be broken down.
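
The shape of it is pretty standard (a rough sketch only, assuming the 
usual helpers; how the restart point is encoded and resumed would need 
more thought):

    unsigned long processed = 0;

    page_list_for_each(page, &cd->page_list)
    {
        /* Offer a preemption point every so often. */
        if ( !(++processed & 0xff) && hypercall_preempt_check() )
        {
            rc = -ERESTART;
            break;
        }
        /* ... existing per-page work ... */
    }
    /* On -ERESTART the caller creates a continuation and re-enters here. */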

> But if others feel that strongly as well about
> having a continuation for this, I don't really mind adding it.

I don't think the continuation work is going to be difficult, but if you 
want to delay it, then the minimum is to document such assumptions for 
your users.

Cheers,

-- 
Julien Grall
