From: Julien Grall <julien.grall@arm.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel@lists.xenproject.org
Subject: Re: [PATCH for-4.12 v2 15/17] xen/arm: p2m: Add support for preemption in p2m_cache_flush_range
Date: Tue, 11 Dec 2018 16:11:26 +0000
Message-ID: <bb791d1e-9238-158c-93bb-d8202b68a60d@arm.com>
In-Reply-To: <alpine.DEB.2.10.1812071311200.18779@sstabellini-ThinkPad-X260>

Hi,

On 07/12/2018 22:11, Stefano Stabellini wrote:
> On Fri, 7 Dec 2018, Julien Grall wrote:
>>>> @@ -1547,6 +1551,25 @@ int p2m_cache_flush_range(struct domain *d, gfn_t
>>>> start, gfn_t end)
>>>>          while ( gfn_x(start) < gfn_x(end) )
>>>>        {
>>>> +       /*
>>>> +         * Cleaning the cache for the P2M may take a long time. So we
>>>> +         * need to be able to preempt. We will arbitrarily preempt every
>>>> +         * time count reaches 512 or above.
>>>> +         *
>>>> +         * The count will be incremented by:
>>>> +         *  - 1 on region skipped
>>>> +         *  - 10 for each page requiring a flush
>>>
>>> Why this choice? A page flush should cost much more than 10x a region
>>> skipped, more like 100x or 1000x. In fact, doing the full loop without
>>> calling flush_page_to_ram should be cheap and fast, right?
>>
>> It is cheaper than a flush of the page, but it still has a cost. You have to
>> walk the stage-2 page tables in software, which requires mapping the tables.
>> Because not all of the memory is mapped in the hypervisor on arm32, this will
>> require a map/unmap operation. On arm64, the full memory is mapped so far, so
>> the map/unmap is pretty much a NOP.
> 
> Good point, actually the cost of an "empty" iteration is significantly
> different on arm32 and arm64.
> 
> 
>>> I would:
>>>
>>> - not increase count on region skipped at all
>>> - increase it by 1 on each page requiring a flush
>>> - set the limit lower, if we go with your proposal it would be about 50,
>>>     I am not sure what the limit should be though
>> I don't think you can avoid incrementing count on region skipped. While one
>> lookup is pretty cheap, all the lookups for holes added together may amount to
>> a pretty long time.
> 
> Thinking of the arm32 case where map/unmap need to be done, you are
> right.
> 
> 
>> Even if stage-2 mappings are handled by the hypervisor, the guest is still
>> somewhat in control of them because it can balloon pages in/out. The operation
>> may result in shattering stage-2 mappings.
>>
>> It would be feasible for a guest to shatter 1GB of memory into 4KB mappings in
>> stage-2 entries and then remove all the entries. This means the stage-2 would
>> contain 262144 holes, and hence 262144 iterations. So no matter how cheap each
>> iteration is, the resulting time spent without preemption is going to be quite
>> significant.
> 
> OK
> 
> 
>> The choice of the numbers 1 vs 10 is pretty much arbitrary. The question is
>> how often we want to check for pending softirqs. The check itself is pretty
>> much trivial, yet preempting has a cost. With the current solution, we check
>> for preemption every 512 holes or every 51 pages flushed (~204KB flushed).
>>
>> This sounds ok to me. Feel free to suggest better numbers.
> 
> One suggestion is that we might want to treat the arm32 case differently
> from the arm64 case, given the different cost of mapping/unmapping pages
> during the walk. Would it be fair to say that if an arm64 empty
> iteration is worth "1" unit of work, then an arm32 empty iteration is
> worth at least "2" units of work? Or more? My gut feeling is that it is
> more like:

I don't want to treat arm32 and arm64 differently. That would open up a 
security hole in Xen if we ever decide to separate the domain heap and xen heap 
on arm64.

> 
> empty arm64:       1
> empty arm32:       5
> flush arm32/arm64: 10
> 
> Or even:
> 
> empty arm64:       1
> empty arm32:       10
> flush arm32/arm64: 20

Bear in mind that in the flush case on arm32, you also need to map/unmap the 
page. So this is more like 10/30 here.

> 
> and the overall limits for checks could be 512 or 1024 like you
> suggested.

What I can suggest is:
	empty: 1
	flush: 3

Limit: 120

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
