Re: [PATCH v2 2/9] mm: Place unscrubbed pages at the end of pagelist

From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: tim@xen.org, sstabellini@kernel.org, wei.liu2@citrix.com,
	George.Dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
	ian.jackson@eu.citrix.com, xen-devel@lists.xen.org
Subject: Re: [PATCH v2 2/9] mm: Place unscrubbed pages at the end of pagelist
Date: Tue, 4 Apr 2017 11:14:57 -0400
Message-ID: <10dcbf5e-5c8c-f8f2-7819-6b31749b0a73@oracle.com>
In-Reply-To: <58E3CDED020000780014CB80@prv-mh.provo.novell.com>

On 04/04/2017 10:46 AM, Jan Beulich wrote:
>> @@ -897,8 +916,8 @@ static int reserve_offlined_page(struct page_info *head)
>>              {
>>              merge:
>>                  /* We don't consider merging outside the head_order. */
>> -                page_list_add_tail(cur_head, &heap(node, zone, cur_order));
>>                  PFN_ORDER(cur_head) = cur_order;
>> +                page_list_add_scrub(cur_head, node, zone, cur_order, need_scrub);
> With this re-arrangement, what's the point of also passing a
> separate order argument to the function?

No reason indeed.

>
>> @@ -933,6 +952,10 @@ static bool_t can_merge(struct page_info *buddy, unsigned int node,
>>           (phys_to_nid(page_to_maddr(buddy)) != node) )
>>          return false;
>>  
>> +    if ( need_scrub !=
>> +         !!test_bit(_PGC_need_scrub, &buddy->count_info) )
>> +        return false;
> I don't think leaving the tree in a state where larger order chunks
> don't become available for allocation right away is going to be
> acceptable. Hence with this issue being dealt with only in patch 7
> as it seems, you should state clearly and visibly that (at least)
> patches 2...7 should only be committed together.

The dirty pages are available for allocation as a result of this patch,
but they might not be merged into higher-order chunks (which is what
this check is for).
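
To illustrate what the check does, here is a minimal standalone sketch.
It is not the actual Xen code: struct page_info, its "free" and "order"
fields and the flag's bit position are simplified stand-ins. The only
point is that a buddy is a merge candidate only when its scrub state
matches that of the chunk being merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit position; the real one comes from PG_mask(1, 10). */
#define PGC_need_scrub (1UL << 0)

/* Simplified stand-in for Xen's struct page_info. */
struct page_info {
    unsigned long count_info;
    unsigned int order;
    bool free;
};

/* A buddy may be merged only if it is free, of the same order, and in
 * the same scrub state (clean vs. dirty) as the chunk being merged. */
static bool can_merge(const struct page_info *buddy, unsigned int order,
                      bool need_scrub)
{
    assert(buddy != NULL);

    if ( !buddy->free || buddy->order != order )
        return false;

    /* Keep clean and dirty chunks apart. */
    if ( need_scrub != !!(buddy->count_info & PGC_need_scrub) )
        return false;

    return true;
}
```

So a dirty order-2 chunk stays on the order-2 list, where it can still
be allocated, until a matching dirty buddy shows up or it is scrubbed.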

Patch 7 is not named well; it should be called "Keep heap available for
allocation during idle-loop scrubbing".

>
>> @@ -952,9 +977,10 @@ static struct page_info *merge_chunks(struct page_info *pg, unsigned int node,
>>          {
>>              /* Merge with predecessor block? */
>>              buddy = pg - mask;
>> -            if ( !can_merge(buddy, node, order) )
>> +            if ( !can_merge(buddy, node, order, need_scrub) )
>>                  break;
>>  
>> +            pg->count_info &= ~PGC_need_scrub;
>>              pg = buddy;
>>              page_list_del(pg, &heap(node, zone, order));
>>          }
>> @@ -962,9 +988,10 @@ static struct page_info *merge_chunks(struct page_info *pg, unsigned int node,
>>          {
>>              /* Merge with successor block? */
>>              buddy = pg + mask;
>> -            if ( !can_merge(buddy, node, order) )
>> +            if ( !can_merge(buddy, node, order, need_scrub) )
>>                  break;
>>  
>> +            buddy->count_info &= ~PGC_need_scrub;
>>              page_list_del(buddy, &heap(node, zone, order));
>>          }
> For both of these, how come you can / want to clear the need-scrub
> flag? Wouldn't it be better for each individual page to retain it, so
> when encountering a higher-order one you know which pages need
> scrubbing and which don't? Couldn't that also be used to avoid
> suppressing their merging here right away?

I am trying to avoid having to keep a dirty bit for each page, since a
buddy is either fully clean or fully dirty. That way we shouldn't need
to walk the list and clear the bit. (In fact, I suspect there may be
other state bits/fields that we could keep at the buddy head only.)

Later, in patch 5, we can safely break a buddy that is being cleaned
into clean and dirty subsets if preemption is needed.
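
A rough standalone sketch of the "flag lives on the head only" idea
(again with a simplified struct page_info and a hypothetical bit
position, and with the helper names made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit position, for illustration only. */
#define PGC_need_scrub (1UL << 0)

/* Simplified stand-in for Xen's struct page_info. */
struct page_info {
    unsigned long count_info;
    unsigned int order;   /* only meaningful on the head page of a chunk */
};

/* The scrub state of a whole chunk is read from its head page. */
static bool chunk_needs_scrub(const struct page_info *head)
{
    return !!(head->count_info & PGC_need_scrub);
}

/* Merge the chunk headed by pg with its successor of the same order.
 * The absorbed head drops its flag; the surviving head keeps it. */
static struct page_info *merge_with_successor(struct page_info *pg)
{
    struct page_info *buddy = pg + (1UL << pg->order);

    assert(buddy->order == pg->order);
    assert(chunk_needs_scrub(buddy) == chunk_needs_scrub(pg));

    buddy->count_info &= ~PGC_need_scrub;
    pg->order++;
    return pg;
}
```

After the merge only one bit describes the whole chunk, so clearing it
once the chunk has been scrubbed is a single operation on the head.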

>
>> +static void scrub_free_pages(unsigned int node)
>> +{
>> +    struct page_info *pg;
>> +    unsigned int i, zone;
>> +    int order;
> There are no negative orders.

It actually becomes negative in the loop below, and that is the loop's
exit condition.
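
A tiny standalone illustration of why the counter is signed (the
MAX_ORDER value and the function name here are arbitrary, chosen only
for the example):

```c
#include <assert.h>

#define MAX_ORDER 20   /* arbitrary value for the example */

/* Walk orders from MAX_ORDER down to 0 and count the iterations. */
static unsigned int count_orders_visited(void)
{
    unsigned int visited = 0;
    int order;   /* signed on purpose: the loop ends when it reaches -1 */

    for ( order = MAX_ORDER; order >= 0; order-- )
        visited++;

    /* With an unsigned counter, "order >= 0" would always be true and
     * decrementing from 0 would wrap around instead of ending the loop. */
    assert(order == -1);

    return visited;
}
```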

>
>> +    ASSERT(spin_is_locked(&heap_lock));
>> +
>> +    if ( !node_need_scrub[node] )
>> +        return;
>> +
>> +    for ( zone = 0; zone < NR_ZONES; zone++ )
>> +    {
>> +        for ( order = MAX_ORDER; order >= 0; order-- )
>> +        {
>> +            while ( !page_list_empty(&heap(node, zone, order)) )
>> +            {
>> +                /* Unscrubbed pages are always at the end of the list. */
>> +                pg = page_list_last(&heap(node, zone, order));
>> +                if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
>> +                    break;
>> +
>> +                for ( i = 0; i < (1UL << order); i++)
> Types of loop variable and upper bound do not match.
>
>> +                    scrub_one_page(&pg[i]);
>> +
>> +                pg->count_info &= ~PGC_need_scrub;
>> +
>> +                page_list_del(pg, &heap(node, zone, order));
>> +                (void)merge_chunks(pg, node, zone, order);
> Pointless cast.

Didn't Coverity complain about ignored return values like this one? That
was the reason I added the cast here. If not, I'll drop it.

>
>> +                node_need_scrub[node] -= (1UL << order);
> Perhaps worth returning right away if the new value is zero?

Yes.

>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -233,6 +233,10 @@ struct page_info
>>  #define PGC_count_width   PG_shift(9)
>>  #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
>>  
>> +/* Page needs to be scrubbed */
>> +#define _PGC_need_scrub   PG_shift(10)
>> +#define PGC_need_scrub    PG_mask(1, 10)
> So why not a new PGC_state_dirty instead of this independent
> flag? Pages other than PGC_state_free should never make it
> to the scrubber, so the flag is meaningless for all other
> PGC_state_*.

Wouldn't that possibly require making two checks, i.e.
page_state_is(pg, free) || page_state_is(pg, dirty)?
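
Roughly what I have in mind, as a standalone sketch. The state encoding
below is made up for the example, PGC_state_dirty does not exist today,
and page_is_allocatable() is a hypothetical helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state encoding, for illustration only. */
#define PGC_state        (3UL << 0)
#define PGC_state_inuse  (0UL << 0)
#define PGC_state_free   (1UL << 0)
#define PGC_state_dirty  (2UL << 0)   /* the proposed new state */

/* Simplified stand-in for Xen's struct page_info. */
struct page_info {
    unsigned long count_info;
};

#define page_state_is(pg, st) \
    (((pg)->count_info & PGC_state) == PGC_state_##st)

/* With a separate dirty state, "is this page on the free lists?"
 * becomes a test against two states instead of one. */
static bool page_is_allocatable(const struct page_info *pg)
{
    assert(pg != NULL);
    return page_state_is(pg, free) || page_state_is(pg, dirty);
}
```

With an independent flag, the existing page_state_is(pg, free) tests
stay as they are and only the scrubbing code looks at the extra bit.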

-boris
