From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Eric Ren <renzhengeek@gmail.com>, Mike Rapoport <rppt@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: Re: [PATCH v10 2/5] mm: page_isolation: check specified range for unmovable pages
Date: Tue, 12 Apr 2022 15:34:26 -0400	[thread overview]
Message-ID: <0F913CA1-4530-465F-A2CE-7CB11E96B43F@nvidia.com> (raw)
In-Reply-To: <D53C41E3-E934-4577-9286-387AF3DDBE03@nvidia.com>

On 12 Apr 2022, at 13:41, Zi Yan wrote:

> On 12 Apr 2022, at 11:06, David Hildenbrand wrote:
>
>> On 12.04.22 17:01, Zi Yan wrote:
>>> On 12 Apr 2022, at 10:49, David Hildenbrand wrote:
>>>
>>>> On 12.04.22 16:07, Zi Yan wrote:
>>>>> On 12 Apr 2022, at 9:10, David Hildenbrand wrote:
>>>>>
>>>>>> On 06.04.22 17:18, Zi Yan wrote:
>>>>>>> From: Zi Yan <ziy@nvidia.com>
>>>>>>>
>>>>>>> Enable set_migratetype_isolate() to check specified sub-range for
>>>>>>> unmovable pages during isolation. Page isolation is done
>>>>>>> at MAX_ORDER_NR_PAGES granularity, but not all pages within that
>>>>>>> granularity are intended to be isolated. For example,
>>>>>>> alloc_contig_range(), which uses page isolation, allows ranges without
>>>>>>> alignment. This commit makes the unmovable page check only look for
>>>>>>> interesting pages, so that page isolation can succeed for any
>>>>>>> non-overlapping ranges.
>>>>>>>
>>>>>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>>>>>> ---
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>  /*
>>>>>>> - * This function checks whether pageblock includes unmovable pages or not.
>>>>>>> + * This function checks whether the range [start_pfn, end_pfn) includes
>>>>>>> + * unmovable pages or not. The range must fall into a single pageblock and
>>>>>>> + * consequently belong to a single zone.
>>>>>>>   *
>>>>>>>   * PageLRU check without isolation or lru_lock could race so that
>>>>>>>   * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
>>>>>>> @@ -28,12 +30,14 @@
>>>>>>>   * cannot get removed (e.g., via memory unplug) concurrently.
>>>>>>>   *
>>>>>>>   */
>>>>>>> -static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>> -				 int migratetype, int flags)
>>>>>>> +static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
>>>>>>> +				int migratetype, int flags)
>>>>>>>  {
>>>>>>> -	unsigned long iter = 0;
>>>>>>> -	unsigned long pfn = page_to_pfn(page);
>>>>>>> -	unsigned long offset = pfn % pageblock_nr_pages;
>>>>>>> +	unsigned long pfn = start_pfn;
>>>>>>> +	struct page *page = pfn_to_page(pfn);
>>>>>>
>>>>>>
>>>>>> Just do
>>>>>>
>>>>>> struct page *page = pfn_to_page(start_pfn);
>>>>>> struct zone *zone = page_zone(page);
>>>>>>
>>>>>> here. No need to lookup the zone again in the loop because, as you
>>>>>> document "must ... belong to a single zone.".
>>>>>>
>>>>>> Then, there is also no need to initialize "pfn" here. In the loop header
>>>>>> is sufficient.
>>>>>>
>>>>>
>>>>> Sure.
>>>>>
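
Just to confirm I read the suggestion right, the top of the function would
then look roughly like this (an untested sketch):

static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
					int migratetype, int flags)
{
	/* the whole range falls into a single pageblock, hence a single zone */
	struct page *page = pfn_to_page(start_pfn);
	struct zone *zone = page_zone(page);
	unsigned long pfn;

	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));
	...
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		page = pfn_to_page(pfn);
		/* no per-pfn page_zone() lookup needed anymore */
		...
	}
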
>>>>>>> +
>>>>>>> +	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
>>>>>>> +		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));
>>>>>>>
>>>>>>>  	if (is_migrate_cma_page(page)) {
>>>>>>>  		/*
>>>>>>> @@ -47,8 +51,11 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  		return page;
>>>>>>>  	}
>>>>>>>
>>>>>>> -	for (; iter < pageblock_nr_pages - offset; iter++) {
>>>>>>> -		page = pfn_to_page(pfn + iter);
>>>>>>> +	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>>>>>>> +		struct zone *zone;
>>>>>>> +
>>>>>>> +		page = pfn_to_page(pfn);
>>>>>>> +		zone = page_zone(page);
>>>>>>>
>>>>>>>  		/*
>>>>>>>  		 * Both, bootmem allocations and memory holes are marked
>>>>>>> @@ -85,7 +92,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  			}
>>>>>>>
>>>>>>>  			skip_pages = compound_nr(head) - (page - head);
>>>>>>> -			iter += skip_pages - 1;
>>>>>>> +			pfn += skip_pages - 1;
>>>>>>>  			continue;
>>>>>>>  		}
>>>>>>>
>>>>>>> @@ -97,7 +104,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  		 */
>>>>>>>  		if (!page_ref_count(page)) {
>>>>>>>  			if (PageBuddy(page))
>>>>>>> -				iter += (1 << buddy_order(page)) - 1;
>>>>>>> +				pfn += (1 << buddy_order(page)) - 1;
>>>>>>>  			continue;
>>>>>>>  		}
>>>>>>>
>>>>>>> @@ -134,11 +141,18 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  	return NULL;
>>>>>>>  }
>>>>>>>
>>>>>>> -static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
>>>>>>> +/*
>>>>>>> + * This function sets the pageblock migratetype to MIGRATE_ISOLATE if no
>>>>>>> + * unmovable page is present in [start_pfn, end_pfn). The pageblock must
>>>>>>> + * intersect with [start_pfn, end_pfn).
>>>>>>> + */
>>>>>>> +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
>>>>>>> +			unsigned long start_pfn, unsigned long end_pfn)
>>>>>>
>>>>>> I think we might be able do better, eventually not passing start_pfn at
>>>>>> all. Hmm.
>>>>>
>>>>> IMHO, having start_pfn and end_pfn in the parameter list would make the
>>>>> interface easier to understand. Otherwise if we remove start_pfn,
>>>>> the caller needs to adjust @page to be within the range of [start_pfn,
>>>>> end_pfn).
>>>>>
>>>>>>
>>>>>> I think we want to pull out the
>>>>>> start_isolate_page_range()/undo_isolate_page_range() interface change
>>>>>> into a separate patch.
>>>>>
>>>>> You mean a patch just adding
>>>>>
>>>>> unsigned long isolate_start = pfn_max_align_down(start_pfn);
>>>>> unsigned long isolate_end = pfn_max_align_up(end_pfn);
>>>>>
>>>>> in start_isolate_page_range()/undo_isolate_page_range()?
>>>>>
>>>>> Yes I can do that.
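
For concreteness, an untested sketch of what I had in mind -- the alignment
would move from the callers into the function itself:

int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
			     int migratetype, int flags)
{
	unsigned long isolate_start = pfn_max_align_down(start_pfn);
	unsigned long isolate_end = pfn_max_align_up(end_pfn);
	...
}

with alloc_contig_range() then passing its [start, end) range directly
instead of pre-aligning the pfns itself.
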
>>>>
>>>> I think we have to be careful with memory onlining/offlining. There are
>>>> corner cases where we get called with only pageblock alignment and
>>>> must not adjust the range.
>>>
>>> In the patch below, you added a new set of start_isolate_pageblocks()
>>> and undo_isolate_pageblocks(), where start_isolate_pageblocks() still
>>> calls set_migratetype_isolate(), and you noted that their range should not
>>> be adjusted. But in my patch, set_migratetype_isolate() adjusts the range
>>> for the has_unmovable_pages() check. Do you mean
>>
>> Right, that's broken in your patch. Memory onlining/offlining behavior
>> changed recently when "vmemmap on memory" was added. The start range
>> might only be aligned to pageblocks but not MAX_ORDER - 1 -- and we must
>> not align it.
>>
>> The core thing is that there are two types of users: memory offlining
>> that knows what it's doing when it aligns to less than MAX_ORDER - 1,
>> and range allocators that just pass in the range of interest.
>
> Oh, you mean the pfn_max_align_down() and pfn_max_align_up() are wrong
> for memory onlining/offlining callers. Got it. And in patch 3, this is
> not a concern any more, since we move to pageblock_nr_pages alignment.
>
>>
>>> start_isolate_pageblocks() should call a different version of
>>> set_migratetype_isolate() without range adjustment? That can be done
>>> with an additional parameter in start_isolate_page_range(), like
>>> bool strict, right?
>>
>> Random boolean flags are in general frowned upon ;)
>
> I misunderstood the alignment issue. No need for this additional
> parameter.
>
> Thanks. Will take your patch and adjust this one based on yours.
>
> I will wait for your reviews on patch 3 and onwards before sending
> out a new version.

As I am revising my patches 2 and 3 based on your patch, start_isolate_pageblocks()
becomes redundant in patch 3: once patch 3 is in place, start_isolate_page_range()
only requires pageblock-size alignment, so start_isolate_pageblocks() and
start_isolate_page_range() do the same thing.

Since moving the MAX_ORDER-1 alignment into start_isolate_page_range() is wrong
in patch 2, I think it is better to move it in patch 3, once
start_isolate_page_range() is ready for pageblock-size alignment. Patch 2 will
then have no functional change, since the isolated range does not change, and
the range change to has_unmovable_pages() will only take effect in patch 3.
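
Roughly, after patch 3 I expect start_isolate_page_range() to look something
like this (an untested sketch, details may still change):

int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
			     int migratetype, int flags)
{
	unsigned long pfn;
	/* only pageblock alignment will be required after patch 3 */
	unsigned long isolate_start = ALIGN_DOWN(start_pfn, pageblock_nr_pages);
	unsigned long isolate_end = ALIGN(end_pfn, pageblock_nr_pages);

	for (pfn = isolate_start;
	     pfn < isolate_end;
	     pfn += pageblock_nr_pages) {
		struct page *page = __first_valid_page(pfn, pageblock_nr_pages);

		if (page && set_migratetype_isolate(page, migratetype, flags,
						    start_pfn, end_pfn)) {
			/* roll back the pageblocks isolated so far */
			undo_isolate_page_range(isolate_start, pfn, migratetype);
			return -EBUSY;
		}
	}
	return 0;
}

so that memory onlining/offlining callers, which already pass
pageblock-aligned ranges, see no change in behavior.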

If you think moving the range alignment adjustment should be done in a separate
patch, I can make it patch 4, after patch 3, once all the related functions
are ready for the new pageblock-size alignment.

--
Best Regards,
Yan, Zi
