From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Eric Ren <renzhengeek@gmail.com>, Mike Rapoport <rppt@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: Re: [PATCH v10 2/5] mm: page_isolation: check specified range for unmovable pages
Date: Tue, 12 Apr 2022 15:34:26 -0400	[thread overview]
Message-ID: <0F913CA1-4530-465F-A2CE-7CB11E96B43F@nvidia.com> (raw)
In-Reply-To: <D53C41E3-E934-4577-9286-387AF3DDBE03@nvidia.com>

On 12 Apr 2022, at 13:41, Zi Yan wrote:

> On 12 Apr 2022, at 11:06, David Hildenbrand wrote:
>
>> On 12.04.22 17:01, Zi Yan wrote:
>>> On 12 Apr 2022, at 10:49, David Hildenbrand wrote:
>>>
>>>> On 12.04.22 16:07, Zi Yan wrote:
>>>>> On 12 Apr 2022, at 9:10, David Hildenbrand wrote:
>>>>>
>>>>>> On 06.04.22 17:18, Zi Yan wrote:
>>>>>>> From: Zi Yan <ziy@nvidia.com>
>>>>>>>
>>>>>>> Enable set_migratetype_isolate() to check specified sub-range for
>>>>>>> unmovable pages during isolation. Page isolation is done
>>>>>>> at MAX_ORDER_NR_PAGES granularity, but not all pages within that
>>>>>>> granularity are intended to be isolated. For example,
>>>>>>> alloc_contig_range(), which uses page isolation, allows ranges without
>>>>>>> alignment. This commit makes the unmovable page check only look for
>>>>>>> interesting pages, so that page isolation can succeed for any
>>>>>>> non-overlapping ranges.
>>>>>>>
>>>>>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>>>>>> ---
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>  /*
>>>>>>> - * This function checks whether pageblock includes unmovable pages or not.
>>>>>>> + * This function checks whether the range [start_pfn, end_pfn) includes
>>>>>>> + * unmovable pages or not. The range must fall into a single pageblock and
>>>>>>> + * consequently belong to a single zone.
>>>>>>>   *
>>>>>>>   * PageLRU check without isolation or lru_lock could race so that
>>>>>>>   * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
>>>>>>> @@ -28,12 +30,14 @@
>>>>>>>   * cannot get removed (e.g., via memory unplug) concurrently.
>>>>>>>   *
>>>>>>>   */
>>>>>>> -static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>> -				 int migratetype, int flags)
>>>>>>> +static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
>>>>>>> +				int migratetype, int flags)
>>>>>>>  {
>>>>>>> -	unsigned long iter = 0;
>>>>>>> -	unsigned long pfn = page_to_pfn(page);
>>>>>>> -	unsigned long offset = pfn % pageblock_nr_pages;
>>>>>>> +	unsigned long pfn = start_pfn;
>>>>>>> +	struct page *page = pfn_to_page(pfn);
>>>>>>
>>>>>>
>>>>>> Just do
>>>>>>
>>>>>> struct page *page = pfn_to_page(start_pfn);
>>>>>> struct zone *zone = page_zone(page);
>>>>>>
>>>>>> here. No need to lookup the zone again in the loop because, as you
>>>>>> document "must ... belong to a single zone.".
>>>>>>
>>>>>> Then, there is also no need to initialize "pfn" here. In the loop header
>>>>>> is sufficient.
>>>>>>
>>>>>
>>>>> Sure.
>>>>>
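
Just to confirm I read the suggestion right, the top of the function would
then look roughly like this (an untested sketch):

static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
					int migratetype, int flags)
{
	/* the whole range falls into a single pageblock, hence a single zone */
	struct page *page = pfn_to_page(start_pfn);
	struct zone *zone = page_zone(page);
	unsigned long pfn;

	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));
	...
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		page = pfn_to_page(pfn);
		/* no per-pfn page_zone() lookup needed anymore */
		...
	}
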
>>>>>>> +
>>>>>>> +	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
>>>>>>> +		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));
>>>>>>>
>>>>>>>  	if (is_migrate_cma_page(page)) {
>>>>>>>  		/*
>>>>>>> @@ -47,8 +51,11 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  		return page;
>>>>>>>  	}
>>>>>>>
>>>>>>> -	for (; iter < pageblock_nr_pages - offset; iter++) {
>>>>>>> -		page = pfn_to_page(pfn + iter);
>>>>>>> +	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>>>>>>> +		struct zone *zone;
>>>>>>> +
>>>>>>> +		page = pfn_to_page(pfn);
>>>>>>> +		zone = page_zone(page);
>>>>>>>
>>>>>>>  		/*
>>>>>>>  		 * Both, bootmem allocations and memory holes are marked
>>>>>>> @@ -85,7 +92,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  			}
>>>>>>>
>>>>>>>  			skip_pages = compound_nr(head) - (page - head);
>>>>>>> -			iter += skip_pages - 1;
>>>>>>> +			pfn += skip_pages - 1;
>>>>>>>  			continue;
>>>>>>>  		}
>>>>>>>
>>>>>>> @@ -97,7 +104,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  		 */
>>>>>>>  		if (!page_ref_count(page)) {
>>>>>>>  			if (PageBuddy(page))
>>>>>>> -				iter += (1 << buddy_order(page)) - 1;
>>>>>>> +				pfn += (1 << buddy_order(page)) - 1;
>>>>>>>  			continue;
>>>>>>>  		}
>>>>>>>
>>>>>>> @@ -134,11 +141,18 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>>>  	return NULL;
>>>>>>>  }
>>>>>>>
>>>>>>> -static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
>>>>>>> +/*
>>>>>>> + * This function sets the pageblock migratetype to MIGRATE_ISOLATE if no
>>>>>>> + * unmovable page is present in [start_pfn, end_pfn). The pageblock must
>>>>>>> + * intersect with [start_pfn, end_pfn).
>>>>>>> + */
>>>>>>> +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
>>>>>>> +			unsigned long start_pfn, unsigned long end_pfn)
>>>>>>
>>>>>> I think we might be able do better, eventually not passing start_pfn at
>>>>>> all. Hmm.
>>>>>
>>>>> IMHO, having start_pfn and end_pfn in the parameter list would make the
>>>>> interface easier to understand. Otherwise if we remove start_pfn,
>>>>> the caller needs to adjust @page to be within the range of [start_pfn,
>>>>> end_pfn).
>>>>>
>>>>>>
>>>>>> I think we want to pull out the
>>>>>> start_isolate_page_range()/undo_isolate_page_range() interface change
>>>>>> into a separate patch.
>>>>>
>>>>> You mean a patch just adding
>>>>>
>>>>> unsigned long isolate_start = pfn_max_align_down(start_pfn);
>>>>> unsigned long isolate_end = pfn_max_align_up(end_pfn);
>>>>>
>>>>> in start_isolate_page_range()/undo_isolate_page_range()?
>>>>>
>>>>> Yes I can do that.
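
For concreteness, an untested sketch of what I had in mind -- the alignment
would move from the callers into the function itself:

int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
			     int migratetype, int flags)
{
	unsigned long isolate_start = pfn_max_align_down(start_pfn);
	unsigned long isolate_end = pfn_max_align_up(end_pfn);
	...
}

with alloc_contig_range() then passing its [start, end) range directly
instead of pre-aligning the pfns itself.
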
>>>>
>>>> I think we have to be careful with memory onlining/offlining. There are
>>>> corner cases where we get called with only pageblock alignment and
>>>> must not adjust the range.
>>>
>>> In the patch below, you added a new set of start_isolate_pageblocks()
>>> and undo_isolate_pageblocks(), where start_isolate_pageblocks() still
>>> calls set_migratetype_isolate(), and you noted that their range should not
>>> be adjusted. But in my patch, set_migratetype_isolate() adjusts the range
>>> for the has_unmovable_pages() check. Do you mean
>>
>> Right, that's broken in your patch. Memory onlining/offlining behavior
>> changed recently when "vmemmap on memory" was added. The start range
>> might only be aligned to pageblocks but not MAX_ORDER - 1 -- and we must
>> not align it.
>>
>> The core thing is that there are two types of users: memory offlining
>> that knows what it's doing when it aligns to less than MAX_ORDER - 1,
>> and range allocators that just pass in the range of interest.
>
> Oh, you mean the pfn_max_align_down() and pfn_max_align_up() are wrong
> for memory onlining/offlining callers. Got it. And in patch 3, this is
> not a concern any more, since we move to pageblock_nr_pages alignment.
>
>>
>>> start_isolate_pageblocks() should call a different version of
>>> set_migratetype_isolate() without range adjustment? That can be done
>>> with an additional parameter in start_isolate_page_range(), like
>>> bool strict, right?
>>
>> Random boolean flags are in general frowned upon ;)
>
> I misunderstood the alignment issue. No need for this additional
> parameter.
>
> Thanks. Will take your patch and adjust this one based on yours.
>
> I will wait for your reviews on patch 3 and onwards before sending
> out a new version.

As I am revising my patches 2 and 3 based on your patch, start_isolate_pageblocks()
becomes redundant in patch 3: once patch 3 is in place, start_isolate_page_range()
only requires pageblock-size alignment, so start_isolate_pageblocks() and
start_isolate_page_range() do the same thing.

Since moving the MAX_ORDER-1 alignment into start_isolate_page_range() is wrong
in patch 2, I think it is better to move it in patch 3, once
start_isolate_page_range() is ready for pageblock-size alignment. Patch 2 will
then have no functional change, since the isolated range does not change, and
the range change to has_unmovable_pages() will only take effect in patch 3.
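
Roughly, after patch 3 I expect start_isolate_page_range() to look something
like this (an untested sketch, details may still change):

int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
			     int migratetype, int flags)
{
	unsigned long pfn;
	/* only pageblock alignment will be required after patch 3 */
	unsigned long isolate_start = ALIGN_DOWN(start_pfn, pageblock_nr_pages);
	unsigned long isolate_end = ALIGN(end_pfn, pageblock_nr_pages);

	for (pfn = isolate_start;
	     pfn < isolate_end;
	     pfn += pageblock_nr_pages) {
		struct page *page = __first_valid_page(pfn, pageblock_nr_pages);

		if (page && set_migratetype_isolate(page, migratetype, flags,
						    start_pfn, end_pfn)) {
			/* roll back the pageblocks isolated so far */
			undo_isolate_page_range(isolate_start, pfn, migratetype);
			return -EBUSY;
		}
	}
	return 0;
}

so that memory onlining/offlining callers, which already pass
pageblock-aligned ranges, see no change in behavior.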

If you think moving the range alignment adjustment should be done in a separate
patch, I can make it patch 4, after patch 3, once all the related functions
are ready for the new pageblock-size alignment.

--
Best Regards,
Yan, Zi
