From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: David Hildenbrand <david@redhat.com>, akpm@linux-foundation.org
Cc: ying.huang@intel.com, wangkefeng.wang@huawei.com,
	willy@infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, John Hubbard <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH] mm: support large folio numa balancing
Date: Tue, 14 Nov 2023 18:53:55 +0800	[thread overview]
Message-ID: <5a510d8f-2b63-4032-947a-99d1a1aab161@linux.alibaba.com> (raw)
In-Reply-To: <bce69d21-14cc-4e0b-93a2-425f40ca91ad@redhat.com>



On 11/13/2023 10:49 PM, David Hildenbrand wrote:
> On 13.11.23 13:59, Baolin Wang wrote:
>>
>>
>> On 11/13/2023 6:53 PM, David Hildenbrand wrote:
>>> On 13.11.23 11:45, Baolin Wang wrote:
>>>> Currently, file pages already support large folios, and support for
>>>> anonymous pages is also under discussion[1]. Moreover, the NUMA
>>>> balancing code was converted to use a folio by a previous thread[2],
>>>> and the migrate_pages function already supports large folio migration.
>>>>
>>>> So I do not see any reason to continue restricting NUMA balancing for
>>>> large folios.
>>>
>>> I recall John wanted to look into that. CCing him.
>>>
>>> I'll note that the "head page mapcount" heuristic to detect sharers will
>>> now strike on the PTE path and make us believe that a large folio is
>>> exclusive, although it isn't.
>>>
>>> As spelled out in the commit you are referencing:
>>>
>>> commit 6695cf68b15c215d33b8add64c33e01e3cbe236c
>>> Author: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> Date:   Thu Sep 21 15:44:14 2023 +0800
>>>
>>>       mm: memory: use a folio in do_numa_page()
>>>
>>>       Numa balancing only try to migrate non-compound page in
>>>       do_numa_page(), use a folio in it to save several compound_head
>>>       calls, note we use folio_estimated_sharers(), it is enough to
>>>       check the folio sharers since only normal page is handled, if
>>>       large folio numa balancing is supported, a precise folio sharers
>>>       check would be used, no functional change intended.
>>
>> Thanks for pointing out the part I missed.
>>
>> I saw that the migrate_pages() syscall also uses
>> folio_estimated_sharers() to check whether the folio is shared, and I
>> wonder whether that will cause any significant issues?
> 
> It's now used all over the place, in some places for making manual 
> decisions (e.g., MADV_PAGEOUT works although it shouldn't) and more and 
> more automatic places (e.g., the system ends up migrating a folio 
> although it shouldn't). The nasty thing about it is that it doesn't give 
> you "certainly exclusive" vs. "maybe shared" but "maybe exclusive" vs. 
> "certainly shared".
> 
> IIUC, the side effect could be that we migrate folios because we assume 
> they are exclusive even though they are actually shared. Right now, it's 
> sufficient to not have the first page of the folio mapped anymore for 
> that to happen.

Yes.
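
To make that failure mode concrete, here is a toy userspace sketch, not
kernel code. It assumes the estimate simply reads the mapcount of the
first page, as described above. struct toy_folio and both helpers are
hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 16

/* Hypothetical stand-in for a large folio: one mapcount per subpage. */
struct toy_folio {
	size_t nr_pages;
	int mapcount[TOY_MAX_PAGES];	/* mappings of each subpage */
};

/* Head-page heuristic: only the first subpage is consulted. */
int toy_estimated_sharers(const struct toy_folio *folio)
{
	assert(folio->nr_pages > 0 && folio->nr_pages <= TOY_MAX_PAGES);
	return folio->mapcount[0];
}

/*
 * Scan every subpage. This catches the case where only the first page
 * lost a mapping, at O(nr_pages) cost. It is still not fully precise:
 * two processes mapping disjoint subpages would go undetected.
 */
bool toy_any_page_shared(const struct toy_folio *folio)
{
	assert(folio->nr_pages > 0 && folio->nr_pages <= TOY_MAX_PAGES);
	for (size_t i = 0; i < folio->nr_pages; i++) {
		if (folio->mapcount[i] > 1)
			return true;
	}
	return false;
}
```

With mapcounts {1, 2, 2, 2} (a folio shared by two processes, one of
which has unmapped the first page), toy_estimated_sharers() returns 1,
so the folio looks exclusive, while toy_any_page_shared() returns true.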

> Anyhow, it's worth mentioning that in the commit message as long as we 
> have no better solution for that. For many cases it might be just 
> tolerable.

Agreed. The 'maybe shared' folio may affect the numa group statistics, 
which are used to accumulate the numa faults in one group to choose a 
preferred node for the tasks. For this case, it may be tolerable too, but 
I have no performance numbers now. Let me think about it.
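
As a rough illustration of why mis-accounted faults matter, the toy
sketch below accumulates faults per node and picks the node with the
most faults as the preferred one. It is a simplified userspace model
under my own assumptions, not the scheduler's actual logic; the
struct, the helpers and TOY_NR_NODES are hypothetical.

```c
#include <stddef.h>

#define TOY_NR_NODES 4

/* Hypothetical per-group fault counters, one slot per NUMA node. */
struct toy_numa_group {
	unsigned long faults[TOY_NR_NODES];
};

/* Account a hinting fault covering 'pages' pages on node 'nid'. */
void toy_account_fault(struct toy_numa_group *grp, int nid,
		       unsigned long pages)
{
	if (nid >= 0 && nid < TOY_NR_NODES)
		grp->faults[nid] += pages;
}

/* Pick the node with the most accumulated faults (lowest id on ties). */
int toy_preferred_node(const struct toy_numa_group *grp)
{
	int best = 0;

	for (int nid = 1; nid < TOY_NR_NODES; nid++) {
		if (grp->faults[nid] > grp->faults[best])
			best = nid;
	}
	return best;
}
```

In this model a single fault on a 512-page folio adds 512 to one
counter at once, so one wrongly attributed large folio can outweigh
many small-page faults when the preferred node is chosen.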

>>> I'll send WIP patches for one approach that can improve the situation
>>> soonish.
>>
>> Great. Looking forward to seeing this :)
> 
> I'm still trying to evaluate the performance hit of the additional 
> tracking ... turns out there is no such thing as free food ;)

Makes sense.
