From: David Rientjes <rientjes@google.com>
To: Peter Xu <peterx@redhat.com>, Zach O'Keefe <zokeefe@google.com>,
	SeongJae Park <sj@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>,
	David Hildenbrand <david@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Yang Shi <shy828301@gmail.com>, Zi Yan <ziy@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: split thp synchronously on MADV_DONTNEED
Date: Mon, 24 Jan 2022 10:48:55 -0800 (PST)
Message-ID: <e55d1f78-46c2-9ced-d7ad-c6deba4cb7b8@google.com>
In-Reply-To: <YaBevbuNuR+ToJ1o@xz-m1.local>

On Fri, 26 Nov 2021, Peter Xu wrote:

> Some side notes: I dug out the old MADV_COLLAPSE proposal right after I
> thought about MADV_SPLIT (or any of its variants):
> 
> https://lore.kernel.org/all/d098c392-273a-36a4-1a29-59731cdf5d3d@google.com/
> 
> My memory was that there's some issue to be solved so that was blocked, however
> when I read the thread it sounds like the list was mostly reaching a consensus
> on considering MADV_COLLAPSE being beneficial.  Still copying DavidR in case I
> missed something important.
> 
> If we think MADV_COLLAPSE can help to implement a userspace (and more
> importantly, data-aware) khugepaged, then MADV_SPLIT can be the other side of
> kcompactd, perhaps.
> 
> That's probably a bit off topic of this specific discussion on the specific use
> case, but so far it seems all reasonable and discussable.
> 

Hi Peter,

Providing a (late) update since we now have some better traction on this:
I think we'll be ready to post an RFC soon that introduces MADV_COLLAPSE.
The work is being driven by Zach, now cc'd.

Let's also include SeongJae Park <sj@kernel.org> and keep him in the
loop, since DAMON could easily be extended with a DAMOS_COLLAPSE action
to use MADV_COLLAPSE for hot regions of memory.

Idea for initial approach:

 - MADV_COLLAPSE core code, based on the proposal you cite above, with anon
   memory as the inaugural support: collapse memory into thp in process
   context

 - Batching support to collapse ranges of memory into multiple THP

 - Wire this up for madvise(2) (and process_madvise(2))

 - Enlightenment for file-backed thp

I think Zach's RFC will cover the first three; it could be debated whether
the initial patch series *must* support file-backed thp.  We'll see based
on the feedback to the RFC.
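
For illustration only, here is a rough sketch of how userspace might drive
the batching and madvise(2) items above once they exist.  Nothing below is
taken from the RFC: the fallback value for MADV_COLLAPSE is a placeholder,
the 2MB HPAGE_SIZE assumes PMD-sized thp on x86-64, and the helper names
thp_extents() and collapse_range() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: placeholder advice value for when the headers lack one. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Assumption: 2MB PMD-sized thp, as on x86-64 with 4KB base pages. */
#define HPAGE_SIZE ((size_t)2 << 20)

static_assert((HPAGE_SIZE & (HPAGE_SIZE - 1)) == 0,
	      "HPAGE_SIZE must be a power of two");

/*
 * Count the hugepage-aligned, hugepage-sized extents that lie fully
 * inside [addr, addr + len).  The address of the first one is stored
 * in *first when first is non-NULL.
 */
size_t thp_extents(uintptr_t addr, size_t len, uintptr_t *first)
{
	uintptr_t mask = ~(uintptr_t)(HPAGE_SIZE - 1);
	uintptr_t start = (addr + HPAGE_SIZE - 1) & mask; /* round up */
	uintptr_t end = (addr + len) & mask;               /* round down */

	if (first)
		*first = start;
	return end > start ? (end - start) / HPAGE_SIZE : 0;
}

/*
 * Ask the kernel to collapse each aligned extent in the range, one
 * madvise() call per extent.  Returns how many calls succeeded; a
 * failed call leaves the reason in errno (e.g. EINVAL on a kernel
 * that does not know the advice value).
 */
size_t collapse_range(void *addr, size_t len)
{
	uintptr_t start;
	size_t n = thp_extents((uintptr_t)addr, len, &start);
	size_t ok = 0;

	for (size_t i = 0; i < n; i++) {
		void *p = (void *)(start + i * HPAGE_SIZE);

		if (madvise(p, HPAGE_SIZE, MADV_COLLAPSE) == 0)
			ok++;
	}
	return ok;
}
```

A caller would pass an anon mapping and its length to collapse_range() and
compare the return value with thp_extents() to see how many extents were
actually collapsed.  Issuing one call per extent is just the simplest way to
show per-extent results; a real batching interface could take the whole
range in a single call.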

There's also an extension where MADV_COLLAPSE could be potentially useful 
for hugetlb backed memory.  We have another effort underway that we've 
been talking with Mike Kravetz about that allows hugetlb memory to be 
mapped at multiple levels of the page tables.  There are several use cases 
but one of the driving factors is the performance of post-copy live 
migration; in this case, you'd be able to send smaller sized pages over 
the wire rather than, say, a 1GB gigantic page.

In this case, MADV_COLLAPSE could be useful to map smaller pages by 
a larger page table entry before all of the smaller pages have been live 
migrated.

That said, we have not invested time into an MADV_SPLIT yet.

Do you (or anybody else) have concerns about this approach?  Ideas for 
extensions?

Thanks!
