linux-mm.kvack.org archive mirror
From: "Zach O'Keefe" <zokeefe@google.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: mm/khugepaged: collapse file/shmem compound pages
Date: Wed, 25 May 2022 18:23:52 -0700	[thread overview]
Message-ID: <CAAa6QmQhqHrKa5M4vRCAPtOa4pTet6MfELprN2Wb0rv46PSjTA@mail.gmail.com> (raw)
In-Reply-To: <Yo5+c6vFyLgjtVsG@casper.infradead.org>

On Wed, May 25, 2022 at 12:07 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, May 24, 2022 at 03:42:55PM -0700, Zach O'Keefe wrote:
> > Hey Matthew,
> >
> > I'm leading an attempt to add a new madvise mode, MADV_COLLAPSE, to
> > allow userspace-directed collapse of memory into THPs[1]. The initial
> > proposal only supports anonymous memory, but I'm
> > working on adding support for file-backed and shmem memory.
> >
> > The intended behavior of MADV_COLLAPSE is that it should return
> > "success" if all hugepage-aligned / sized regions requested are backed
> > by pmd-mapped THPs on return (races aside). IOW: we were able to
> > successfully collapse the memory, or it was already backed by
> > pmd-mapped THPs.
> >
> > Currently there is a nice "XXX: khugepaged should compact smaller
> > compound pages into a PMD sized page" in khugepaged_scan_file() when
> > we encounter a compound page during scanning. Do you know what kind of
> > gotchas or technical difficulties would be involved in doing this? I
> > presume this work would also benefit those relying on khugepaged to
> > collapse read-only file and shmem memory, and I'd be happy to help
> > move it forward.

Hey Matthew,

Thanks for taking the time!

>
> Hi Zach,
>
> Thanks for your interest, and I'd love some help on this.
>
> The khugepaged code (like much of the mm used to) assumes that memory
> comes in two sizes, PTE and PMD.  That's still true for anon and shmem
> for now, but hopefully we'll start managing both anon & shmem memory in
> larger chunks, without necessarily going as far as PMD.
>
> I think the purpose of khugepaged should continue to be to construct
> PMD-size pages; I don't see the point of it wandering through process VMs
> replacing order-2 pages with order-5 pages.  I may be wrong about that,
> of course, so feel free to argue with me.

I'd agree here.

> Anyway, the meaning behind that comment is that the PageTransCompound()
> test is going to be true on any compound page (TransCompound doesn't
> check that the page is necessarily a THP).  So that particular test should
> be folio_test_pmd_mappable(), but there are probably other things which
> ought to be changed, including converting the entire file from dealing
> in pages to dealing in folios.

Right, at this point, the page might be a pmd-mapped THP, or it could
be a pte-mapped compound page (I'm unsure whether we can encounter
compound pages here other than hugepages).

If we could tell it's already pmd-mapped, we're done :) IIUC,
folio_test_pmd_mappable() is a necessary but not sufficient condition
to determine this.

Else, if it's not, is it safe to try to continue? Suppose we find a
folio of 0 < order < HPAGE_PMD_ORDER. Can we safely try to extend it,
or would we break filesystems that expect a certain folio order?

> I actually have one patch which starts in that direction, but I haven't
> followed it up yet with all the other patches to that file which will
> be needed:

Thanks for the head start! Not an expert here, but would you say
converting this file to use folios is a necessary first step?

Again, thanks for your time,
Zach

> From a64ac45ad951557103a1040c8bcc3f229022cd26 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Date: Fri, 7 May 2021 23:40:19 -0400
> Subject: [PATCH] mm/khugepaged: Allocate folios
>
> khugepaged only wants to deal in terms of folios, so switch to
> using the folio allocation functions.  This eliminates the calls to
> prep_transhuge_page() and saves dozens of bytes of text.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/khugepaged.c | 32 ++++++++++++--------------------
>  1 file changed, 12 insertions(+), 20 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 637bfecd6bf5..ec60ee4e14c9 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -854,18 +854,20 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
>  static struct page *
>  khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
>  {
> +       struct folio *folio;
> +
>         VM_BUG_ON_PAGE(*hpage, *hpage);
>
> -       *hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
> -       if (unlikely(!*hpage)) {
> +       folio = __folio_alloc_node(gfp, HPAGE_PMD_ORDER, node);
> +       if (unlikely(!folio)) {
>                 count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
>                 *hpage = ERR_PTR(-ENOMEM);
>                 return NULL;
>         }
>
> -       prep_transhuge_page(*hpage);
>         count_vm_event(THP_COLLAPSE_ALLOC);
> -       return *hpage;
> +       *hpage = &folio->page;
> +       return &folio->page;
>  }
>  #else
>  static int khugepaged_find_target_node(void)
> @@ -873,24 +875,14 @@ static int khugepaged_find_target_node(void)
>         return 0;
>  }
>
> -static inline struct page *alloc_khugepaged_hugepage(void)
> -{
> -       struct page *page;
> -
> -       page = alloc_pages(alloc_hugepage_khugepaged_gfpmask(),
> -                          HPAGE_PMD_ORDER);
> -       if (page)
> -               prep_transhuge_page(page);
> -       return page;
> -}
> -
>  static struct page *khugepaged_alloc_hugepage(bool *wait)
>  {
> -       struct page *hpage;
> +       struct folio *folio;
>
>         do {
> -               hpage = alloc_khugepaged_hugepage();
> -               if (!hpage) {
> +               folio = folio_alloc(alloc_hugepage_khugepaged_gfpmask(),
> +                                       HPAGE_PMD_ORDER);
> +               if (!folio) {
>                         count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
>                         if (!*wait)
>                                 return NULL;
> @@ -899,9 +891,9 @@ static struct page *khugepaged_alloc_hugepage(bool *wait)
>                         khugepaged_alloc_sleep();
>                 } else
>                         count_vm_event(THP_COLLAPSE_ALLOC);
> -       } while (unlikely(!hpage) && likely(khugepaged_enabled()));
> +       } while (unlikely(!folio) && likely(khugepaged_enabled()));
>
> -       return hpage;
> +       return &folio->page;
>  }
>
>  static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
> --
> 2.34.1
>


Thread overview: 12+ messages
2022-05-24 22:42 mm/khugepaged: collapse file/shmem compound pages Zach O'Keefe
2022-05-25 19:07 ` Matthew Wilcox
2022-05-26  1:23   ` Zach O'Keefe [this message]
2022-05-26  3:36     ` Matthew Wilcox
2022-05-27  0:54       ` Zach O'Keefe
2022-05-27  3:47         ` Matthew Wilcox
2022-05-27 16:27           ` Zach O'Keefe
2022-05-28  3:48             ` Matthew Wilcox
2022-05-29 21:36               ` Zach O'Keefe
2022-05-30  1:25                 ` Rongwei Wang
2022-06-01  5:19                   ` Zach O'Keefe
2022-06-01 11:26                     ` Rongwei Wang
