linux-mm.kvack.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	David Hildenbrand <david@redhat.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Mina Almasry <almasrymina@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 1/3] mm: enable MADV_DONTNEED for hugetlb mappings
Date: Fri, 11 Feb 2022 10:28:49 +0800
Message-ID: <YgXJ4VjYBJC9ZfbF@xz-m1.local>
In-Reply-To: <bf1f7a47-5d57-492a-03dd-e42afe186d47@oracle.com>

On Thu, Feb 10, 2022 at 01:36:57PM -0800, Mike Kravetz wrote:
> > Another use case of DONTNEED upon hugetlbfs could be uffd-minor, because afaiu
> > this is the only api that can force strip the hugetlb mapped pgtable without
> > losing pagecache data.
> 
> Correct.  However, I do not know if uffd-minor users would ever want to
> do this.  Perhaps?

My understanding is that, before this patch, uffd-minor on hugetlbfs requires
the huge file to be mapped twice: one mapping to populate the content, and the
other to trap MINOR faults.  Or we could munmap() the range and map it again at
the same file offset to drop the pgtables, I think, but that sounds tricky.
MINOR faults only work once the pgtables are dropped.

With DONTNEED upon hugetlbfs we can rely on one single mapping of the file,
because we can explicitly drop the pgtables of hugetlbfs files without any
other tricks.
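
To make the single-mapping idea concrete, here is a minimal userspace sketch,
not code from the patch.  The helper name dontneed_keeps_file_data() is made up
for illustration.  It writes through one shared memfd mapping, calls
madvise(MADV_DONTNEED), and checks that the data is still there when the range
is faulted back in.  Passing MFD_HUGETLB would exercise hugetlbfs, but that
assumes a kernel with this patch applied, reserved huge pages, and a length
that is a multiple of the huge page size; with flags == 0 it uses plain shmem
as a stand-in.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U	/* for callers on older libc headers */
#endif

/*
 * Hypothetical helper: returns 1 if data written through a single shared
 * mapping is still readable after MADV_DONTNEED, 0 if it is not, and -1 if
 * any setup step or the madvise() call itself fails.
 */
int dontneed_keeps_file_data(unsigned int memfd_flags, size_t len)
{
	unsigned char *p;
	int fd, ok = 1;
	size_t i;

	assert(len > 0);

	fd = memfd_create("dontneed-demo", memfd_flags);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}

	/* The one and only mapping of the file. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping keeps the file alive */
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xab, len);	/* populate the page cache */

	/* Drop the page table entries; the file pages should remain. */
	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return -1;
	}

	/* Re-access: each read faults the existing page back in. */
	for (i = 0; i < len; i++) {
		if (p[i] != 0xab) {
			ok = 0;
			break;
		}
	}

	munmap(p, len);
	return ok;
}
```

With a uffd registered in MINOR mode on the range, those re-access faults are
what one would expect to be trapped; that part is left out of the sketch.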

However, I have no real use case for it.  Initially I thought it could be
useful for QEMU, because the QEMU migration routine runs in the same mm context
as the hypervisor, so by default it doesn't have two mappings of the same guest
memory.  If QEMU wants to leverage minor faults, DONTNEED could help.

However, when I measured bitmap transfer some months ago (assuming that's what
minor faults could help with in QEMU's postcopy), I found it was not nearly as
slow as I had thought.  Either I missed something, or we're facing a different
problem from the one uffd-minor addressed when Axel first proposed it.

This is probably getting too off topic, though.  Let me go back.

That said, one thing I'm not sure about with DONTNEED on hugetlb is whether it
would further abuse DONTNEED, as the original POSIX definition is as simple as:

  The application expects that it will not access the specified address range
  in the near future.

Linux implements it by tearing down the pgtables, which looks okay so far.  It
is a bit odder to apply it to hugetlbfs: by definition it's a hint to page
reclaim, but hugetlbfs is neither a target of page reclaim nor LRU-aware.  It
goes further, toward something like a MADV_ZAP-style syscall.
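
For reference, the gap between the hint and what Linux actually does is easy to
see on private anonymous memory, where the teardown is destructive.  The sketch
below uses a made-up helper name, anon_dontneed_reads_zero(), and only shows
the documented madvise(2) behavior: after MADV_DONTNEED, the next access to a
private anonymous range sees zero-filled pages.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: returns 1 if a private anonymous range reads back as
 * all zeroes after MADV_DONTNEED, 0 if old data is still visible, and -1 if
 * mmap() or madvise() fails.
 */
int anon_dontneed_reads_zero(size_t len)
{
	unsigned char *p;
	int all_zero = 1;
	size_t i;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xab, len);	/* dirty every page */

	/* Zap the range; there is no backing file to refill it from. */
	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return -1;
	}

	/* Each read now faults in a fresh zero-filled page. */
	for (i = 0; i < len; i++) {
		if (p[i] != 0) {
			all_zero = 0;
			break;
		}
	}

	munmap(p, len);
	return all_zero;
}
```

So the call already behaves like an explicit zap rather than a hint, regardless
of whether reclaim would ever have looked at the range.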

I think it could still be fine, since POSIX doesn't define the behavior
specifically for hugetlb and so Linux can define it, but I'm not sure whether
there are other implications.

Thanks,

-- 
Peter Xu


