From: "Zach O'Keefe" <zokeefe@google.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-api@vger.kernel.org,
Axel Rasmussen <axelrasmussen@google.com>,
James Houghton <jthoughton@google.com>,
Hugh Dickins <hughd@google.com>, Yang Shi <shy828301@gmail.com>,
Miaohe Lin <linmiaohe@huawei.com>,
David Hildenbrand <david@redhat.com>,
David Rientjes <rientjes@google.com>,
Matthew Wilcox <willy@infradead.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Peter Xu <peterx@redhat.com>,
Rongwei Wang <rongwei.wang@linux.alibaba.com>,
SeongJae Park <sj@kernel.org>, Song Liu <songliubraving@fb.com>,
Vlastimil Babka <vbabka@suse.cz>,
Chris Kennelly <ckennelly@google.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Minchan Kim <minchan@kernel.org>,
Patrick Xia <patrickx@google.com>,
"Zach O'Keefe" <zokeefe@google.com>
Date: Fri, 26 Aug 2022 15:03:19 -0700
Message-ID: <20220826220329.1495407-1-zokeefe@google.com>
Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE
v2 Foreword
Mostly a RESEND: rebase on latest mm-unstable + minor bug fixes from
kernel test robot.
--------------------------------
This series builds on top of the previous "mm: userspace hugepage collapse"
series which introduced the MADV_COLLAPSE madvise mode and added support
for private, anonymous mappings[1], by adding support for file and shmem
backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.
File and shmem support has been added with an effort to align with existing
MADV_COLLAPSE semantics and policy decisions[2]. Collapse of shmem-backed
memory ignores kernel-guiding directives and heuristics including all
sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
options (shmem always supports large folios). Like anonymous mappings, on
successful return of MADV_COLLAPSE on file/shmem memory, the contents of
memory mapped by the addresses provided will be synchronously pmd-mapped
THPs.
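A minimal sketch of how userspace might request such a collapse, assuming
a kernel that implements MADV_COLLAPSE. The helper name try_collapse() is
hypothetical, and the fallback definition of MADV_COLLAPSE (25, the
asm-generic uapi value) is only there for libc headers that predate it.
The same call is intended to work whether the range is anonymous, shmem,
or file-backed:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* asm-generic uapi value; older libc headers lack it */
#endif

/*
 * Hypothetical helper (not part of the series): ask the kernel to
 * synchronously collapse [addr, addr + len) into THPs.
 * Returns 0 on success or a negative errno value on failure, for example
 * -EINVAL on kernels that do not know MADV_COLLAPSE.
 */
static int try_collapse(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);

	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;
	return -errno;
}
```

A caller would typically treat a non-zero return as "keep running on
native-sized pages" rather than as a fatal error, since collapse is
best-effort.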
This functionality unlocks two important uses:
(1) Immediately back executable text by THPs. Current support provided
by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
system which might impair services from serving at their full rated
load after (re)starting. Tricks like mremap(2)'ing text onto
anonymous memory to immediately realize iTLB performance prevent
page sharing and demand paging, both of which increase steady state
memory footprint. Now, we can have the best of both worlds: Peak
upfront performance and lower RAM footprints.
(2) userfaultfd-based live migration of virtual machines satisfies UFFD
faults by fetching native-sized pages over the network (to avoid
latency of transferring an entire hugepage). However, after guest
memory has been fully copied to the new host, MADV_COLLAPSE can
be used to immediately increase guest performance.
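In both uses the caller has a byte range (a text segment, or a region of
guest memory) that is not necessarily hugepage-aligned. The sketch below
assumes that only hugepage-aligned, hugepage-sized regions fully inside
the range are candidates for collapse, and computes that sub-range. The
names struct hrange and hugepage_subrange() are hypothetical, and the
hugepage size is passed in by the caller (2 MiB is typical on x86-64, but
that is an assumption, not something this series specifies):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the hugepage-aligned part of a byte range. */
struct hrange {
	uintptr_t start; /* first hugepage-aligned address in the range */
	size_t len;      /* multiple of hpage_size, or 0 if nothing fits */
};

/*
 * Round the start of [addr, addr + len) up and its end down to a
 * hugepage boundary. hpage_size must be a power of two.
 */
static struct hrange hugepage_subrange(uintptr_t addr, size_t len,
				       size_t hpage_size)
{
	assert(hpage_size != 0 && (hpage_size & (hpage_size - 1)) == 0);

	uintptr_t mask = (uintptr_t)hpage_size - 1;
	uintptr_t hstart = (addr + mask) & ~mask;  /* round start up */
	uintptr_t hend = (addr + len) & ~mask;     /* round end down */
	struct hrange r = { hstart, hend > hstart ? hend - hstart : 0 };

	return r;
}
```

The resulting start and len could then be handed to madvise(2) with
MADV_COLLAPSE; a len of 0 means the range does not contain a whole
hugepage and there is nothing to collapse.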
khugepaged has received a small improvement by association and can now
detect and collapse pte-mapped THPs. However, there is still work to be
done along the file collapse path. Compound pages of arbitrary order still
need to be supported and THP collapse needs to be converted to using
folios in general. Eventually, we'd like to move away from the read-only
and executable-mapped constraints currently imposed on eligible files and
support any inode claiming huge folio support. That said, I think the
series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
memory.
Patches 1-3 Implement the guts of the series.
Patch 4 Is a tracepoint for debugging.
Patches 5-8 Refactor existing khugepaged selftests to work with new
memory types.
Patch 9 Adds a userfaultfd selftest mode to mimic a functional test
of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.
Applies against mm-unstable.
[1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
[2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/
v1 -> v2:
- Add missing definition for khugepaged_add_pte_mapped_thp() in
!CONFIG_SHMEM builds, in "mm/khugepaged: attempt to map
file/shmem-backed pte-mapped THPs by pmds"
- Minor bugfixes in "mm/madvise: add file and shmem support to
MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and some
compiler settings.
- Rebased on latest mm-unstable
Zach O'Keefe (9):
mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
pmds
mm/madvise: add file and shmem support to MADV_COLLAPSE
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
selftests/vm: dedup THP helpers
selftests/vm: modularize thp collapse memory operations
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: add thp collapse shmem testing
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
include/linux/khugepaged.h | 13 +-
include/linux/shmem_fs.h | 10 +-
include/trace/events/huge_memory.h | 36 +
kernel/events/uprobes.c | 2 +-
mm/huge_memory.c | 2 +-
mm/khugepaged.c | 289 ++++--
mm/shmem.c | 18 +-
tools/testing/selftests/vm/Makefile | 2 +
tools/testing/selftests/vm/khugepaged.c | 828 ++++++++++++------
tools/testing/selftests/vm/soft-dirty.c | 2 +-
.../selftests/vm/split_huge_page_test.c | 12 +-
tools/testing/selftests/vm/userfaultfd.c | 171 +++-
tools/testing/selftests/vm/vm_util.c | 36 +-
tools/testing/selftests/vm/vm_util.h | 5 +-
14 files changed, 1040 insertions(+), 386 deletions(-)
--
2.37.2.672.g94769d06f0-goog