linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Zach O'Keefe" <zokeefe@google.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-api@vger.kernel.org,
	 Axel Rasmussen <axelrasmussen@google.com>,
	James Houghton <jthoughton@google.com>,
	 Hugh Dickins <hughd@google.com>, Yang Shi <shy828301@gmail.com>,
	 Miaohe Lin <linmiaohe@huawei.com>,
	David Hildenbrand <david@redhat.com>,
	 David Rientjes <rientjes@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Pasha Tatashin <pasha.tatashin@soleen.com>,
	Peter Xu <peterx@redhat.com>,
	 Rongwei Wang <rongwei.wang@linux.alibaba.com>,
	SeongJae Park <sj@kernel.org>,  Song Liu <songliubraving@fb.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	 Chris Kennelly <ckennelly@google.com>,
	 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Minchan Kim <minchan@kernel.org>,
	 Patrick Xia <patrickx@google.com>,
	"Zach O'Keefe" <zokeefe@google.com>
Subject: [PATCH mm-unstable 0/9] mm: add file/shmem support to MADV_COLLAPSE
Date: Thu, 11 Aug 2022 18:28:34 -0700	[thread overview]
Message-ID: <20220812012843.3948330-1-zokeefe@google.com> (raw)

This series builds on top of the previous "mm: userspace hugepage collapse"
series which introduced the MADV_COLLAPSE madvise mode and added support
for private, anonymous mappings[1], by adding support for file and shmem
backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.

File and shmem support have been added with effort to align with existing
MADV_COLLAPSE semantics and policy decisions[2].  Collapse of shmem-backed
memory ignores kernel-guiding directives and heuristics including all
sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
options (shmem always supports large folios).  Like anonymous mappings, on
successful return of MADV_COLLAPSE on file/shmem memory, the contents of
memory mapped by the addresses provided will be synchronously pmd-mapped
THPs.

This functionality unlocks two important uses:

(1)	Immediately back executable text by THPs.  Current support provided
	by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
	system which might impair services from serving at their full rated
	load after (re)starting.  Tricks like mremap(2)'ing text onto
	anonymous memory to immediately realize iTLB performance prevents
	page sharing and demand paging, both of which increase steady state
	memory footprint.  Now, we can have the best of both worlds: Peak
	upfront performance and lower RAM footprints.

(2)	userfaultfd-based live migration of virtual machines satisfy UFFD
	faults by fetching native-sized pages over the network (to avoid
	latency of transferring an entire hugepage).  However, after guest
	memory has been fully copied to the new host, MADV_COLLAPSE can
	be used to immediately increase guest performance.

khugepaged has received a small improvement by association and can now
detect and collapse pte-mapped THPs.  However, there is still work to be
done along the file collapse path.  Compound pages of arbitrary order still
needs to be supported and THP collapse needs to be converted to using
folios in general.  Eventually, we'd like to move away from the read-only
and executable-mapped constraints currently imposed on eligible files and
support any inode claiming huge folio support.  That said, I think the
series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
memory.

Patches 1-3	Implement the guts of the series.
Patch 4 	Is a tracepoint for debugging.
Patches 5-8 	Refactor existing khugepaged selftests to work with new
		memory types.
Patch 9 	Adds a userfaultfd selftest mode to mimic a functional test
		of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.

Applies against mm-unstable.

[1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
[2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/


Zach O'Keefe (9):
  mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
  mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
    pmds
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  selftests/vm: dedup THP helpers
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory

 include/linux/khugepaged.h                    |  11 +-
 include/linux/shmem_fs.h                      |  10 +-
 include/trace/events/huge_memory.h            |  36 +
 kernel/events/uprobes.c                       |   2 +-
 mm/huge_memory.c                              |   2 +-
 mm/khugepaged.c                               | 285 ++++--
 mm/shmem.c                                    |  18 +-
 tools/testing/selftests/vm/Makefile           |   2 +
 tools/testing/selftests/vm/khugepaged.c       | 828 ++++++++++++------
 tools/testing/selftests/vm/soft-dirty.c       |   2 +-
 .../selftests/vm/split_huge_page_test.c       |  12 +-
 tools/testing/selftests/vm/userfaultfd.c      | 171 +++-
 tools/testing/selftests/vm/vm_util.c          |  36 +-
 tools/testing/selftests/vm/vm_util.h          |   5 +-
 14 files changed, 1034 insertions(+), 386 deletions(-)

-- 
2.37.1.559.g78731f0fdb-goog



             reply	other threads:[~2022-08-12  1:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-12  1:28 Zach O'Keefe [this message]
2022-08-12  1:28 ` [PATCH mm-unstable 1/9] mm/shmem: add flag to enforce shmem THP in hugepage_vma_check() Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 2/9] mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by pmds Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 3/9] mm/madvise: add file and shmem support to MADV_COLLAPSE Zach O'Keefe
2022-08-15  2:23   ` kernel test robot
2022-08-15  5:41   ` kernel test robot
2022-08-15 17:09     ` Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 4/9] mm/khugepaged: add tracepoint to hpage_collapse_scan_file() Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 5/9] selftests/vm: dedup THP helpers Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 6/9] selftests/vm: modularize thp collapse memory operations Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 7/9] selftests/vm: add thp collapse file and tmpfs testing Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 8/9] selftests/vm: add thp collapse shmem testing Zach O'Keefe
2022-08-12  1:28 ` [PATCH mm-unstable 9/9] selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory Zach O'Keefe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220812012843.3948330-1-zokeefe@google.com \
    --to=zokeefe@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=ckennelly@google.com \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jthoughton@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=patrickx@google.com \
    --cc=peterx@redhat.com \
    --cc=rientjes@google.com \
    --cc=rongwei.wang@linux.alibaba.com \
    --cc=shy828301@gmail.com \
    --cc=sj@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).