[PATCH 0/3] mm/uffd: Fix missing markers on hugetlb

* [PATCH 0/3] mm/uffd: Fix missing markers on hugetlb
@ 2023-01-04 22:52 Peter Xu
  2023-01-04 22:52 ` [PATCH 1/3] mm/hugetlb: Pre-allocate pgtable pages for uffd wr-protects Peter Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Peter Xu @ 2023-01-04 22:52 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Mike Kravetz, Muchun Song, peterx, Nadav Amit, Andrea Arcangeli,
	David Hildenbrand, James Houghton, Axel Rasmussen, Andrew Morton

When James was developing the vma split fix for hugetlb pmd sharing, he
found that hugetlb uffd-wp is broken with the test case he developed [1]:

https://lore.kernel.org/r/CADrL8HWSym93=yNpTUdWebOEzUOTR2ffbfUk04XdK6O+PNJNoA@mail.gmail.com

Missing hugetlb pgtable pages caused uffd-wp to lose message when vma split
happens to be across a shared huge pmd range in the test.

The issue is pgtable pre-allocation on hugetlb path was overlooked.  That
was fixed in patch 1.

Meanwhile there's another issue on proper reporting of pgtable allocation
failures during UFFDIO_WRITEPROTECT.  When pgtable allocation failed during
the ioctl(UFFDIO_WRITEPROTECT), we will silent the error so the user cannot
detect it (even if extremely rare).  This issue can happen not only on
hugetlb but also shmem.  Anon is not affected because anon doesn't require
pgtable allocation during wr-protection.  Patch 2 prepares for such a
change, then patch 3 allows the error to be reported to the users.

This set only marks patch 1 to copy stable, because it's a real bug to be
fixed for all kernels 5.19+.

Patch 2-3 will be an enhancement to process pgtable allocation errors, it
should hardly be hit even during heavy workloads in the past of my tests,
but it should make the interface clearer.  Not copying stable for patch 2-3
due to that.  I'll prepare a man page update after patch 2-3 lands.

Tested with:

  - James's reproducer above [1] so it'll start to pass with the vma split
    fix:
    https://lore.kernel.org/r/20230101230042.244286-1-jthoughton@google.com
  - Faked memory pressures to make sure -ENOMEM returned with either shmem
    and hugetlbfs
  - Some uffd general routines

Peter Xu (3):
  mm/hugetlb: Pre-allocate pgtable pages for uffd wr-protects
  mm/mprotect: Use long for page accountings and retval
  mm/uffd: Detect pgtable allocation failures

 include/linux/hugetlb.h       |  4 +-
 include/linux/mm.h            |  2 +-
 include/linux/userfaultfd_k.h |  2 +-
 mm/hugetlb.c                  | 21 +++++++--
 mm/mempolicy.c                |  4 +-
 mm/mprotect.c                 | 89 ++++++++++++++++++++++-------------
 mm/userfaultfd.c              | 16 +++++--
 7 files changed, 88 insertions(+), 50 deletions(-)

-- 
2.37.3

^ permalink raw reply	[flat|nested] 21+ messages in thread