[PATCH 0/7] mm/gup: Unify hugetlb, speed up thp

* [PATCH 0/7] mm/gup: Unify hugetlb, speed up thp
@ 2023-06-13 21:53 Peter Xu
  2023-06-13 21:53 ` [PATCH 1/7] mm/hugetlb: Handle FOLL_DUMP well in follow_page_mask() Peter Xu
                   ` (6 more replies)
  0 siblings, 7 replies; 39+ messages in thread
From: Peter Xu @ 2023-06-13 21:53 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Matthew Wilcox, Andrea Arcangeli, John Hubbard, Mike Rapoport,
	David Hildenbrand, Vlastimil Babka, peterx, Kirill A . Shutemov,
	Andrew Morton, Mike Kravetz, James Houghton, Hugh Dickins

Hugetlb has a special path for slow gup that follow_page_mask() is actually
skipped completely along with faultin_page().  It's not only confusing, but
also duplicating a lot of logics that generic gup already has, making
hugetlb slightly special.

This patchset tries to dedup the logic, by first touching up the slow gup
code to be able to handle hugetlb pages correctly with the current follow
page and faultin routines (where we're mostly there.. due to 10 years ago
we did try to optimize thp, but half way done; more below), then at the
last patch drop the special path, then the hugetlb gup will always go the
generic routine too via faultin_page().

Note that hugetlb is still special for gup, mostly due to the pgtable
walking (hugetlb_walk()) that we rely on which is currently per-arch.  But
this is still one small step forward, and the diffstat might be a proof
too that this might be worthwhile.

Then for the "speed up thp" side: as a side effect, when I'm looking at the
chunk of code, I found that thp support is actually partially done.  It
doesn't mean that thp won't work for gup, but as long as **pages pointer
passed over, the optimization will be skipped too.  Patch 6 should address
that, so for thp we now get full speed gup.

For a quick number, "chrt -f 1 ./gup_test -m 512 -t -L -n 1024 -r 10" gives
me 13992.50us -> 378.50us.  Gup_test is an extreme case, but just to show
how it affects thp gups.

James: I hope this won't affect too much of your HGM series as this might
conflict with yours.  Logically it should even make your series smaller,
since you can drop the patch to support HGM for follow_hugetlb_page() now
after this one, but you only need to worry that if this one is proven
useful and merged first.

Patch 1-6 prepares for the switch, while patch 7 does the switch over.

I hope I didn't miss anything, and will be very happy to know.  Please have
a look, thanks.

Peter Xu (7):
  mm/hugetlb: Handle FOLL_DUMP well in follow_page_mask()
  mm/hugetlb: Fix hugetlb_follow_page_mask() on permission checks
  mm/hugetlb: Add page_mask for hugetlb_follow_page_mask()
  mm/hugetlb: Prepare hugetlb_follow_page_mask() for FOLL_PIN
  mm/gup: Cleanup next_page handling
  mm/gup: Accelerate thp gup even for "pages != NULL"
  mm/gup: Retire follow_hugetlb_page()

 include/linux/hugetlb.h |  20 +---
 mm/gup.c                |  68 +++++------
 mm/hugetlb.c            | 259 ++++------------------------------------
 3 files changed, 63 insertions(+), 284 deletions(-)

-- 
2.40.1

^ permalink raw reply	[flat|nested] 39+ messages in thread