linux-mm.kvack.org archive mirror
* [PATCH 00/75] MM folio patches for 5.18
@ 2022-02-04 19:57 Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 01/75] mm/gup: Increment the page refcount before the pincount Matthew Wilcox (Oracle)
                   ` (76 more replies)
  0 siblings, 77 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Whole series available through git, and shortly in linux-next:
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/for-next
or git://git.infradead.org/users/willy/pagecache.git for-next

The first few patches should look familiar to most; these are converting
the GUP code to folios (and a few other things).  Most are well-reviewed,
but I did have to make significant changes to a few patches to accommodate
John's recent bugfix, so I dropped the R-b from them.

After the GUP changes, I started working on vmscan, trying to convert
all of shrink_page_list() to use a folio.  The pages it works on are
folios by definition (since they're chained through ->lru and ->lru
occupies the same bytes of memory as ->compound_head, so they can't be
tail pages).  This is a ridiculously large function, and I'm only part
of the way through it.  I have, however, finished converting rmap_walk()
and friends to take a folio instead of a page.
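
For readers less familiar with the struct page layout, here is a heavily
trimmed sketch of why anything chained through ->lru can never be a tail
page.  This is not the real definition from include/linux/mm_types.h
(which has many more union members); it keeps only the two members that
matter for the argument above:

struct list_head { struct list_head *next, *prev; };	/* local stand-in */

struct page_sketch {			/* simplified struct page */
	unsigned long flags;
	union {
		struct {		/* anon/file pages on an LRU list */
			/* lru.next occupies the same word as compound_head */
			struct list_head lru;
		};
		struct {		/* tail pages of a compound page */
			/* bit 0 set marks a tail page; the rest points
			 * to the head page */
			unsigned long compound_head;
		};
	};
};

/*
 * list_head pointers are word-aligned, so a page linked through ->lru
 * has bit 0 of the overlapping compound_head word clear -- it cannot be
 * a tail page, and it is therefore safe to treat it as a folio.
 */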

Midway through, there's a short detour to fix up page_vma_mapped_walk to
work on an explicit PFN range instead of a page.  I had been intending to
convert that to use a folio, but with page_mapped_in_vma() really just
wanting to know about one page (even if it's a head page) and Muchun
wanting to walk pageless memory, making all the users use PFNs just
seemed like the right thing to do.
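
To make the "explicit PFN range" concrete, here is a rough before/after
sketch of the walker's inputs.  The field names follow my reading of the
later patch ("mm: Convert page_vma_mapped_walk to work on PFNs"), so
treat them as illustrative rather than as the final API:

struct page;				/* opaque here */
struct vm_area_struct;			/* opaque here */
typedef unsigned long pgoff_t;		/* as in the kernel */

struct pvmw_before {			/* keyed on a struct page */
	struct page *page;
	struct vm_area_struct *vma;
	unsigned long address;
	/* ... pmd, pte, ptl, flags ... */
};

struct pvmw_after {			/* keyed on a PFN range */
	unsigned long pfn;		/* first PFN of interest */
	unsigned long nr_pages;		/* 1 for page_mapped_in_vma() */
	pgoff_t pgoff;			/* offset in the mapping */
	struct vm_area_struct *vma;
	unsigned long address;
	/* ... pmd, pte, ptl, flags ... */
};

Callers that still have a page or folio can fill in pfn/nr_pages/pgoff
from it (that's what the pvmw_set_page()/pvmw_set_folio() helpers added
earlier in the series are for), while a pageless-memory walker can pass
a bare PFN range.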

The last 9 patches actually start adding large folios to the page cache.
This is where I expect the most trouble, but they've been stable in my
testing for a while.

Matthew Wilcox (Oracle) (74):
  mm/gup: Increment the page refcount before the pincount
  mm/gup: Remove for_each_compound_range()
  mm/gup: Remove for_each_compound_head()
  mm/gup: Change the calling convention for compound_range_next()
  mm/gup: Optimise compound_range_next()
  mm/gup: Change the calling convention for compound_next()
  mm/gup: Fix some contiguous memmap assumptions
  mm/gup: Remove an assumption of a contiguous memmap
  mm/gup: Handle page split race more efficiently
  mm/gup: Remove hpage_pincount_add()
  mm/gup: Remove hpage_pincount_sub()
  mm: Make compound_pincount always available
  mm: Add folio_pincount_ptr()
  mm: Turn page_maybe_dma_pinned() into folio_maybe_dma_pinned()
  mm/gup: Add try_get_folio() and try_grab_folio()
  mm/gup: Convert try_grab_page() to use a folio
  mm: Remove page_cache_add_speculative() and
    page_cache_get_speculative()
  mm/gup: Add gup_put_folio()
  mm/hugetlb: Use try_grab_folio() instead of try_grab_compound_head()
  mm/gup: Convert gup_pte_range() to use a folio
  mm/gup: Convert gup_hugepte() to use a folio
  mm/gup: Convert gup_huge_pmd() to use a folio
  mm/gup: Convert gup_huge_pud() to use a folio
  mm/gup: Convert gup_huge_pgd() to use a folio
  mm/gup: Turn compound_next() into gup_folio_next()
  mm/gup: Turn compound_range_next() into gup_folio_range_next()
  mm: Turn isolate_lru_page() into folio_isolate_lru()
  mm/gup: Convert check_and_migrate_movable_pages() to use a folio
  mm/workingset: Convert workingset_eviction() to take a folio
  mm/memcg: Convert mem_cgroup_swapout() to take a folio
  mm: Add lru_to_folio()
  mm: Turn putback_lru_page() into folio_putback_lru()
  mm/vmscan: Convert __remove_mapping() to take a folio
  mm/vmscan: Turn page_check_dirty_writeback() into
    folio_check_dirty_writeback()
  mm: Turn head_compound_mapcount() into folio_entire_mapcount()
  mm: Add folio_mapcount()
  mm: Add split_folio_to_list()
  mm: Add folio_is_zone_device() and folio_is_device_private()
  mm: Add folio_pgoff()
  mm: Add pvmw_set_page() and pvmw_set_folio()
  hexagon: Add pmd_pfn()
  mm: Convert page_vma_mapped_walk to work on PFNs
  mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio
  mm/rmap: Use a folio in page_mkclean_one()
  mm/rmap: Turn page_referenced() into folio_referenced()
  mm/mlock: Turn clear_page_mlock() into folio_end_mlock()
  mm/mlock: Turn mlock_vma_page() into mlock_vma_folio()
  mm/rmap: Turn page_mlock() into folio_mlock()
  mm/mlock: Turn munlock_vma_page() into munlock_vma_folio()
  mm/huge_memory: Convert __split_huge_pmd() to take a folio
  mm/rmap: Convert try_to_unmap() to take a folio
  mm/rmap: Convert try_to_migrate() to folios
  mm/rmap: Convert make_device_exclusive_range() to use folios
  mm/migrate: Convert remove_migration_ptes() to folios
  mm/damon: Convert damon_pa_mkold() to use a folio
  mm/damon: Convert damon_pa_young() to use a folio
  mm/rmap: Turn page_lock_anon_vma_read() into
    folio_lock_anon_vma_read()
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Convert rmap_walk() to take a folio
  mm/rmap: Constify the rmap_walk_control argument
  mm/vmscan: Free non-shmem folios without splitting them
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Account large folios correctly
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Convert pageout() to take a folio
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/filemap: Allow large folios to be added to the page cache
  mm: Fix READ_ONLY_THP warning
  mm: Make large folios depend on THP
  mm: Support arbitrary THP sizes
  mm/readahead: Add large folio readahead
  mm/readahead: Switch to page_cache_ra_order
  mm/filemap: Support VM_HUGEPAGE for file mappings
  selftests/vm/transhuge-stress: Support file-backed PMD folios

William Kucharski (1):
  mm/readahead: Align file mappings for non-DAX

 Documentation/core-api/pin_user_pages.rst     |  18 +-
 arch/hexagon/include/asm/pgtable.h            |   3 +-
 arch/powerpc/include/asm/mmu_context.h        |   1 -
 include/linux/huge_mm.h                       |  59 +--
 include/linux/hugetlb.h                       |   5 +
 include/linux/ksm.h                           |   6 +-
 include/linux/mm.h                            | 145 +++---
 include/linux/mm_types.h                      |   7 +-
 include/linux/pagemap.h                       |  32 +-
 include/linux/rmap.h                          |  50 ++-
 include/linux/swap.h                          |   6 +-
 include/trace/events/vmscan.h                 |  10 +-
 kernel/events/uprobes.c                       |   2 +-
 mm/damon/paddr.c                              |  52 ++-
 mm/debug.c                                    |  18 +-
 mm/filemap.c                                  |  59 ++-
 mm/folio-compat.c                             |  34 ++
 mm/gup.c                                      | 383 +++++++---------
 mm/huge_memory.c                              | 127 +++---
 mm/hugetlb.c                                  |   7 +-
 mm/internal.h                                 |  52 ++-
 mm/ksm.c                                      |  17 +-
 mm/memcontrol.c                               |  22 +-
 mm/memory-failure.c                           |  10 +-
 mm/memory_hotplug.c                           |  13 +-
 mm/migrate.c                                  |  90 ++--
 mm/mlock.c                                    | 136 +++---
 mm/page_alloc.c                               |   3 +-
 mm/page_idle.c                                |  26 +-
 mm/page_vma_mapped.c                          |  58 ++-
 mm/readahead.c                                | 108 ++++-
 mm/rmap.c                                     | 416 +++++++++---------
 mm/util.c                                     |  36 +-
 mm/vmscan.c                                   | 280 ++++++------
 mm/workingset.c                               |  25 +-
 tools/testing/selftests/vm/transhuge-stress.c |  35 +-
 36 files changed, 1270 insertions(+), 1081 deletions(-)

-- 
2.34.1




* [PATCH 01/75] mm/gup: Increment the page refcount before the pincount
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 21:13   ` John Hubbard
  2022-02-07  7:45   ` Christoph Hellwig
  2022-02-04 19:57 ` [PATCH 02/75] mm/gup: Remove for_each_compound_range() Matthew Wilcox (Oracle)
                   ` (75 subsequent siblings)
  76 siblings, 2 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We should always increase the refcount before doing anything else to
the page so that other page users see the elevated refcount first.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/gup.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a9d4d724aef7..08020987dfc0 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -220,18 +220,18 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
 		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
 			return false;
 
-		if (hpage_pincount_available(page))
-			hpage_pincount_add(page, 1);
-		else
-			refs = GUP_PIN_COUNTING_BIAS;
-
 		/*
 		 * Similar to try_grab_compound_head(): even if using the
 		 * hpage_pincount_add/_sub() routines, be sure to
 		 * *also* increment the normal page refcount field at least
 		 * once, so that the page really is pinned.
 		 */
-		page_ref_add(page, refs);
+		if (hpage_pincount_available(page)) {
+			page_ref_add(page, 1);
+			hpage_pincount_add(page, 1);
+		} else {
+			page_ref_add(page, GUP_PIN_COUNTING_BIAS);
+		}
 
 		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
 	}
-- 
2.34.1




* [PATCH 02/75] mm/gup: Remove for_each_compound_range()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 01/75] mm/gup: Increment the page refcount before the pincount Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 03/75] mm/gup: Remove for_each_compound_head() Matthew Wilcox (Oracle)
                   ` (74 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

This macro doesn't simplify the users; it's easier to just call
compound_range_next() inside the loop.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 08020987dfc0..dc00e46fae5a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -261,9 +261,6 @@ static inline void compound_range_next(unsigned long i, unsigned long npages,
 	struct page *next, *page;
 	unsigned int nr = 1;
 
-	if (i >= npages)
-		return;
-
 	next = *list + i;
 	page = compound_head(next);
 	if (PageCompound(page) && compound_order(page) >= 1)
@@ -274,12 +271,6 @@ static inline void compound_range_next(unsigned long i, unsigned long npages,
 	*ntails = nr;
 }
 
-#define for_each_compound_range(__i, __list, __npages, __head, __ntails) \
-	for (__i = 0, \
-	     compound_range_next(__i, __npages, __list, &(__head), &(__ntails)); \
-	     __i < __npages; __i += __ntails, \
-	     compound_range_next(__i, __npages, __list, &(__head), &(__ntails)))
-
 static inline void compound_next(unsigned long i, unsigned long npages,
 				 struct page **list, struct page **head,
 				 unsigned int *ntails)
@@ -396,7 +387,8 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
 	struct page *head;
 	unsigned int ntails;
 
-	for_each_compound_range(index, &page, npages, head, ntails) {
+	for (index = 0; index < npages; index += ntails) {
+		compound_range_next(index, npages, &page, &head, &ntails);
 		if (make_dirty && !PageDirty(head))
 			set_page_dirty_lock(head);
 		put_compound_head(head, ntails, FOLL_PIN);
-- 
2.34.1




* [PATCH 03/75] mm/gup: Remove for_each_compound_head()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 01/75] mm/gup: Increment the page refcount before the pincount Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 02/75] mm/gup: Remove for_each_compound_range() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 04/75] mm/gup: Change the calling convention for compound_range_next() Matthew Wilcox (Oracle)
                   ` (73 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

This macro doesn't simplify the users; it's easier to just call
compound_next() inside a standard loop.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index dc00e46fae5a..facadcaedea3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -278,9 +278,6 @@ static inline void compound_next(unsigned long i, unsigned long npages,
 	struct page *page;
 	unsigned int nr;
 
-	if (i >= npages)
-		return;
-
 	page = compound_head(list[i]);
 	for (nr = i + 1; nr < npages; nr++) {
 		if (compound_head(list[nr]) != page)
@@ -291,12 +288,6 @@ static inline void compound_next(unsigned long i, unsigned long npages,
 	*ntails = nr - i;
 }
 
-#define for_each_compound_head(__i, __list, __npages, __head, __ntails) \
-	for (__i = 0, \
-	     compound_next(__i, __npages, __list, &(__head), &(__ntails)); \
-	     __i < __npages; __i += __ntails, \
-	     compound_next(__i, __npages, __list, &(__head), &(__ntails)))
-
 /**
  * unpin_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
  * @pages:  array of pages to be maybe marked dirty, and definitely released.
@@ -331,7 +322,8 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 		return;
 	}
 
-	for_each_compound_head(index, pages, npages, head, ntails) {
+	for (index = 0; index < npages; index += ntails) {
+		compound_next(index, npages, pages, &head, &ntails);
 		/*
 		 * Checking PageDirty at this point may race with
 		 * clear_page_dirty_for_io(), but that's OK. Two key
@@ -419,8 +411,10 @@ void unpin_user_pages(struct page **pages, unsigned long npages)
 	if (WARN_ON(IS_ERR_VALUE(npages)))
 		return;
 
-	for_each_compound_head(index, pages, npages, head, ntails)
+	for (index = 0; index < npages; index += ntails) {
+		compound_next(index, npages, pages, &head, &ntails);
 		put_compound_head(head, ntails, FOLL_PIN);
+	}
 }
 EXPORT_SYMBOL(unpin_user_pages);
 
-- 
2.34.1




* [PATCH 04/75] mm/gup: Change the calling convention for compound_range_next()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 03/75] mm/gup: Remove for_each_compound_head() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-07  7:45   ` Christoph Hellwig
  2022-02-04 19:57 ` [PATCH 05/75] mm/gup: Optimise compound_range_next() Matthew Wilcox (Oracle)
                   ` (72 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, John Hubbard, Jason Gunthorpe, William Kucharski

Return the head page instead of storing it to a passed parameter.
Pass the start page directly instead of passing a pointer to it.
Reorder the arguments to match the calling function's arguments.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index facadcaedea3..26c73998c6df 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -254,21 +254,20 @@ void unpin_user_page(struct page *page)
 }
 EXPORT_SYMBOL(unpin_user_page);
 
-static inline void compound_range_next(unsigned long i, unsigned long npages,
-				       struct page **list, struct page **head,
-				       unsigned int *ntails)
+static inline struct page *compound_range_next(struct page *start,
+		unsigned long npages, unsigned long i, unsigned int *ntails)
 {
 	struct page *next, *page;
 	unsigned int nr = 1;
 
-	next = *list + i;
+	next = start + i;
 	page = compound_head(next);
 	if (PageCompound(page) && compound_order(page) >= 1)
 		nr = min_t(unsigned int,
 			   page + compound_nr(page) - next, npages - i);
 
-	*head = page;
 	*ntails = nr;
+	return page;
 }
 
 static inline void compound_next(unsigned long i, unsigned long npages,
@@ -380,7 +379,7 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
 	unsigned int ntails;
 
 	for (index = 0; index < npages; index += ntails) {
-		compound_range_next(index, npages, &page, &head, &ntails);
+		head = compound_range_next(page, npages, index, &ntails);
 		if (make_dirty && !PageDirty(head))
 			set_page_dirty_lock(head);
 		put_compound_head(head, ntails, FOLL_PIN);
-- 
2.34.1




* [PATCH 05/75] mm/gup: Optimise compound_range_next()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 04/75] mm/gup: Change the calling convention for compound_range_next() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 06/75] mm/gup: Change the calling convention for compound_next() Matthew Wilcox (Oracle)
                   ` (71 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

By definition, a compound page has an order >= 1, so the second half
of the test was redundant.  Also, this cannot be a tail page since
it's the result of calling compound_head(), so use PageHead() instead
of PageCompound().
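
As a reminder of how those predicates relate, here is a simplified model
(not the real helpers from include/linux/page-flags.h, which also handle
the compound_head() indirection and per-flag policies):

#include <stdbool.h>

struct page_model {
	bool head;		/* PG_head: first page of a compound page */
	bool tail;		/* bit 0 of compound_head: a tail page */
	unsigned int order;	/* compound_order(); >= 1 iff compound */
};

static bool PageHead_model(const struct page_model *p) { return p->head; }
static bool PageTail_model(const struct page_model *p) { return p->tail; }
static bool PageCompound_model(const struct page_model *p)
{
	return p->head || p->tail;	/* compound == head or tail */
}

/*
 * compound_range_next() tests the result of compound_head(), which is
 * never a tail page, so PageCompound() collapses to PageHead() there,
 * and any compound page has order >= 1, so the order check was moot.
 */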

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index 26c73998c6df..75a0a1fd4c2a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -262,7 +262,7 @@ static inline struct page *compound_range_next(struct page *start,
 
 	next = start + i;
 	page = compound_head(next);
-	if (PageCompound(page) && compound_order(page) >= 1)
+	if (PageHead(page))
 		nr = min_t(unsigned int,
 			   page + compound_nr(page) - next, npages - i);
 
-- 
2.34.1




* [PATCH 06/75] mm/gup: Change the calling convention for compound_next()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 05/75] mm/gup: Optimise compound_range_next() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 07/75] mm/gup: Fix some contiguous memmap assumptions Matthew Wilcox (Oracle)
                   ` (70 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Return the head page instead of storing it to a passed parameter.
Reorder the arguments to match the calling function's arguments.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 75a0a1fd4c2a..7e4bdae83e9b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -270,9 +270,8 @@ static inline struct page *compound_range_next(struct page *start,
 	return page;
 }
 
-static inline void compound_next(unsigned long i, unsigned long npages,
-				 struct page **list, struct page **head,
-				 unsigned int *ntails)
+static inline struct page *compound_next(struct page **list,
+		unsigned long npages, unsigned long i, unsigned int *ntails)
 {
 	struct page *page;
 	unsigned int nr;
@@ -283,8 +282,8 @@ static inline void compound_next(unsigned long i, unsigned long npages,
 			break;
 	}
 
-	*head = page;
 	*ntails = nr - i;
+	return page;
 }
 
 /**
@@ -322,7 +321,7 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 	}
 
 	for (index = 0; index < npages; index += ntails) {
-		compound_next(index, npages, pages, &head, &ntails);
+		head = compound_next(pages, npages, index, &ntails);
 		/*
 		 * Checking PageDirty at this point may race with
 		 * clear_page_dirty_for_io(), but that's OK. Two key
@@ -411,7 +410,7 @@ void unpin_user_pages(struct page **pages, unsigned long npages)
 		return;
 
 	for (index = 0; index < npages; index += ntails) {
-		compound_next(index, npages, pages, &head, &ntails);
+		head = compound_next(pages, npages, index, &ntails);
 		put_compound_head(head, ntails, FOLL_PIN);
 	}
 }
-- 
2.34.1




* [PATCH 07/75] mm/gup: Fix some contiguous memmap assumptions
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 06/75] mm/gup: Change the calling convention for compound_next() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 08/75] mm/gup: Remove an assumption of a contiguous memmap Matthew Wilcox (Oracle)
                   ` (69 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Several functions in gup.c assume that a compound page has virtually
contiguous page structs.  This isn't true for SPARSEMEM configs unless
SPARSEMEM_VMEMMAP is also set.  Fix them by using nth_page() instead of
plain pointer arithmetic.
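
For reference, nth_page() already hides the non-contiguity; it is
essentially defined in include/linux/mm.h as below (the same definitions
appear as context in the next patch's diff):

#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
/* page structs may live in separate, non-adjacent sections */
#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
#else
/* the memmap is virtually contiguous; plain pointer arithmetic works */
#define nth_page(page,n) ((page) + (n))
#endif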

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 7e4bdae83e9b..29a8021f10a2 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -260,7 +260,7 @@ static inline struct page *compound_range_next(struct page *start,
 	struct page *next, *page;
 	unsigned int nr = 1;
 
-	next = start + i;
+	next = nth_page(start, i);
 	page = compound_head(next);
 	if (PageHead(page))
 		nr = min_t(unsigned int,
@@ -2462,8 +2462,8 @@ static int record_subpages(struct page *page, unsigned long addr,
 {
 	int nr;
 
-	for (nr = 0; addr != end; addr += PAGE_SIZE)
-		pages[nr++] = page++;
+	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
+		pages[nr] = nth_page(page, nr);
 
 	return nr;
 }
@@ -2498,7 +2498,7 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 
 	head = pte_page(pte);
-	page = head + ((addr & (sz-1)) >> PAGE_SHIFT);
+	page = nth_page(head, (addr & (sz - 1)) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
 	head = try_grab_compound_head(head, refs, flags);
@@ -2558,7 +2558,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 					     pages, nr);
 	}
 
-	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	page = nth_page(pmd_page(orig), (addr & ~PMD_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
 	head = try_grab_compound_head(pmd_page(orig), refs, flags);
@@ -2592,7 +2592,7 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 					     pages, nr);
 	}
 
-	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+	page = nth_page(pud_page(orig), (addr & ~PUD_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
 	head = try_grab_compound_head(pud_page(orig), refs, flags);
@@ -2621,7 +2621,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 
 	BUILD_BUG_ON(pgd_devmap(orig));
 
-	page = pgd_page(orig) + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT);
+	page = nth_page(pgd_page(orig), (addr & ~PGDIR_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
 	head = try_grab_compound_head(pgd_page(orig), refs, flags);
-- 
2.34.1




* [PATCH 08/75] mm/gup: Remove an assumption of a contiguous memmap
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 07/75] mm/gup: Fix some contiguous memmap assumptions Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 09/75] mm/gup: Handle page split race more efficiently Matthew Wilcox (Oracle)
                   ` (68 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

This assumption needs the inverse of nth_page(), which is temporarily
named page_nth() until it's renamed later in this series.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 include/linux/mm.h | 2 ++
 mm/gup.c           | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 213cc569b192..e679a7d66200 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -216,8 +216,10 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
+#define page_nth(head, tail)	(page_to_pfn(tail) - page_to_pfn(head))
 #else
 #define nth_page(page,n) ((page) + (n))
+#define page_nth(head, tail)	((tail) - (head))
 #endif
 
 /* to align the pointer to the (next) page boundary */
diff --git a/mm/gup.c b/mm/gup.c
index 29a8021f10a2..fa75b71820a2 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -263,8 +263,8 @@ static inline struct page *compound_range_next(struct page *start,
 	next = nth_page(start, i);
 	page = compound_head(next);
 	if (PageHead(page))
-		nr = min_t(unsigned int,
-			   page + compound_nr(page) - next, npages - i);
+		nr = min_t(unsigned int, npages - i,
+			   compound_nr(page) - page_nth(page, next));
 
 	*ntails = nr;
 	return page;
-- 
2.34.1




* [PATCH 09/75] mm/gup: Handle page split race more efficiently
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 08/75] mm/gup: Remove an assumption of a contiguous memmap Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 10/75] mm/gup: Remove hpage_pincount_add() Matthew Wilcox (Oracle)
                   ` (67 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

If we hit the page split race, the current code returns NULL, which will
presumably trigger a retry under the mmap_lock.  This isn't necessary;
we can just retry the compound_head() lookup.  This is a very minor
optimisation of an unlikely path, but conceptually it matches (eg)
the page cache RCU-protected lookup.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index fa75b71820a2..923a0d44203c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -68,7 +68,10 @@ static void put_page_refs(struct page *page, int refs)
  */
 static inline struct page *try_get_compound_head(struct page *page, int refs)
 {
-	struct page *head = compound_head(page);
+	struct page *head;
+
+retry:
+	head = compound_head(page);
 
 	if (WARN_ON_ONCE(page_ref_count(head) < 0))
 		return NULL;
@@ -86,7 +89,7 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)
 	 */
 	if (unlikely(compound_head(page) != head)) {
 		put_page_refs(head, refs);
-		return NULL;
+		goto retry;
 	}
 
 	return head;
-- 
2.34.1




* [PATCH 10/75] mm/gup: Remove hpage_pincount_add()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 09/75] mm/gup: Handle page split race more efficiently Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 21:29   ` John Hubbard
  2022-02-07  7:46   ` Christoph Hellwig
  2022-02-04 19:57 ` [PATCH 11/75] mm/gup: Remove hpage_pincount_sub() Matthew Wilcox (Oracle)
                   ` (66 subsequent siblings)
  76 siblings, 2 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

It's clearer to call atomic_add() in the callers; the assertions clearly
can't fire there because they're part of the condition for calling
atomic_add().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/gup.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 923a0d44203c..60168a09d52a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,14 +29,6 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
-static void hpage_pincount_add(struct page *page, int refs)
-{
-	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
-	VM_BUG_ON_PAGE(page != compound_head(page), page);
-
-	atomic_add(refs, compound_pincount_ptr(page));
-}
-
 static void hpage_pincount_sub(struct page *page, int refs)
 {
 	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
@@ -151,17 +143,17 @@ __maybe_unused struct page *try_grab_compound_head(struct page *page,
 			return NULL;
 
 		/*
-		 * When pinning a compound page of order > 1 (which is what
-		 * hpage_pincount_available() checks for), use an exact count to
-		 * track it, via hpage_pincount_add/_sub().
+		 * When pinning a compound page of order > 1 (which is
+		 * what hpage_pincount_available() checks for), use an
+		 * exact count to track it.
 		 *
-		 * However, be sure to *also* increment the normal page refcount
-		 * field at least once, so that the page really is pinned.
-		 * That's why the refcount from the earlier
+		 * However, be sure to *also* increment the normal page
+		 * refcount field at least once, so that the page really
+		 * is pinned.  That's why the refcount from the earlier
 		 * try_get_compound_head() is left intact.
 		 */
 		if (hpage_pincount_available(page))
-			hpage_pincount_add(page, refs);
+			atomic_add(refs, compound_pincount_ptr(page));
 		else
 			page_ref_add(page, refs * (GUP_PIN_COUNTING_BIAS - 1));
 
@@ -216,22 +208,19 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
 	if (flags & FOLL_GET)
 		return try_get_page(page);
 	else if (flags & FOLL_PIN) {
-		int refs = 1;
-
 		page = compound_head(page);
 
 		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
 			return false;
 
 		/*
-		 * Similar to try_grab_compound_head(): even if using the
-		 * hpage_pincount_add/_sub() routines, be sure to
-		 * *also* increment the normal page refcount field at least
-		 * once, so that the page really is pinned.
+		 * Similar to try_grab_compound_head(): be sure to *also*
+		 * increment the normal page refcount field at least once,
+		 * so that the page really is pinned.
 		 */
 		if (hpage_pincount_available(page)) {
 			page_ref_add(page, 1);
-			hpage_pincount_add(page, 1);
+			atomic_add(1, compound_pincount_ptr(page));
 		} else {
 			page_ref_add(page, GUP_PIN_COUNTING_BIAS);
 		}
-- 
2.34.1




* [PATCH 11/75] mm/gup: Remove hpage_pincount_sub()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 10/75] mm/gup: Remove hpage_pincount_add() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 12/75] mm: Make compound_pincount always available Matthew Wilcox (Oracle)
                   ` (65 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Move the assertion (and correct it to be a cheaper variant),
and inline the atomic_sub() operation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 60168a09d52a..af623a139995 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,14 +29,6 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
-static void hpage_pincount_sub(struct page *page, int refs)
-{
-	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
-	VM_BUG_ON_PAGE(page != compound_head(page), page);
-
-	atomic_sub(refs, compound_pincount_ptr(page));
-}
-
 /* Equivalent to calling put_page() @refs times. */
 static void put_page_refs(struct page *page, int refs)
 {
@@ -169,12 +161,13 @@ __maybe_unused struct page *try_grab_compound_head(struct page *page,
 
 static void put_compound_head(struct page *page, int refs, unsigned int flags)
 {
+	VM_BUG_ON_PAGE(PageTail(page), page);
+
 	if (flags & FOLL_PIN) {
 		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED,
 				    refs);
-
 		if (hpage_pincount_available(page))
-			hpage_pincount_sub(page, refs);
+			atomic_sub(refs, compound_pincount_ptr(page));
 		else
 			refs *= GUP_PIN_COUNTING_BIAS;
 	}
-- 
2.34.1




* [PATCH 12/75] mm: Make compound_pincount always available
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 11/75] mm/gup: Remove hpage_pincount_sub() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 13/75] mm: Add folio_pincount_ptr() Matthew Wilcox (Oracle)
                   ` (64 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, John Hubbard, Christoph Hellwig, Jason Gunthorpe,
	William Kucharski

Move compound_pincount from the third page to the second page, which
means it's available for all compound pages.  That lets us delete
hpage_pincount_available().

On 32-bit systems, there isn't enough space for both compound_pincount
and compound_nr in the second page (it would collide with page->private,
which is in use for pages in the swap cache), so revert the optimisation
of storing both compound_order and compound_nr on 32-bit systems.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 Documentation/core-api/pin_user_pages.rst | 18 +++++++++---------
 include/linux/mm.h                        | 21 ++++++++-------------
 include/linux/mm_types.h                  |  7 +++++--
 mm/debug.c                                | 14 ++++----------
 mm/gup.c                                  | 20 +++++++++-----------
 mm/page_alloc.c                           |  3 +--
 mm/rmap.c                                 |  6 ++----
 7 files changed, 38 insertions(+), 51 deletions(-)

diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index fcf605be43d0..b18416f4500f 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -55,18 +55,18 @@ flags the caller provides. The caller is required to pass in a non-null struct
 pages* array, and the function then pins pages by incrementing each by a special
 value: GUP_PIN_COUNTING_BIAS.
 
-For huge pages (and in fact, any compound page of more than 2 pages), the
-GUP_PIN_COUNTING_BIAS scheme is not used. Instead, an exact form of pin counting
-is achieved, by using the 3rd struct page in the compound page. A new struct
-page field, hpage_pinned_refcount, has been added in order to support this.
+For compound pages, the GUP_PIN_COUNTING_BIAS scheme is not used. Instead,
+an exact form of pin counting is achieved, by using the 2nd struct page
+in the compound page. A new struct page field, compound_pincount, has
+been added in order to support this.
 
 This approach for compound pages avoids the counting upper limit problems that
 are discussed below. Those limitations would have been aggravated severely by
 huge pages, because each tail page adds a refcount to the head page. And in
-fact, testing revealed that, without a separate hpage_pinned_refcount field,
+fact, testing revealed that, without a separate compound_pincount field,
 page overflows were seen in some huge page stress tests.
 
-This also means that huge pages and compound pages (of order > 1) do not suffer
+This also means that huge pages and compound pages do not suffer
 from the false positives problem that is mentioned below.::
 
  Function
@@ -264,9 +264,9 @@ place.)
 Other diagnostics
 =================
 
-dump_page() has been enhanced slightly, to handle these new counting fields, and
-to better report on compound pages in general. Specifically, for compound pages
-with order > 1, the exact (hpage_pinned_refcount) pincount is reported.
+dump_page() has been enhanced slightly, to handle these new counting
+fields, and to better report on compound pages in general. Specifically,
+for compound pages, the exact (compound_pincount) pincount is reported.
 
 References
 ==========
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e679a7d66200..dd7d6e95e43b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -891,17 +891,6 @@ static inline void destroy_compound_page(struct page *page)
 	compound_page_dtors[page[1].compound_dtor](page);
 }
 
-static inline bool hpage_pincount_available(struct page *page)
-{
-	/*
-	 * Can the page->hpage_pinned_refcount field be used? That field is in
-	 * the 3rd page of the compound page, so the smallest (2-page) compound
-	 * pages cannot support it.
-	 */
-	page = compound_head(page);
-	return PageCompound(page) && compound_order(page) > 1;
-}
-
 static inline int head_compound_pincount(struct page *head)
 {
 	return atomic_read(compound_pincount_ptr(head));
@@ -909,7 +898,7 @@ static inline int head_compound_pincount(struct page *head)
 
 static inline int compound_pincount(struct page *page)
 {
-	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
+	VM_BUG_ON_PAGE(!PageCompound(page), page);
 	page = compound_head(page);
 	return head_compound_pincount(page);
 }
@@ -917,7 +906,9 @@ static inline int compound_pincount(struct page *page)
 static inline void set_compound_order(struct page *page, unsigned int order)
 {
 	page[1].compound_order = order;
+#ifdef CONFIG_64BIT
 	page[1].compound_nr = 1U << order;
+#endif
 }
 
 /* Returns the number of pages in this potentially compound page. */
@@ -925,7 +916,11 @@ static inline unsigned long compound_nr(struct page *page)
 {
 	if (!PageHead(page))
 		return 1;
+#ifdef CONFIG_64BIT
 	return page[1].compound_nr;
+#else
+	return 1UL << compound_order(page);
+#endif
 }
 
 /* Returns the number of bytes in this potentially compound page. */
@@ -1307,7 +1302,7 @@ void unpin_user_pages(struct page **pages, unsigned long npages);
  */
 static inline bool page_maybe_dma_pinned(struct page *page)
 {
-	if (hpage_pincount_available(page))
+	if (PageCompound(page))
 		return compound_pincount(page) > 0;
 
 	/*
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5140e5feb486..e510ff214acf 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -126,11 +126,14 @@ struct page {
 			unsigned char compound_dtor;
 			unsigned char compound_order;
 			atomic_t compound_mapcount;
+			atomic_t compound_pincount;
+#ifdef CONFIG_64BIT
 			unsigned int compound_nr; /* 1 << compound_order */
+#endif
 		};
 		struct {	/* Second tail page of compound page */
 			unsigned long _compound_pad_1;	/* compound_head */
-			atomic_t hpage_pinned_refcount;
+			unsigned long _compound_pad_2;
 			/* For both global and memcg */
 			struct list_head deferred_list;
 		};
@@ -285,7 +288,7 @@ static inline atomic_t *compound_mapcount_ptr(struct page *page)
 
 static inline atomic_t *compound_pincount_ptr(struct page *page)
 {
-	return &page[2].hpage_pinned_refcount;
+	return &page[1].compound_pincount;
 }
 
 /*
diff --git a/mm/debug.c b/mm/debug.c
index bc9ac87f0e08..c4cf44266430 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -92,16 +92,10 @@ static void __dump_page(struct page *page)
 			page, page_ref_count(head), mapcount, mapping,
 			page_to_pgoff(page), page_to_pfn(page));
 	if (compound) {
-		if (hpage_pincount_available(page)) {
-			pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
-					head, compound_order(head),
-					head_compound_mapcount(head),
-					head_compound_pincount(head));
-		} else {
-			pr_warn("head:%p order:%u compound_mapcount:%d\n",
-					head, compound_order(head),
-					head_compound_mapcount(head));
-		}
+		pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
+				head, compound_order(head),
+				head_compound_mapcount(head),
+				head_compound_pincount(head));
 	}
 
 #ifdef CONFIG_MEMCG
diff --git a/mm/gup.c b/mm/gup.c
index af623a139995..a444b94c96fd 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -99,12 +99,11 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)
  *
  *    FOLL_GET: page's refcount will be incremented by @refs.
  *
- *    FOLL_PIN on compound pages that are > two pages long: page's refcount will
- *    be incremented by @refs, and page[2].hpage_pinned_refcount will be
- *    incremented by @refs * GUP_PIN_COUNTING_BIAS.
+ *    FOLL_PIN on compound pages: page's refcount will be incremented by
+ *    @refs, and page[1].compound_pincount will be incremented by @refs.
  *
- *    FOLL_PIN on normal pages, or compound pages that are two pages long:
- *    page's refcount will be incremented by @refs * GUP_PIN_COUNTING_BIAS.
+ *    FOLL_PIN on normal pages: page's refcount will be incremented by
+ *    @refs * GUP_PIN_COUNTING_BIAS.
  *
  * Return: head page (with refcount appropriately incremented) for success, or
  * NULL upon failure. If neither FOLL_GET nor FOLL_PIN was set, that's
@@ -135,16 +134,15 @@ __maybe_unused struct page *try_grab_compound_head(struct page *page,
 			return NULL;
 
 		/*
-		 * When pinning a compound page of order > 1 (which is
-		 * what hpage_pincount_available() checks for), use an
-		 * exact count to track it.
+		 * When pinning a compound page, use an exact count to
+		 * track it.
 		 *
 		 * However, be sure to *also* increment the normal page
 		 * refcount field at least once, so that the page really
 		 * is pinned.  That's why the refcount from the earlier
 		 * try_get_compound_head() is left intact.
 		 */
-		if (hpage_pincount_available(page))
+		if (PageHead(page))
 			atomic_add(refs, compound_pincount_ptr(page));
 		else
 			page_ref_add(page, refs * (GUP_PIN_COUNTING_BIAS - 1));
@@ -166,7 +164,7 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
 	if (flags & FOLL_PIN) {
 		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED,
 				    refs);
-		if (hpage_pincount_available(page))
+		if (PageHead(page))
 			atomic_sub(refs, compound_pincount_ptr(page));
 		else
 			refs *= GUP_PIN_COUNTING_BIAS;
@@ -211,7 +209,7 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
 		 * increment the normal page refcount field at least once,
 		 * so that the page really is pinned.
 		 */
-		if (hpage_pincount_available(page)) {
+		if (PageHead(page)) {
 			page_ref_add(page, 1);
 			atomic_add(1, compound_pincount_ptr(page));
 		} else {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3589febc6d31..02283598fd14 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -734,8 +734,7 @@ static void prep_compound_head(struct page *page, unsigned int order)
 	set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
 	set_compound_order(page, order);
 	atomic_set(compound_mapcount_ptr(page), -1);
-	if (hpage_pincount_available(page))
-		atomic_set(compound_pincount_ptr(page), 0);
+	atomic_set(compound_pincount_ptr(page), 0);
 }
 
 static void prep_compound_tail(struct page *head, int tail_idx)
diff --git a/mm/rmap.c b/mm/rmap.c
index 6a1e8c7f6213..a531b64d53fa 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1216,8 +1216,7 @@ void page_add_new_anon_rmap(struct page *page,
 		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
 		/* increment count (starts at -1) */
 		atomic_set(compound_mapcount_ptr(page), 0);
-		if (hpage_pincount_available(page))
-			atomic_set(compound_pincount_ptr(page), 0);
+		atomic_set(compound_pincount_ptr(page), 0);
 
 		__mod_lruvec_page_state(page, NR_ANON_THPS, nr);
 	} else {
@@ -2439,8 +2438,7 @@ void hugepage_add_new_anon_rmap(struct page *page,
 {
 	BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 	atomic_set(compound_mapcount_ptr(page), 0);
-	if (hpage_pincount_available(page))
-		atomic_set(compound_pincount_ptr(page), 0);
+	atomic_set(compound_pincount_ptr(page), 0);
 
 	__page_set_anon_rmap(page, vma, address, 1);
 }
-- 
2.34.1




* [PATCH 13/75] mm: Add folio_pincount_ptr()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 12/75] mm: Make compound_pincount always available Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 14/75] mm: Turn page_maybe_dma_pinned() into folio_maybe_dma_pinned() Matthew Wilcox (Oracle)
                   ` (63 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

This is the folio equivalent of compound_pincount_ptr().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 include/linux/mm.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dd7d6e95e43b..d5f0f2cfd552 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -891,6 +891,11 @@ static inline void destroy_compound_page(struct page *page)
 	compound_page_dtors[page[1].compound_dtor](page);
 }
 
+static inline atomic_t *folio_pincount_ptr(struct folio *folio)
+{
+	return &folio_page(folio, 1)->compound_pincount;
+}
+
 static inline int head_compound_pincount(struct page *head)
 {
 	return atomic_read(compound_pincount_ptr(head));
-- 
2.34.1




* [PATCH 14/75] mm: Turn page_maybe_dma_pinned() into folio_maybe_dma_pinned()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 13/75] mm: Add folio_pincount_ptr() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 15/75] mm/gup: Add try_get_folio() and try_grab_folio() Matthew Wilcox (Oracle)
                   ` (62 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Replace three calls to compound_head() with one.  This removes the last
user of compound_pincount(), so remove that helper too.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 include/linux/mm.h | 49 ++++++++++++++++++++++------------------------
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d5f0f2cfd552..a29dacec7294 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -901,13 +901,6 @@ static inline int head_compound_pincount(struct page *head)
 	return atomic_read(compound_pincount_ptr(head));
 }
 
-static inline int compound_pincount(struct page *page)
-{
-	VM_BUG_ON_PAGE(!PageCompound(page), page);
-	page = compound_head(page);
-	return head_compound_pincount(page);
-}
-
 static inline void set_compound_order(struct page *page, unsigned int order)
 {
 	page[1].compound_order = order;
@@ -1280,48 +1273,52 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
 void unpin_user_pages(struct page **pages, unsigned long npages);
 
 /**
- * page_maybe_dma_pinned - Report if a page is pinned for DMA.
- * @page: The page.
+ * folio_maybe_dma_pinned - Report if a folio may be pinned for DMA.
+ * @folio: The folio.
  *
- * This function checks if a page has been pinned via a call to
+ * This function checks if a folio has been pinned via a call to
  * a function in the pin_user_pages() family.
  *
- * For non-huge pages, the return value is partially fuzzy: false is not fuzzy,
+ * For small folios, the return value is partially fuzzy: false is not fuzzy,
  * because it means "definitely not pinned for DMA", but true means "probably
  * pinned for DMA, but possibly a false positive due to having at least
- * GUP_PIN_COUNTING_BIAS worth of normal page references".
+ * GUP_PIN_COUNTING_BIAS worth of normal folio references".
  *
- * False positives are OK, because: a) it's unlikely for a page to get that many
- * refcounts, and b) all the callers of this routine are expected to be able to
- * deal gracefully with a false positive.
+ * False positives are OK, because: a) it's unlikely for a folio to
+ * get that many refcounts, and b) all the callers of this routine are
+ * expected to be able to deal gracefully with a false positive.
  *
- * For huge pages, the result will be exactly correct. That's because we have
- * more tracking data available: the 3rd struct page in the compound page is
- * used to track the pincount (instead using of the GUP_PIN_COUNTING_BIAS
- * scheme).
+ * For large folios, the result will be exactly correct. That's because
+ * we have more tracking data available: the compound_pincount is used
+ * instead of the GUP_PIN_COUNTING_BIAS scheme.
  *
  * For more information, please see Documentation/core-api/pin_user_pages.rst.
  *
  * Return: True, if it is likely that the page has been "dma-pinned".
  * False, if the page is definitely not dma-pinned.
  */
-static inline bool page_maybe_dma_pinned(struct page *page)
+static inline bool folio_maybe_dma_pinned(struct folio *folio)
 {
-	if (PageCompound(page))
-		return compound_pincount(page) > 0;
+	if (folio_test_large(folio))
+		return atomic_read(folio_pincount_ptr(folio)) > 0;
 
 	/*
-	 * page_ref_count() is signed. If that refcount overflows, then
-	 * page_ref_count() returns a negative value, and callers will avoid
+	 * folio_ref_count() is signed. If that refcount overflows, then
+	 * folio_ref_count() returns a negative value, and callers will avoid
 	 * further incrementing the refcount.
 	 *
-	 * Here, for that overflow case, use the signed bit to count a little
+	 * Here, for that overflow case, use the sign bit to count a little
 	 * bit higher via unsigned math, and thus still get an accurate result.
 	 */
-	return ((unsigned int)page_ref_count(compound_head(page))) >=
+	return ((unsigned int)folio_ref_count(folio)) >=
 		GUP_PIN_COUNTING_BIAS;
 }
 
+static inline bool page_maybe_dma_pinned(struct page *page)
+{
+	return folio_maybe_dma_pinned(page_folio(page));
+}
+
 static inline bool is_cow_mapping(vm_flags_t flags)
 {
 	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
-- 
2.34.1




* [PATCH 15/75] mm/gup: Add try_get_folio() and try_grab_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 14/75] mm: Turn page_maybe_dma_pinned() into folio_maybe_dma_pinned() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio Matthew Wilcox (Oracle)
                   ` (61 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Convert try_get_compound_head() into try_get_folio() and convert
try_grab_compound_head() into try_grab_folio().  Add a temporary
try_grab_compound_head() wrapper around try_grab_folio() to let us
convert callers individually.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c      | 99 +++++++++++++++++++++++++--------------------------
 mm/internal.h |  5 +++
 2 files changed, 54 insertions(+), 50 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a444b94c96fd..4f1669db92f5 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -47,75 +47,70 @@ static void put_page_refs(struct page *page, int refs)
 }
 
 /*
- * Return the compound head page with ref appropriately incremented,
+ * Return the folio with ref appropriately incremented,
  * or NULL if that failed.
  */
-static inline struct page *try_get_compound_head(struct page *page, int refs)
+static inline struct folio *try_get_folio(struct page *page, int refs)
 {
-	struct page *head;
+	struct folio *folio;
 
 retry:
-	head = compound_head(page);
-
-	if (WARN_ON_ONCE(page_ref_count(head) < 0))
+	folio = page_folio(page);
+	if (WARN_ON_ONCE(folio_ref_count(folio) < 0))
 		return NULL;
-	if (unlikely(!page_cache_add_speculative(head, refs)))
+	if (unlikely(!folio_ref_try_add_rcu(folio, refs)))
 		return NULL;
 
 	/*
-	 * At this point we have a stable reference to the head page; but it
-	 * could be that between the compound_head() lookup and the refcount
-	 * increment, the compound page was split, in which case we'd end up
-	 * holding a reference on a page that has nothing to do with the page
+	 * At this point we have a stable reference to the folio; but it
+	 * could be that between calling page_folio() and the refcount
+	 * increment, the folio was split, in which case we'd end up
+	 * holding a reference on a folio that has nothing to do with the page
 	 * we were given anymore.
-	 * So now that the head page is stable, recheck that the pages still
-	 * belong together.
+	 * So now that the folio is stable, recheck that the page still
+	 * belongs to this folio.
 	 */
-	if (unlikely(compound_head(page) != head)) {
-		put_page_refs(head, refs);
+	if (unlikely(page_folio(page) != folio)) {
+		folio_put_refs(folio, refs);
 		goto retry;
 	}
 
-	return head;
+	return folio;
 }
 
 /**
- * try_grab_compound_head() - attempt to elevate a page's refcount, by a
- * flags-dependent amount.
- *
- * Even though the name includes "compound_head", this function is still
- * appropriate for callers that have a non-compound @page to get.
- *
+ * try_grab_folio() - Attempt to get or pin a folio.
  * @page:  pointer to page to be grabbed
- * @refs:  the value to (effectively) add to the page's refcount
+ * @refs:  the value to (effectively) add to the folio's refcount
  * @flags: gup flags: these are the FOLL_* flag values.
  *
  * "grab" names in this file mean, "look at flags to decide whether to use
- * FOLL_PIN or FOLL_GET behavior, when incrementing the page's refcount.
+ * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount.
  *
  * Either FOLL_PIN or FOLL_GET (or neither) must be set, but not both at the
  * same time. (That's true throughout the get_user_pages*() and
  * pin_user_pages*() APIs.) Cases:
  *
- *    FOLL_GET: page's refcount will be incremented by @refs.
+ *    FOLL_GET: folio's refcount will be incremented by @refs.
  *
- *    FOLL_PIN on compound pages: page's refcount will be incremented by
- *    @refs, and page[1].compound_pincount will be incremented by @refs.
+ *    FOLL_PIN on large folios: folio's refcount will be incremented by
+ *    @refs, and its compound_pincount will be incremented by @refs.
  *
- *    FOLL_PIN on normal pages: page's refcount will be incremented by
+ *    FOLL_PIN on single-page folios: folio's refcount will be incremented by
  *    @refs * GUP_PIN_COUNTING_BIAS.
  *
- * Return: head page (with refcount appropriately incremented) for success, or
- * NULL upon failure. If neither FOLL_GET nor FOLL_PIN was set, that's
- * considered failure, and furthermore, a likely bug in the caller, so a warning
- * is also emitted.
+ * Return: The folio containing @page (with refcount appropriately
+ * incremented) for success, or NULL upon failure. If neither FOLL_GET
+ * nor FOLL_PIN was set, that's considered failure, and furthermore,
+ * a likely bug in the caller, so a warning is also emitted.
  */
-__maybe_unused struct page *try_grab_compound_head(struct page *page,
-						   int refs, unsigned int flags)
+struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
 {
 	if (flags & FOLL_GET)
-		return try_get_compound_head(page, refs);
+		return try_get_folio(page, refs);
 	else if (flags & FOLL_PIN) {
+		struct folio *folio;
+
 		/*
 		 * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
 		 * right zone, so fail and let the caller fall back to the slow
@@ -129,34 +124,38 @@ __maybe_unused struct page *try_grab_compound_head(struct page *page,
 		 * CAUTION: Don't use compound_head() on the page before this
 		 * point, the result won't be stable.
 		 */
-		page = try_get_compound_head(page, refs);
-		if (!page)
+		folio = try_get_folio(page, refs);
+		if (!folio)
 			return NULL;
 
 		/*
-		 * When pinning a compound page, use an exact count to
-		 * track it.
+		 * When pinning a large folio, use an exact count to track it.
 		 *
-		 * However, be sure to *also* increment the normal page
-		 * refcount field at least once, so that the page really
+		 * However, be sure to *also* increment the normal folio
+		 * refcount field at least once, so that the folio really
 		 * is pinned.  That's why the refcount from the earlier
-		 * try_get_compound_head() is left intact.
+		 * try_get_folio() is left intact.
 		 */
-		if (PageHead(page))
-			atomic_add(refs, compound_pincount_ptr(page));
+		if (folio_test_large(folio))
+			atomic_add(refs, folio_pincount_ptr(folio));
 		else
-			page_ref_add(page, refs * (GUP_PIN_COUNTING_BIAS - 1));
+			folio_ref_add(folio,
+					refs * (GUP_PIN_COUNTING_BIAS - 1));
+		node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
 
-		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED,
-				    refs);
-
-		return page;
+		return folio;
 	}
 
 	WARN_ON_ONCE(1);
 	return NULL;
 }
 
+struct page *try_grab_compound_head(struct page *page,
+		int refs, unsigned int flags)
+{
+	return &try_grab_folio(page, refs, flags)->page;
+}
+
 static void put_compound_head(struct page *page, int refs, unsigned int flags)
 {
 	VM_BUG_ON_PAGE(PageTail(page), page);
@@ -185,7 +184,7 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
  * @flags:   gup flags: these are the FOLL_* flag values.
  *
  * Either FOLL_PIN or FOLL_GET (or neither) may be set, but not both at the same
- * time. Cases: please see the try_grab_compound_head() documentation, with
+ * time. Cases: please see the try_grab_folio() documentation, with
  * "refs=1".
  *
  * Return: true for success, or if no action was required (if neither FOLL_PIN
diff --git a/mm/internal.h b/mm/internal.h
index d80300392a19..08a44802c80e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -718,4 +718,9 @@ void vunmap_range_noflush(unsigned long start, unsigned long end);
 int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
 		      unsigned long addr, int page_nid, int *flags);
 
+/*
+ * mm/gup.c
+ */
+struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
+
 #endif	/* __MM_INTERNAL_H */
-- 
2.34.1




* [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 15/75] mm/gup: Add try_get_folio() and try_grab_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-06  2:12   ` John Hubbard
  2022-02-07  7:47   ` Christoph Hellwig
  2022-02-04 19:57 ` [PATCH 17/75] mm: Remove page_cache_add_speculative() and page_cache_get_speculative() Matthew Wilcox (Oracle)
                   ` (60 subsequent siblings)
  76 siblings, 2 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Hoist the folio conversion and the folio_ref_count() check to the
top of the function instead of using the one buried in try_get_page().
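
For reference (simplified, not part of this patch): try_get_page()
performs roughly the check below, buried after its own compound_head()
lookup.  Hoisting an equivalent folio_ref_count() test to the top of
try_grab_page() lets the FOLL_GET and FOLL_PIN paths share it.  The
function name is an illustrative stand-in, not the real definition.

/* Roughly what try_get_page() does (simplified sketch): */
static inline bool example_try_get_page(struct page *page)
{
	page = compound_head(page);
	if (WARN_ON_ONCE(page_ref_count(page) <= 0))
		return false;		/* the check being hoisted */
	page_ref_add(page, 1);
	return true;
}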

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/gup.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 4f1669db92f5..d18ce4da573f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -174,15 +174,14 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
 
 /**
  * try_grab_page() - elevate a page's refcount by a flag-dependent amount
+ * @page:    pointer to page to be grabbed
+ * @flags:   gup flags: these are the FOLL_* flag values.
  *
  * This might not do anything at all, depending on the flags argument.
  *
  * "grab" names in this file mean, "look at flags to decide whether to use
  * FOLL_PIN or FOLL_GET behavior, when incrementing the page's refcount.
  *
- * @page:    pointer to page to be grabbed
- * @flags:   gup flags: these are the FOLL_* flag values.
- *
  * Either FOLL_PIN or FOLL_GET (or neither) may be set, but not both at the same
  * time. Cases: please see the try_grab_folio() documentation, with
  * "refs=1".
@@ -193,29 +192,28 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
  */
 bool __must_check try_grab_page(struct page *page, unsigned int flags)
 {
+	struct folio *folio = page_folio(page);
+
 	WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == (FOLL_GET | FOLL_PIN));
+	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
+		return false;
 
 	if (flags & FOLL_GET)
-		return try_get_page(page);
+		folio_ref_inc(folio);
 	else if (flags & FOLL_PIN) {
-		page = compound_head(page);
-
-		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
-			return false;
-
 		/*
-		 * Similar to try_grab_compound_head(): be sure to *also*
+		 * Similar to try_grab_folio(): be sure to *also*
 		 * increment the normal page refcount field at least once,
 		 * so that the page really is pinned.
 		 */
-		if (PageHead(page)) {
-			page_ref_add(page, 1);
-			atomic_add(1, compound_pincount_ptr(page));
+		if (folio_test_large(folio)) {
+			folio_ref_add(folio, 1);
+			atomic_add(1, folio_pincount_ptr(folio));
 		} else {
-			page_ref_add(page, GUP_PIN_COUNTING_BIAS);
+			folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
 		}
 
-		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
+		node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1);
 	}
 
 	return true;
-- 
2.34.1




* [PATCH 17/75] mm: Remove page_cache_add_speculative() and page_cache_get_speculative()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 18/75] mm/gup: Add gup_put_folio() Matthew Wilcox (Oracle)
                   ` (59 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

These wrappers have no more callers, so delete them.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 include/linux/mm.h      |  7 +++----
 include/linux/pagemap.h | 10 ----------
 2 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a29dacec7294..703bc2ec40a9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1258,10 +1258,9 @@ static inline void put_page(struct page *page)
  * applications that don't have huge page reference counts, this won't be an
  * issue.
  *
- * Locking: the lockless algorithm described in page_cache_get_speculative()
- * and page_cache_gup_pin_speculative() provides safe operation for
- * get_user_pages and page_mkclean and other calls that race to set up page
- * table entries.
+ * Locking: the lockless algorithm described in folio_try_get_rcu()
+ * provides safe operation for get_user_pages(), page_mkclean() and
+ * other calls that race to set up page table entries.
  */
 #define GUP_PIN_COUNTING_BIAS (1U << 10)
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 270bf5136c34..cdb3f118603a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -283,16 +283,6 @@ static inline struct inode *folio_inode(struct folio *folio)
 	return folio->mapping->host;
 }
 
-static inline bool page_cache_add_speculative(struct page *page, int count)
-{
-	return folio_ref_try_add_rcu((struct folio *)page, count);
-}
-
-static inline bool page_cache_get_speculative(struct page *page)
-{
-	return page_cache_add_speculative(page, 1);
-}
-
 /**
  * folio_attach_private - Attach private data to a folio.
  * @folio: Folio to attach data to.
-- 
2.34.1




* [PATCH 18/75] mm/gup: Add gup_put_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 17/75] mm: Remove page_cache_add_speculative() and page_cache_get_speculative() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 19/75] mm/hugetlb: Use try_grab_folio() instead of try_grab_compound_head() Matthew Wilcox (Oracle)
                   ` (58 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Convert put_compound_head() to gup_put_folio(), moving the pincount
accounting from compound_pincount_ptr() to folio_pincount_ptr().  This
removes the last call to put_page_refs(), so delete it.  Add a temporary
put_compound_head() wrapper which will be deleted by the end of this
series.
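
A worked example of the accounting this mirrors (illustration only; the
numbers follow from GUP_PIN_COUNTING_BIAS being 1U << 10):

/*
 * Pinning one page of a single-page folio with FOLL_PIN added
 * GUP_PIN_COUNTING_BIAS (1024) to the folio refcount, so
 *
 *	gup_put_folio(folio, 1, FOLL_PIN);
 *
 * scales refs by the same bias and drops those 1024 references.
 * For a large folio the refcount only moved by 1 per pin and the
 * exact count lives in the pincount field, so gup_put_folio()
 * subtracts refs from there instead, mirroring try_grab_folio().
 */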

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 38 ++++++++++++--------------------------
 1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index d18ce4da573f..04a370b1580e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,23 +29,6 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
-/* Equivalent to calling put_page() @refs times. */
-static void put_page_refs(struct page *page, int refs)
-{
-#ifdef CONFIG_DEBUG_VM
-	if (VM_WARN_ON_ONCE_PAGE(page_ref_count(page) < refs, page))
-		return;
-#endif
-
-	/*
-	 * Calling put_page() for each ref is unnecessarily slow. Only the last
-	 * ref needs a put_page().
-	 */
-	if (refs > 1)
-		page_ref_sub(page, refs - 1);
-	put_page(page);
-}
-
 /*
  * Return the folio with ref appropriately incremented,
  * or NULL if that failed.
@@ -156,20 +139,23 @@ struct page *try_grab_compound_head(struct page *page,
 	return &try_grab_folio(page, refs, flags)->page;
 }
 
-static void put_compound_head(struct page *page, int refs, unsigned int flags)
+static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
-
 	if (flags & FOLL_PIN) {
-		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED,
-				    refs);
-		if (PageHead(page))
-			atomic_sub(refs, compound_pincount_ptr(page));
+		node_stat_mod_folio(folio, NR_FOLL_PIN_RELEASED, refs);
+		if (folio_test_large(folio))
+			atomic_sub(refs, folio_pincount_ptr(folio));
 		else
 			refs *= GUP_PIN_COUNTING_BIAS;
 	}
 
-	put_page_refs(page, refs);
+	folio_put_refs(folio, refs);
+}
+
+static void put_compound_head(struct page *page, int refs, unsigned int flags)
+{
+	VM_BUG_ON_PAGE(PageTail(page), page);
+	gup_put_folio((struct folio *)page, refs, flags);
 }
 
 /**
@@ -230,7 +216,7 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
  */
 void unpin_user_page(struct page *page)
 {
-	put_compound_head(compound_head(page), 1, FOLL_PIN);
+	gup_put_folio(page_folio(page), 1, FOLL_PIN);
 }
 EXPORT_SYMBOL(unpin_user_page);
 
-- 
2.34.1




* [PATCH 19/75] mm/hugetlb: Use try_grab_folio() instead of try_grab_compound_head()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 18/75] mm/gup: Add gup_put_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio Matthew Wilcox (Oracle)
                   ` (57 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

follow_hugetlb_page() only cares about success or failure, so it doesn't
need to know the type of the returned pointer, only whether it's NULL
or not.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 include/linux/mm.h | 3 ---
 mm/gup.c           | 2 +-
 mm/hugetlb.c       | 7 +++----
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 703bc2ec40a9..da565dc1029d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1162,9 +1162,6 @@ static inline void get_page(struct page *page)
 }
 
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
-struct page *try_grab_compound_head(struct page *page, int refs,
-				    unsigned int flags);
-
 
 static inline __must_check bool try_get_page(struct page *page)
 {
diff --git a/mm/gup.c b/mm/gup.c
index 04a370b1580e..00227b2cb1cf 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -133,7 +133,7 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
 	return NULL;
 }
 
-struct page *try_grab_compound_head(struct page *page,
+static inline struct page *try_grab_compound_head(struct page *page,
 		int refs, unsigned int flags)
 {
 	return &try_grab_folio(page, refs, flags)->page;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 61895cc01d09..5ce3a0d891c6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6072,7 +6072,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 
 		if (pages) {
 			/*
-			 * try_grab_compound_head() should always succeed here,
+			 * try_grab_folio() should always succeed here,
 			 * because: a) we hold the ptl lock, and b) we've just
 			 * checked that the huge page is present in the page
 			 * tables. If the huge page is present, then the tail
@@ -6081,9 +6081,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			 * any way. So this page must be available at this
 			 * point, unless the page refcount overflowed:
 			 */
-			if (WARN_ON_ONCE(!try_grab_compound_head(pages[i],
-								 refs,
-								 flags))) {
+			if (WARN_ON_ONCE(!try_grab_folio(pages[i], refs,
+							 flags))) {
 				spin_unlock(ptl);
 				remainder = 0;
 				err = -ENOMEM;
-- 
2.34.1




* [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 19/75] mm/hugetlb: Use try_grab_folio() instead of try_grab_compound_head() Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-06 14:52   ` Mark Hemment
  2022-02-04 19:57 ` [PATCH 21/75] mm/gup: Convert gup_hugepte() " Matthew Wilcox (Oracle)
                   ` (56 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

We still call try_grab_folio() once per PTE; a future patch could
optimise to just adjust the reference count for each page within
the folio.
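
A worked example of the cost deliberately left in place (illustration
only; the sizes assume 4KiB base pages and a PMD-sized THP):

/*
 * A 2MiB THP mapped by 512 PTEs and covered by one gup-fast pass
 * still results in 512 calls to try_grab_folio(page, 1, flags),
 * i.e. 512 atomic operations on the same folio, where a batched
 * version could notice that consecutive pages share a folio and
 * make a single reference adjustment of 512.
 */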

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 00227b2cb1cf..44281350db1a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2252,7 +2252,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 	ptem = ptep = pte_offset_map(&pmd, addr);
 	do {
 		pte_t pte = ptep_get_lockless(ptep);
-		struct page *head, *page;
+		struct page *page;
+		struct folio *folio;
 
 		/*
 		 * Similar to the PMD case below, NUMA hinting must take slow
@@ -2279,22 +2280,20 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 		page = pte_page(pte);
 
-		head = try_grab_compound_head(page, 1, flags);
-		if (!head)
+		folio = try_grab_folio(page, 1, flags);
+		if (!folio)
 			goto pte_unmap;
 
 		if (unlikely(page_is_secretmem(page))) {
-			put_compound_head(head, 1, flags);
+			gup_put_folio(folio, 1, flags);
 			goto pte_unmap;
 		}
 
 		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
-			put_compound_head(head, 1, flags);
+			gup_put_folio(folio, 1, flags);
 			goto pte_unmap;
 		}
 
-		VM_BUG_ON_PAGE(compound_head(page) != head, page);
-
 		/*
 		 * We need to make the page accessible if and only if we are
 		 * going to access its content (the FOLL_PIN case).  Please
@@ -2308,10 +2307,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 				goto pte_unmap;
 			}
 		}
-		SetPageReferenced(page);
+		folio_set_referenced(folio);
 		pages[*nr] = page;
 		(*nr)++;
-
 	} while (ptep++, addr += PAGE_SIZE, addr != end);
 
 	ret = 1;
-- 
2.34.1




* [PATCH 21/75] mm/gup: Convert gup_hugepte() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:57 ` [PATCH 22/75] mm/gup: Convert gup_huge_pmd() " Matthew Wilcox (Oracle)
                   ` (55 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

There should be little to no effect from this patch; just removing
uses of some old APIs.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 44281350db1a..6faf8beb4cd9 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2445,7 +2445,8 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 		       struct page **pages, int *nr)
 {
 	unsigned long pte_end;
-	struct page *head, *page;
+	struct page *page;
+	struct folio *folio;
 	pte_t pte;
 	int refs;
 
@@ -2461,21 +2462,20 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 	/* hugepages are never "special" */
 	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 
-	head = pte_page(pte);
-	page = nth_page(head, (addr & (sz - 1)) >> PAGE_SHIFT);
+	page = nth_page(pte_page(pte), (addr & (sz - 1)) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_grab_compound_head(head, refs, flags);
-	if (!head)
+	folio = try_grab_folio(page, refs, flags);
+	if (!folio)
 		return 0;
 
 	if (unlikely(pte_val(pte) != pte_val(*ptep))) {
-		put_compound_head(head, refs, flags);
+		gup_put_folio(folio, refs, flags);
 		return 0;
 	}
 
 	*nr += refs;
-	SetPageReferenced(head);
+	folio_set_referenced(folio);
 	return 1;
 }
 
-- 
2.34.1




* [PATCH 22/75] mm/gup: Convert gup_huge_pmd() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 21/75] mm/gup: Convert gup_hugepte() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:57 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 23/75] mm/gup: Convert gup_huge_pud() " Matthew Wilcox (Oracle)
                   ` (54 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:57 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Use the new folio-based APIs.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 6faf8beb4cd9..ca8262392ce3 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2509,7 +2509,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 			unsigned long end, unsigned int flags,
 			struct page **pages, int *nr)
 {
-	struct page *head, *page;
+	struct page *page;
+	struct folio *folio;
 	int refs;
 
 	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
@@ -2525,17 +2526,17 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	page = nth_page(pmd_page(orig), (addr & ~PMD_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_grab_compound_head(pmd_page(orig), refs, flags);
-	if (!head)
+	folio = try_grab_folio(page, refs, flags);
+	if (!folio)
 		return 0;
 
 	if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
-		put_compound_head(head, refs, flags);
+		gup_put_folio(folio, refs, flags);
 		return 0;
 	}
 
 	*nr += refs;
-	SetPageReferenced(head);
+	folio_set_referenced(folio);
 	return 1;
 }
 
-- 
2.34.1




* [PATCH 23/75] mm/gup: Convert gup_huge_pud() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2022-02-04 19:57 ` [PATCH 22/75] mm/gup: Convert gup_huge_pmd() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 24/75] mm/gup: Convert gup_huge_pgd() " Matthew Wilcox (Oracle)
                   ` (53 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Use the new folio-based APIs.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index ca8262392ce3..6d7a2ba6790b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2544,7 +2544,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 			unsigned long end, unsigned int flags,
 			struct page **pages, int *nr)
 {
-	struct page *head, *page;
+	struct page *page;
+	struct folio *folio;
 	int refs;
 
 	if (!pud_access_permitted(orig, flags & FOLL_WRITE))
@@ -2560,17 +2561,17 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 	page = nth_page(pud_page(orig), (addr & ~PUD_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_grab_compound_head(pud_page(orig), refs, flags);
-	if (!head)
+	folio = try_grab_folio(page, refs, flags);
+	if (!folio)
 		return 0;
 
 	if (unlikely(pud_val(orig) != pud_val(*pudp))) {
-		put_compound_head(head, refs, flags);
+		gup_put_folio(folio, refs, flags);
 		return 0;
 	}
 
 	*nr += refs;
-	SetPageReferenced(head);
+	folio_set_referenced(folio);
 	return 1;
 }
 
-- 
2.34.1




* [PATCH 24/75] mm/gup: Convert gup_huge_pgd() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 23/75] mm/gup: Convert gup_huge_pud() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 25/75] mm/gup: Turn compound_next() into gup_folio_next() Matthew Wilcox (Oracle)
                   ` (52 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Use the new folio-based APIs.  This was the last user of
try_grab_compound_head(), so remove it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 6d7a2ba6790b..bf196219c189 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -133,12 +133,6 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
 	return NULL;
 }
 
-static inline struct page *try_grab_compound_head(struct page *page,
-		int refs, unsigned int flags)
-{
-	return &try_grab_folio(page, refs, flags)->page;
-}
-
 static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
 {
 	if (flags & FOLL_PIN) {
@@ -2580,7 +2574,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 			struct page **pages, int *nr)
 {
 	int refs;
-	struct page *head, *page;
+	struct page *page;
+	struct folio *folio;
 
 	if (!pgd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
@@ -2590,17 +2585,17 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 	page = nth_page(pgd_page(orig), (addr & ~PGDIR_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_grab_compound_head(pgd_page(orig), refs, flags);
-	if (!head)
+	folio = try_grab_folio(page, refs, flags);
+	if (!folio)
 		return 0;
 
 	if (unlikely(pgd_val(orig) != pgd_val(*pgdp))) {
-		put_compound_head(head, refs, flags);
+		gup_put_folio(folio, refs, flags);
 		return 0;
 	}
 
 	*nr += refs;
-	SetPageReferenced(head);
+	folio_set_referenced(folio);
 	return 1;
 }
 
-- 
2.34.1




* [PATCH 25/75] mm/gup: Turn compound_next() into gup_folio_next()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 24/75] mm/gup: Convert gup_huge_pgd() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 26/75] mm/gup: Turn compound_range_next() into gup_folio_range_next() Matthew Wilcox (Oracle)
                   ` (51 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Convert both callers to work on folios instead of pages.
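
For illustration only (not part of this patch): how gup_folio_next()
groups a pages[] array.  The array contents are hypothetical.

/*
 * Suppose pages[] = { A0, A1, A2, B0 }, where A0-A2 are three pages
 * of folio A and B0 belongs to folio B.  The loop in
 * unpin_user_pages() then runs twice:
 *
 *	i = 0: gup_folio_next() returns folio A with nr = 3,
 *	       so gup_put_folio(A, 3, FOLL_PIN) drops three pins;
 *	i = 3: gup_folio_next() returns folio B with nr = 1.
 *
 * One put per folio instead of one per page.
 */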

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 40 +++++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index bf196219c189..d90f8e5790c0 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -230,20 +230,19 @@ static inline struct page *compound_range_next(struct page *start,
 	return page;
 }
 
-static inline struct page *compound_next(struct page **list,
+static inline struct folio *gup_folio_next(struct page **list,
 		unsigned long npages, unsigned long i, unsigned int *ntails)
 {
-	struct page *page;
+	struct folio *folio = page_folio(list[i]);
 	unsigned int nr;
 
-	page = compound_head(list[i]);
 	for (nr = i + 1; nr < npages; nr++) {
-		if (compound_head(list[nr]) != page)
+		if (page_folio(list[nr]) != folio)
 			break;
 	}
 
 	*ntails = nr - i;
-	return page;
+	return folio;
 }
 
 /**
@@ -271,17 +270,17 @@ static inline struct page *compound_next(struct page **list,
 void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 				 bool make_dirty)
 {
-	unsigned long index;
-	struct page *head;
-	unsigned int ntails;
+	unsigned long i;
+	struct folio *folio;
+	unsigned int nr;
 
 	if (!make_dirty) {
 		unpin_user_pages(pages, npages);
 		return;
 	}
 
-	for (index = 0; index < npages; index += ntails) {
-		head = compound_next(pages, npages, index, &ntails);
+	for (i = 0; i < npages; i += nr) {
+		folio = gup_folio_next(pages, npages, i, &nr);
 		/*
 		 * Checking PageDirty at this point may race with
 		 * clear_page_dirty_for_io(), but that's OK. Two key
@@ -302,9 +301,12 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
 		 * written back, so it gets written back again in the
 		 * next writeback cycle. This is harmless.
 		 */
-		if (!PageDirty(head))
-			set_page_dirty_lock(head);
-		put_compound_head(head, ntails, FOLL_PIN);
+		if (!folio_test_dirty(folio)) {
+			folio_lock(folio);
+			folio_mark_dirty(folio);
+			folio_unlock(folio);
+		}
+		gup_put_folio(folio, nr, FOLL_PIN);
 	}
 }
 EXPORT_SYMBOL(unpin_user_pages_dirty_lock);
@@ -357,9 +359,9 @@ EXPORT_SYMBOL(unpin_user_page_range_dirty_lock);
  */
 void unpin_user_pages(struct page **pages, unsigned long npages)
 {
-	unsigned long index;
-	struct page *head;
-	unsigned int ntails;
+	unsigned long i;
+	struct folio *folio;
+	unsigned int nr;
 
 	/*
 	 * If this WARN_ON() fires, then the system *might* be leaking pages (by
@@ -369,9 +371,9 @@ void unpin_user_pages(struct page **pages, unsigned long npages)
 	if (WARN_ON(IS_ERR_VALUE(npages)))
 		return;
 
-	for (index = 0; index < npages; index += ntails) {
-		head = compound_next(pages, npages, index, &ntails);
-		put_compound_head(head, ntails, FOLL_PIN);
+	for (i = 0; i < npages; i += nr) {
+		folio = gup_folio_next(pages, npages, i, &nr);
+		gup_put_folio(folio, nr, FOLL_PIN);
 	}
 }
 EXPORT_SYMBOL(unpin_user_pages);
-- 
2.34.1




* [PATCH 26/75] mm/gup: Turn compound_range_next() into gup_folio_range_next()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 25/75] mm/gup: Turn compound_next() into gup_folio_next() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 27/75] mm: Turn isolate_lru_page() into folio_isolate_lru() Matthew Wilcox (Oracle)
                   ` (50 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Convert the only caller to work on folios instead of pages.
This removes the last caller of put_compound_head(), so delete it.
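
A worked example of the folio_page_idx() arithmetic used by
gup_folio_range_next() (illustration only; the folio sizes are
hypothetical):

/*
 * Suppose unpin_user_page_range_dirty_lock(page, 24, ...) is called
 * with @page at index 12 of a 16-page folio F, followed physically
 * by a 16-page folio G:
 *
 *	i = 0:	folio_page_idx(F, page) == 12, so
 *		nr = min(24 - 0, 16 - 12) = 4  -> gup_put_folio(F, 4, ...)
 *	i = 4:	the next page starts folio G, so
 *		nr = min(24 - 4, 16 - 0) = 16  -> gup_put_folio(G, 16, ...)
 *	i = 20:	the remaining 4 pages are handled the same way.
 */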

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 include/linux/mm.h |  4 ++--
 mm/gup.c           | 38 +++++++++++++++++---------------------
 2 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index da565dc1029d..3ca6dea4fe4a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -216,10 +216,10 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
-#define page_nth(head, tail)	(page_to_pfn(tail) - page_to_pfn(head))
+#define folio_page_idx(folio, p)	(page_to_pfn(p) - folio_pfn(folio))
 #else
 #define nth_page(page,n) ((page) + (n))
-#define page_nth(head, tail)	((tail) - (head))
+#define folio_page_idx(folio, p)	((p) - &(folio)->page)
 #endif
 
 /* to align the pointer to the (next) page boundary */
diff --git a/mm/gup.c b/mm/gup.c
index d90f8e5790c0..edec7356b965 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -146,12 +146,6 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
 	folio_put_refs(folio, refs);
 }
 
-static void put_compound_head(struct page *page, int refs, unsigned int flags)
-{
-	VM_BUG_ON_PAGE(PageTail(page), page);
-	gup_put_folio((struct folio *)page, refs, flags);
-}
-
 /**
  * try_grab_page() - elevate a page's refcount by a flag-dependent amount
  * @page:    pointer to page to be grabbed
@@ -214,20 +208,19 @@ void unpin_user_page(struct page *page)
 }
 EXPORT_SYMBOL(unpin_user_page);
 
-static inline struct page *compound_range_next(struct page *start,
+static inline struct folio *gup_folio_range_next(struct page *start,
 		unsigned long npages, unsigned long i, unsigned int *ntails)
 {
-	struct page *next, *page;
+	struct page *next = nth_page(start, i);
+	struct folio *folio = page_folio(next);
 	unsigned int nr = 1;
 
-	next = nth_page(start, i);
-	page = compound_head(next);
-	if (PageHead(page))
+	if (folio_test_large(folio))
 		nr = min_t(unsigned int, npages - i,
-			   compound_nr(page) - page_nth(page, next));
+			   folio_nr_pages(folio) - folio_page_idx(folio, next));
 
 	*ntails = nr;
-	return page;
+	return folio;
 }
 
 static inline struct folio *gup_folio_next(struct page **list,
@@ -335,15 +328,18 @@ EXPORT_SYMBOL(unpin_user_pages_dirty_lock);
 void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
 				      bool make_dirty)
 {
-	unsigned long index;
-	struct page *head;
-	unsigned int ntails;
+	unsigned long i;
+	struct folio *folio;
+	unsigned int nr;
 
-	for (index = 0; index < npages; index += ntails) {
-		head = compound_range_next(page, npages, index, &ntails);
-		if (make_dirty && !PageDirty(head))
-			set_page_dirty_lock(head);
-		put_compound_head(head, ntails, FOLL_PIN);
+	for (i = 0; i < npages; i += nr) {
+		folio = gup_folio_range_next(page, npages, i, &nr);
+		if (make_dirty && !folio_test_dirty(folio)) {
+			folio_lock(folio);
+			folio_mark_dirty(folio);
+			folio_unlock(folio);
+		}
+		gup_put_folio(folio, nr, FOLL_PIN);
 	}
 }
 EXPORT_SYMBOL(unpin_user_page_range_dirty_lock);
-- 
2.34.1




* [PATCH 27/75] mm: Turn isolate_lru_page() into folio_isolate_lru()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 26/75] mm/gup: Turn compound_range_next() into gup_folio_range_next() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 28/75] mm/gup: Convert check_and_migrate_movable_pages() to use a folio Matthew Wilcox (Oracle)
                   ` (49 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Add isolate_lru_page() as a wrapper around folio_isolate_lru().
TestClearPageLRU() would have always failed on a tail page, so
returning -EBUSY is the same behaviour.
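
For illustration only (not part of this patch): the calling convention
of folio_isolate_lru(), an mm-internal interface declared in
mm/internal.h.  The function and list below are hypothetical; the next
patch converts a real caller.

static int example_isolate(struct folio *folio, struct list_head *list)
{
	/* The caller must already hold a reference on the folio. */
	if (folio_isolate_lru(folio))
		return -EBUSY;		/* folio was not on an LRU list */

	/*
	 * On success the folio is off its LRU list and carries an
	 * extra reference taken by folio_isolate_lru().
	 */
	list_add_tail(&folio->lru, list);
	return 0;
}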

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 arch/powerpc/include/asm/mmu_context.h |  1 -
 mm/folio-compat.c                      |  8 +++++
 mm/internal.h                          |  3 +-
 mm/vmscan.c                            | 43 ++++++++++++--------------
 4 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index fd277b15635c..b8527a74bd4d 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -21,7 +21,6 @@ extern void destroy_context(struct mm_struct *mm);
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 struct mm_iommu_table_group_mem_t;
 
-extern int isolate_lru_page(struct page *page);	/* from internal.h */
 extern bool mm_iommu_preregistered(struct mm_struct *mm);
 extern long mm_iommu_new(struct mm_struct *mm,
 		unsigned long ua, unsigned long entries,
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 749555a232a8..a4a7725f4486 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -7,6 +7,7 @@
 #include <linux/migrate.h>
 #include <linux/pagemap.h>
 #include <linux/swap.h>
+#include "internal.h"
 
 struct address_space *page_mapping(struct page *page)
 {
@@ -151,3 +152,10 @@ int try_to_release_page(struct page *page, gfp_t gfp)
 	return filemap_release_folio(page_folio(page), gfp);
 }
 EXPORT_SYMBOL(try_to_release_page);
+
+int isolate_lru_page(struct page *page)
+{
+	if (WARN_RATELIMIT(PageTail(page), "trying to isolate tail page"))
+		return -EBUSY;
+	return folio_isolate_lru((struct folio *)page);
+}
diff --git a/mm/internal.h b/mm/internal.h
index 08a44802c80e..8b0249909b06 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -157,7 +157,8 @@ extern unsigned long highest_memmap_pfn;
 /*
  * in mm/vmscan.c:
  */
-extern int isolate_lru_page(struct page *page);
+int isolate_lru_page(struct page *page);
+int folio_isolate_lru(struct folio *folio);
 extern void putback_lru_page(struct page *page);
 extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 090bfb605ecf..e0cc5f0cb999 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2209,45 +2209,40 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
 }
 
 /**
- * isolate_lru_page - tries to isolate a page from its LRU list
- * @page: page to isolate from its LRU list
+ * folio_isolate_lru() - Try to isolate a folio from its LRU list.
+ * @folio: Folio to isolate from its LRU list.
  *
- * Isolates a @page from an LRU list, clears PageLRU and adjusts the
- * vmstat statistic corresponding to whatever LRU list the page was on.
+ * Isolate a @folio from an LRU list and adjust the vmstat statistic
+ * corresponding to whatever LRU list the folio was on.
  *
- * Returns 0 if the page was removed from an LRU list.
- * Returns -EBUSY if the page was not on an LRU list.
- *
- * The returned page will have PageLRU() cleared.  If it was found on
- * the active list, it will have PageActive set.  If it was found on
- * the unevictable list, it will have the PageUnevictable bit set. That flag
+ * The folio will have its LRU flag cleared.  If it was found on the
+ * active list, it will have the Active flag set.  If it was found on the
+ * unevictable list, it will have the Unevictable flag set.  These flags
  * may need to be cleared by the caller before letting the page go.
  *
- * The vmstat statistic corresponding to the list on which the page was
- * found will be decremented.
- *
- * Restrictions:
+ * Context:
  *
  * (1) Must be called with an elevated refcount on the page. This is a
- *     fundamental difference from isolate_lru_pages (which is called
+ *     fundamental difference from isolate_lru_pages() (which is called
  *     without a stable reference).
- * (2) the lru_lock must not be held.
- * (3) interrupts must be enabled.
+ * (2) The lru_lock must not be held.
+ * (3) Interrupts must be enabled.
+ *
+ * Return: 0 if the folio was removed from an LRU list.
+ * -EBUSY if the folio was not on an LRU list.
  */
-int isolate_lru_page(struct page *page)
+int folio_isolate_lru(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	int ret = -EBUSY;
 
-	VM_BUG_ON_PAGE(!page_count(page), page);
-	WARN_RATELIMIT(PageTail(page), "trying to isolate tail page");
+	VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio);
 
-	if (TestClearPageLRU(page)) {
+	if (folio_test_clear_lru(folio)) {
 		struct lruvec *lruvec;
 
-		get_page(page);
+		folio_get(folio);
 		lruvec = folio_lruvec_lock_irq(folio);
-		del_page_from_lru_list(page, lruvec);
+		lruvec_del_folio(lruvec, folio);
 		unlock_page_lruvec_irq(lruvec);
 		ret = 0;
 	}
-- 
2.34.1




* [PATCH 28/75] mm/gup: Convert check_and_migrate_movable_pages() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (26 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 27/75] mm: Turn isolate_lru_page() into folio_isolate_lru() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 29/75] mm/workingset: Convert workingset_eviction() to take " Matthew Wilcox (Oracle)
                   ` (48 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski

Switch from head pages to folios.  This removes an assumption that
THPs are the only way to have a high-order page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/gup.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index edec7356b965..9f2f8d765c58 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1815,41 +1815,41 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 	bool drain_allow = true;
 	LIST_HEAD(movable_page_list);
 	long ret = 0;
-	struct page *prev_head = NULL;
-	struct page *head;
+	struct folio *folio, *prev_folio = NULL;
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_NOWARN,
 	};
 
 	for (i = 0; i < nr_pages; i++) {
-		head = compound_head(pages[i]);
-		if (head == prev_head)
+		folio = page_folio(pages[i]);
+		if (folio == prev_folio)
 			continue;
-		prev_head = head;
+		prev_folio = folio;
 		/*
 		 * If we get a movable page, since we are going to be pinning
 		 * these entries, try to move them out if possible.
 		 */
-		if (!is_pinnable_page(head)) {
-			if (PageHuge(head)) {
-				if (!isolate_huge_page(head, &movable_page_list))
+		if (!is_pinnable_page(&folio->page)) {
+			if (folio_test_hugetlb(folio)) {
+				if (!isolate_huge_page(&folio->page,
+							&movable_page_list))
 					isolation_error_count++;
 			} else {
-				if (!PageLRU(head) && drain_allow) {
+				if (!folio_test_lru(folio) && drain_allow) {
 					lru_add_drain_all();
 					drain_allow = false;
 				}
 
-				if (isolate_lru_page(head)) {
+				if (folio_isolate_lru(folio)) {
 					isolation_error_count++;
 					continue;
 				}
-				list_add_tail(&head->lru, &movable_page_list);
-				mod_node_page_state(page_pgdat(head),
+				list_add_tail(&folio->lru, &movable_page_list);
+				node_stat_mod_folio(folio,
 						    NR_ISOLATED_ANON +
-						    page_is_file_lru(head),
-						    thp_nr_pages(head));
+						    folio_is_file_lru(folio),
+						    folio_nr_pages(folio));
 			}
 		}
 	}
-- 
2.34.1




* [PATCH 29/75] mm/workingset: Convert workingset_eviction() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (27 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 28/75] mm/gup: Convert check_and_migrate_movable_pages() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:49   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 30/75] mm/memcg: Convert mem_cgroup_swapout() " Matthew Wilcox (Oracle)
                   ` (47 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This removes an assumption that THPs are the only kind of compound
pages and removes a few hidden calls to compound_head().
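
For illustration only (not part of this patch): the pairing the shadow
entry serves, using calls visible in this series.

/*
 * Reclaim stores the shadow in the folio's old page-cache slot:
 *
 *	shadow = workingset_eviction(folio, target_memcg);
 *	__filemap_remove_folio(folio, shadow);
 *
 * and a later fault on the same offset passes it back so the
 * refault distance can be computed:
 *
 *	workingset_refault(new_folio, shadow);
 */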

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  2 +-
 mm/vmscan.c          |  7 ++++---
 mm/workingset.c      | 25 +++++++++++++------------
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1d38d9475c4d..de36f140227e 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -328,7 +328,7 @@ static inline swp_entry_t folio_swap_entry(struct folio *folio)
 
 /* linux/mm/workingset.c */
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
-void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
+void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg);
 void workingset_refault(struct folio *folio, void *shadow);
 void workingset_activation(struct folio *folio);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e0cc5f0cb999..75223b7d98ec 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1240,6 +1240,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping)
 static int __remove_mapping(struct address_space *mapping, struct page *page,
 			    bool reclaimed, struct mem_cgroup *target_memcg)
 {
+	struct folio *folio = page_folio(page);
 	int refcount;
 	void *shadow = NULL;
 
@@ -1287,7 +1288,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 		swp_entry_t swap = { .val = page_private(page) };
 		mem_cgroup_swapout(page, swap);
 		if (reclaimed && !mapping_exiting(mapping))
-			shadow = workingset_eviction(page, target_memcg);
+			shadow = workingset_eviction(folio, target_memcg);
 		__delete_from_swap_cache(page, swap, shadow);
 		xa_unlock_irq(&mapping->i_pages);
 		put_swap_page(page, swap);
@@ -1313,8 +1314,8 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 		 */
 		if (reclaimed && page_is_file_lru(page) &&
 		    !mapping_exiting(mapping) && !dax_mapping(mapping))
-			shadow = workingset_eviction(page, target_memcg);
-		__delete_from_page_cache(page, shadow);
+			shadow = workingset_eviction(folio, target_memcg);
+		__filemap_remove_folio(folio, shadow);
 		xa_unlock_irq(&mapping->i_pages);
 		if (mapping_shrinkable(mapping))
 			inode_add_lru(mapping->host);
diff --git a/mm/workingset.c b/mm/workingset.c
index 8c03afe1d67c..b717eae4e0dd 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -245,31 +245,32 @@ void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages)
 }
 
 /**
- * workingset_eviction - note the eviction of a page from memory
+ * workingset_eviction - note the eviction of a folio from memory
  * @target_memcg: the cgroup that is causing the reclaim
- * @page: the page being evicted
+ * @folio: the folio being evicted
  *
- * Return: a shadow entry to be stored in @page->mapping->i_pages in place
- * of the evicted @page so that a later refault can be detected.
+ * Return: a shadow entry to be stored in @folio->mapping->i_pages in place
+ * of the evicted @folio so that a later refault can be detected.
  */
-void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
+void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg)
 {
-	struct pglist_data *pgdat = page_pgdat(page);
+	struct pglist_data *pgdat = folio_pgdat(folio);
 	unsigned long eviction;
 	struct lruvec *lruvec;
 	int memcgid;
 
-	/* Page is fully exclusive and pins page's memory cgroup pointer */
-	VM_BUG_ON_PAGE(PageLRU(page), page);
-	VM_BUG_ON_PAGE(page_count(page), page);
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	/* Folio is fully exclusive and pins folio's memory cgroup pointer */
+	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
+	VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
+	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
 	/* XXX: target_memcg can be NULL, go through lruvec */
 	memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
 	eviction = atomic_long_read(&lruvec->nonresident_age);
-	workingset_age_nonresident(lruvec, thp_nr_pages(page));
-	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
+	workingset_age_nonresident(lruvec, folio_nr_pages(folio));
+	return pack_shadow(memcgid, pgdat, eviction,
+				folio_test_workingset(folio));
 }
 
 /**
-- 
2.34.1




* [PATCH 30/75] mm/memcg: Convert mem_cgroup_swapout() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (28 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 29/75] mm/workingset: Convert workingset_eviction() to take " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:49   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 31/75] mm: Add lru_to_folio() Matthew Wilcox (Oracle)
                   ` (46 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This removes an assumption that THPs are the only kind of compound
pages and removes a couple of hidden calls to compound_head().  It
also documents that you can't pass a tail page to mem_cgroup_swapout().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  4 ++--
 mm/memcontrol.c      | 22 +++++++++++-----------
 mm/vmscan.c          |  2 +-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index de36f140227e..c9949c3bad20 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -741,7 +741,7 @@ static inline void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask)
 #endif
 
 #ifdef CONFIG_MEMCG_SWAP
-extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
+extern void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry);
 extern int __mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
 static inline int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry)
 {
@@ -761,7 +761,7 @@ static inline void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_p
 extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
 extern bool mem_cgroup_swap_full(struct page *page);
 #else
-static inline void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
+static inline void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
 {
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 09d342c7cbd0..326df978cedf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7121,19 +7121,19 @@ static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
 
 /**
  * mem_cgroup_swapout - transfer a memsw charge to swap
- * @page: page whose memsw charge to transfer
+ * @folio: folio whose memsw charge to transfer
  * @entry: swap entry to move the charge to
  *
- * Transfer the memsw charge of @page to @entry.
+ * Transfer the memsw charge of @folio to @entry.
  */
-void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
+void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
 {
 	struct mem_cgroup *memcg, *swap_memcg;
 	unsigned int nr_entries;
 	unsigned short oldid;
 
-	VM_BUG_ON_PAGE(PageLRU(page), page);
-	VM_BUG_ON_PAGE(page_count(page), page);
+	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
+	VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
 
 	if (mem_cgroup_disabled())
 		return;
@@ -7141,9 +7141,9 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		return;
 
-	memcg = page_memcg(page);
+	memcg = folio_memcg(folio);
 
-	VM_WARN_ON_ONCE_PAGE(!memcg, page);
+	VM_WARN_ON_ONCE_FOLIO(!memcg, folio);
 	if (!memcg)
 		return;
 
@@ -7153,16 +7153,16 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	 * ancestor for the swap instead and transfer the memory+swap charge.
 	 */
 	swap_memcg = mem_cgroup_id_get_online(memcg);
-	nr_entries = thp_nr_pages(page);
+	nr_entries = folio_nr_pages(folio);
 	/* Get references for the tail pages, too */
 	if (nr_entries > 1)
 		mem_cgroup_id_get_many(swap_memcg, nr_entries - 1);
 	oldid = swap_cgroup_record(entry, mem_cgroup_id(swap_memcg),
 				   nr_entries);
-	VM_BUG_ON_PAGE(oldid, page);
+	VM_BUG_ON_FOLIO(oldid, folio);
 	mod_memcg_state(swap_memcg, MEMCG_SWAP, nr_entries);
 
-	page->memcg_data = 0;
+	folio->memcg_data = 0;
 
 	if (!mem_cgroup_is_root(memcg))
 		page_counter_uncharge(&memcg->memory, nr_entries);
@@ -7181,7 +7181,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	 */
 	VM_BUG_ON(!irqs_disabled());
 	mem_cgroup_charge_statistics(memcg, -nr_entries);
-	memcg_check_events(memcg, page_to_nid(page));
+	memcg_check_events(memcg, folio_nid(folio));
 
 	css_put(&memcg->css);
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 75223b7d98ec..08dcb1897f58 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1286,7 +1286,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 
 	if (PageSwapCache(page)) {
 		swp_entry_t swap = { .val = page_private(page) };
-		mem_cgroup_swapout(page, swap);
+		mem_cgroup_swapout(folio, swap);
 		if (reclaimed && !mapping_exiting(mapping))
 			shadow = workingset_eviction(folio, target_memcg);
 		__delete_from_swap_cache(page, swap, shadow);
-- 
2.34.1




* [PATCH 31/75] mm: Add lru_to_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (29 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 30/75] mm/memcg: Convert mem_cgroup_swapout() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:50   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 32/75] mm: Turn putback_lru_page() into folio_putback_lru() Matthew Wilcox (Oracle)
                   ` (45 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Since page->lru occupies the same bytes as page->compound_head, any page
on the LRU list must be a folio.
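
A minimal sketch of the intended use, mirroring how shrink_page_list() is
converted later in the series (hypothetical caller):

	LIST_HEAD(folio_list);		/* folios chained through ->lru */

	/* ... fill the list, e.g. by isolating folios ... */

	while (!list_empty(&folio_list)) {
		struct folio *folio = lru_to_folio(&folio_list);

		list_del(&folio->lru);
		/* ... operate on the folio ... */
	}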

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3ca6dea4fe4a..6cb2651eccbe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -229,6 +229,7 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
 #define PAGE_ALIGNED(addr)	IS_ALIGNED((unsigned long)(addr), PAGE_SIZE)
 
 #define lru_to_page(head) (list_entry((head)->prev, struct page, lru))
+#define lru_to_folio(head) (list_entry((head)->prev, struct folio, lru))
 
 void setup_initial_init_mm(void *start_code, void *end_code,
 			   void *end_data, void *brk);
-- 
2.34.1




* [PATCH 32/75] mm: Turn putback_lru_page() into folio_putback_lru()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (30 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 31/75] mm: Add lru_to_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:50   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 33/75] mm/vmscan: Convert __remove_mapping() to take a folio Matthew Wilcox (Oracle)
                   ` (44 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add a putback_lru_page() compatibility wrapper around the new
folio_putback_lru().  This removes a couple of compound_head() calls.
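
A sketch of the usual pairing with folio_isolate_lru() (hypothetical caller;
assumes folio_isolate_lru() keeps isolate_lru_page()'s 0-on-success return
convention):

	if (folio_isolate_lru(folio) == 0) {
		/* The folio is off its LRU list and we hold the extra
		 * reference taken by isolation. */

		/* ... do something with the isolated folio ... */

		/* Return it to the appropriate LRU list and drop the
		 * isolation reference. */
		folio_putback_lru(folio);
	}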

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/folio-compat.c |  5 +++++
 mm/internal.h     |  3 ++-
 mm/vmscan.c       | 16 ++++++++--------
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index a4a7725f4486..46fa179e32fb 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -159,3 +159,8 @@ int isolate_lru_page(struct page *page)
 		return -EBUSY;
 	return folio_isolate_lru((struct folio *)page);
 }
+
+void putback_lru_page(struct page *page)
+{
+	folio_putback_lru(page_folio(page));
+}
diff --git a/mm/internal.h b/mm/internal.h
index 8b0249909b06..b7a2195c12b1 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -159,7 +159,8 @@ extern unsigned long highest_memmap_pfn;
  */
 int isolate_lru_page(struct page *page);
 int folio_isolate_lru(struct folio *folio);
-extern void putback_lru_page(struct page *page);
+void putback_lru_page(struct page *page);
+void folio_putback_lru(struct folio *folio);
 extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason);
 
 /*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 08dcb1897f58..9f11960b1db8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1355,18 +1355,18 @@ int remove_mapping(struct address_space *mapping, struct page *page)
 }
 
 /**
- * putback_lru_page - put previously isolated page onto appropriate LRU list
- * @page: page to be put back to appropriate lru list
+ * folio_putback_lru - Put previously isolated folio onto appropriate LRU list.
+ * @folio: Folio to be returned to an LRU list.
  *
- * Add previously isolated @page to appropriate LRU list.
- * Page may still be unevictable for other reasons.
+ * Add previously isolated @folio to appropriate LRU list.
+ * The folio may still be unevictable for other reasons.
  *
- * lru_lock must not be held, interrupts must be enabled.
+ * Context: lru_lock must not be held, interrupts must be enabled.
  */
-void putback_lru_page(struct page *page)
+void folio_putback_lru(struct folio *folio)
 {
-	lru_cache_add(page);
-	put_page(page);		/* drop ref from isolate */
+	folio_add_lru(folio);
+	folio_put(folio);		/* drop ref from isolate */
 }
 
 enum page_references {
-- 
2.34.1




* [PATCH 33/75] mm/vmscan: Convert __remove_mapping() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (31 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 32/75] mm: Turn putback_lru_page() into folio_putback_lru() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:51   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback() Matthew Wilcox (Oracle)
                   ` (43 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This removes a few hidden calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmscan.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9f11960b1db8..15cbfae0d8ec 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1237,17 +1237,16 @@ static pageout_t pageout(struct page *page, struct address_space *mapping)
  * Same as remove_mapping, but if the page is removed from the mapping, it
  * gets returned with a refcount of 0.
  */
-static int __remove_mapping(struct address_space *mapping, struct page *page,
+static int __remove_mapping(struct address_space *mapping, struct folio *folio,
 			    bool reclaimed, struct mem_cgroup *target_memcg)
 {
-	struct folio *folio = page_folio(page);
 	int refcount;
 	void *shadow = NULL;
 
-	BUG_ON(!PageLocked(page));
-	BUG_ON(mapping != page_mapping(page));
+	BUG_ON(!folio_test_locked(folio));
+	BUG_ON(mapping != folio_mapping(folio));
 
-	if (!PageSwapCache(page))
+	if (!folio_test_swapcache(folio))
 		spin_lock(&mapping->host->i_lock);
 	xa_lock_irq(&mapping->i_pages);
 	/*
@@ -1275,23 +1274,23 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 	 * Note that if SetPageDirty is always performed via set_page_dirty,
 	 * and thus under the i_pages lock, then this ordering is not required.
 	 */
-	refcount = 1 + compound_nr(page);
-	if (!page_ref_freeze(page, refcount))
+	refcount = 1 + folio_nr_pages(folio);
+	if (!folio_ref_freeze(folio, refcount))
 		goto cannot_free;
 	/* note: atomic_cmpxchg in page_ref_freeze provides the smp_rmb */
-	if (unlikely(PageDirty(page))) {
-		page_ref_unfreeze(page, refcount);
+	if (unlikely(folio_test_dirty(folio))) {
+		folio_ref_unfreeze(folio, refcount);
 		goto cannot_free;
 	}
 
-	if (PageSwapCache(page)) {
-		swp_entry_t swap = { .val = page_private(page) };
+	if (folio_test_swapcache(folio)) {
+		swp_entry_t swap = folio_swap_entry(folio);
 		mem_cgroup_swapout(folio, swap);
 		if (reclaimed && !mapping_exiting(mapping))
 			shadow = workingset_eviction(folio, target_memcg);
-		__delete_from_swap_cache(page, swap, shadow);
+		__delete_from_swap_cache(&folio->page, swap, shadow);
 		xa_unlock_irq(&mapping->i_pages);
-		put_swap_page(page, swap);
+		put_swap_page(&folio->page, swap);
 	} else {
 		void (*freepage)(struct page *);
 
@@ -1312,7 +1311,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 		 * exceptional entries and shadow exceptional entries in the
 		 * same address_space.
 		 */
-		if (reclaimed && page_is_file_lru(page) &&
+		if (reclaimed && folio_is_file_lru(folio) &&
 		    !mapping_exiting(mapping) && !dax_mapping(mapping))
 			shadow = workingset_eviction(folio, target_memcg);
 		__filemap_remove_folio(folio, shadow);
@@ -1322,14 +1321,14 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 		spin_unlock(&mapping->host->i_lock);
 
 		if (freepage != NULL)
-			freepage(page);
+			freepage(&folio->page);
 	}
 
 	return 1;
 
 cannot_free:
 	xa_unlock_irq(&mapping->i_pages);
-	if (!PageSwapCache(page))
+	if (!folio_test_swapcache(folio))
 		spin_unlock(&mapping->host->i_lock);
 	return 0;
 }
@@ -1342,13 +1341,14 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
  */
 int remove_mapping(struct address_space *mapping, struct page *page)
 {
-	if (__remove_mapping(mapping, page, false, NULL)) {
+	struct folio *folio = page_folio(page);
+	if (__remove_mapping(mapping, folio, false, NULL)) {
 		/*
 		 * Unfreezing the refcount with 1 rather than 2 effectively
 		 * drops the pagecache ref for us without requiring another
 		 * atomic operation.
 		 */
-		page_ref_unfreeze(page, 1);
+		folio_ref_unfreeze(folio, 1);
 		return 1;
 	}
 	return 0;
@@ -1530,14 +1530,16 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 	while (!list_empty(page_list)) {
 		struct address_space *mapping;
 		struct page *page;
+		struct folio *folio;
 		enum page_references references = PAGEREF_RECLAIM;
 		bool dirty, writeback, may_enter_fs;
 		unsigned int nr_pages;
 
 		cond_resched();
 
-		page = lru_to_page(page_list);
-		list_del(&page->lru);
+		folio = lru_to_folio(page_list);
+		list_del(&folio->lru);
+		page = &folio->page;
 
 		if (!trylock_page(page))
 			goto keep;
@@ -1890,7 +1892,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			 */
 			count_vm_event(PGLAZYFREED);
 			count_memcg_page_event(page, PGLAZYFREED);
-		} else if (!mapping || !__remove_mapping(mapping, page, true,
+		} else if (!mapping || !__remove_mapping(mapping, folio, true,
 							 sc->target_mem_cgroup))
 			goto keep_locked;
 
-- 
2.34.1




* [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (32 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 33/75] mm/vmscan: Convert __remove_mapping() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:51   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 35/75] mm: Turn head_compound_mapcount() into folio_entire_mapcount() Matthew Wilcox (Oracle)
                   ` (42 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Saves a few calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmscan.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 15cbfae0d8ec..e8c5855bc38d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1430,7 +1430,7 @@ static enum page_references page_check_references(struct page *page,
 }
 
 /* Check if a page is dirty or under writeback */
-static void page_check_dirty_writeback(struct page *page,
+static void folio_check_dirty_writeback(struct folio *folio,
 				       bool *dirty, bool *writeback)
 {
 	struct address_space *mapping;
@@ -1439,24 +1439,24 @@ static void page_check_dirty_writeback(struct page *page,
 	 * Anonymous pages are not handled by flushers and must be written
 	 * from reclaim context. Do not stall reclaim based on them
 	 */
-	if (!page_is_file_lru(page) ||
-	    (PageAnon(page) && !PageSwapBacked(page))) {
+	if (!folio_is_file_lru(folio) ||
+	    (folio_test_anon(folio) && !folio_test_swapbacked(folio))) {
 		*dirty = false;
 		*writeback = false;
 		return;
 	}
 
-	/* By default assume that the page flags are accurate */
-	*dirty = PageDirty(page);
-	*writeback = PageWriteback(page);
+	/* By default assume that the folio flags are accurate */
+	*dirty = folio_test_dirty(folio);
+	*writeback = folio_test_writeback(folio);
 
 	/* Verify dirty/writeback state if the filesystem supports it */
-	if (!page_has_private(page))
+	if (!folio_test_private(folio))
 		return;
 
-	mapping = page_mapping(page);
+	mapping = folio_mapping(folio);
 	if (mapping && mapping->a_ops->is_dirty_writeback)
-		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
+		mapping->a_ops->is_dirty_writeback(&folio->page, dirty, writeback);
 }
 
 static struct page *alloc_demote_page(struct page *page, unsigned long node)
@@ -1565,7 +1565,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		 * reclaim_congested. kswapd will stall and start writing
 		 * pages if the tail of the LRU is all dirty unqueued pages.
 		 */
-		page_check_dirty_writeback(page, &dirty, &writeback);
+		folio_check_dirty_writeback(folio, &dirty, &writeback);
 		if (dirty || writeback)
 			stat->nr_dirty++;
 
-- 
2.34.1




* [PATCH 35/75] mm: Turn head_compound_mapcount() into folio_entire_mapcount()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (33 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:52   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 36/75] mm: Add folio_mapcount() Matthew Wilcox (Oracle)
                   ` (41 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Adjust the documentation to be clearer.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 17 +++++++++++------
 mm/debug.c         |  6 ++++--
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6cb2651eccbe..6ddf655f9279 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -777,21 +777,26 @@ static inline int is_vmalloc_or_module_addr(const void *x)
 }
 #endif
 
-static inline int head_compound_mapcount(struct page *head)
+/*
+ * How many times the entire folio is mapped as a single unit (eg by a
+ * PMD or PUD entry).  This is probably not what you want, except for
+ * debugging purposes; look at folio_mapcount() or page_mapcount()
+ * instead.
+ */
+static inline int folio_entire_mapcount(struct folio *folio)
 {
-	return atomic_read(compound_mapcount_ptr(head)) + 1;
+	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
+	return atomic_read(folio_mapcount_ptr(folio)) + 1;
 }
 
 /*
  * Mapcount of compound page as a whole, does not include mapped sub-pages.
  *
- * Must be called only for compound pages or any their tail sub-pages.
+ * Must be called only for compound pages.
  */
 static inline int compound_mapcount(struct page *page)
 {
-	VM_BUG_ON_PAGE(!PageCompound(page), page);
-	page = compound_head(page);
-	return head_compound_mapcount(page);
+	return folio_entire_mapcount(page_folio(page));
 }
 
 /*
diff --git a/mm/debug.c b/mm/debug.c
index c4cf44266430..eeb7ea3ca292 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -48,7 +48,8 @@ const struct trace_print_flags vmaflag_names[] = {
 
 static void __dump_page(struct page *page)
 {
-	struct page *head = compound_head(page);
+	struct folio *folio = page_folio(page);
+	struct page *head = &folio->page;
 	struct address_space *mapping;
 	bool compound = PageCompound(page);
 	/*
@@ -76,6 +77,7 @@ static void __dump_page(struct page *page)
 		else
 			mapping = (void *)(tmp & ~PAGE_MAPPING_FLAGS);
 		head = page;
+		folio = (struct folio *)page;
 		compound = false;
 	} else {
 		mapping = page_mapping(page);
@@ -94,7 +96,7 @@ static void __dump_page(struct page *page)
 	if (compound) {
 		pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
 				head, compound_order(head),
-				head_compound_mapcount(head),
+				folio_entire_mapcount(folio),
 				head_compound_pincount(head));
 	}
 
-- 
2.34.1




* [PATCH 36/75] mm: Add folio_mapcount()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (34 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 35/75] mm: Turn head_compound_mapcount() into folio_entire_mapcount() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:53   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 37/75] mm: Add split_folio_to_list() Matthew Wilcox (Oracle)
                   ` (40 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This implements the same algorithm as total_mapcount(), which is
transformed into a wrapper function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h |  8 +++++++-
 mm/huge_memory.c   | 24 ------------------------
 mm/util.c          | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6ddf655f9279..6a19cd97d5aa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -826,8 +826,14 @@ static inline int page_mapcount(struct page *page)
 	return atomic_read(&page->_mapcount) + 1;
 }
 
+int folio_mapcount(struct folio *folio);
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-int total_mapcount(struct page *page);
+static inline int total_mapcount(struct page *page)
+{
+	return folio_mapcount(page_folio(page));
+}
+
 int page_trans_huge_mapcount(struct page *page);
 #else
 static inline int total_mapcount(struct page *page)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 406a3c28c026..94e591d638eb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2494,30 +2494,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	}
 }
 
-int total_mapcount(struct page *page)
-{
-	int i, compound, nr, ret;
-
-	VM_BUG_ON_PAGE(PageTail(page), page);
-
-	if (likely(!PageCompound(page)))
-		return atomic_read(&page->_mapcount) + 1;
-
-	compound = compound_mapcount(page);
-	nr = compound_nr(page);
-	if (PageHuge(page))
-		return compound;
-	ret = compound;
-	for (i = 0; i < nr; i++)
-		ret += atomic_read(&page[i]._mapcount) + 1;
-	/* File pages has compound_mapcount included in _mapcount */
-	if (!PageAnon(page))
-		return ret - compound * nr;
-	if (PageDoubleMap(page))
-		ret -= nr;
-	return ret;
-}
-
 /*
  * This calculates accurately how many mappings a transparent hugepage
  * has (unlike page_mapcount() which isn't fully accurate). This full
diff --git a/mm/util.c b/mm/util.c
index 7e43369064c8..b614f423aaa4 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -740,6 +740,39 @@ int __page_mapcount(struct page *page)
 }
 EXPORT_SYMBOL_GPL(__page_mapcount);
 
+/**
+ * folio_mapcount() - Calculate the number of mappings of this folio.
+ * @folio: The folio.
+ *
+ * A large folio tracks both how many times the entire folio is mapped,
+ * and how many times each individual page in the folio is mapped.
+ * This function calculates the total number of times the folio is
+ * mapped.
+ *
+ * Return: The number of times this folio is mapped.
+ */
+int folio_mapcount(struct folio *folio)
+{
+	int i, compound, nr, ret;
+
+	if (likely(!folio_test_large(folio)))
+		return atomic_read(&folio->_mapcount) + 1;
+
+	compound = folio_entire_mapcount(folio);
+	nr = folio_nr_pages(folio);
+	if (folio_test_hugetlb(folio))
+		return compound;
+	ret = compound;
+	for (i = 0; i < nr; i++)
+		ret += atomic_read(&folio_page(folio, i)->_mapcount) + 1;
+	/* File pages has compound_mapcount included in _mapcount */
+	if (!folio_test_anon(folio))
+		return ret - compound * nr;
+	if (folio_test_double_map(folio))
+		ret -= nr;
+	return ret;
+}
+
 /**
  * folio_copy - Copy the contents of one folio to another.
  * @dst: Folio to copy to.
-- 
2.34.1




* [PATCH 37/75] mm: Add split_folio_to_list()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (35 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 36/75] mm: Add folio_mapcount() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:54   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 38/75] mm: Add folio_is_zone_device() and folio_is_device_private() Matthew Wilcox (Oracle)
                   ` (39 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is a convenience function: split_huge_page_to_list() can take
any page in a folio (and does so on purpose, because that page will
be the one which keeps the refcount), but it is more convenient for
the callers to pass the folio rather than its first page.
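
A sketch of a caller, mirroring the vmscan usage below (assumes the usual
split_huge_page_to_list() semantics: 0 on success, with the former tail
pages added to the given list as individual folios):

	if (folio_test_large(folio)) {
		if (split_folio_to_list(folio, folio_list))
			goto keep_locked;	/* split failed, e.g. extra pins */
		/* @folio is now order-0; its former tails are on @folio_list */
	}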

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/huge_mm.h |  6 ++++++
 mm/vmscan.c             | 10 +++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e4c18ba8d3bf..71c073d411ac 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -483,6 +483,12 @@ static inline bool thp_migration_supported(void)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+static inline int split_folio_to_list(struct folio *folio,
+		struct list_head *list)
+{
+	return split_huge_page_to_list(&folio->page, list);
+}
+
 /**
  * thp_size - Size of a transparent huge page.
  * @page: Head page of a transparent huge page.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e8c5855bc38d..0d23ade9f6e2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1708,16 +1708,16 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 					 * tail pages can be freed without IO.
 					 */
 					if (!compound_mapcount(page) &&
-					    split_huge_page_to_list(page,
-								    page_list))
+					    split_folio_to_list(folio,
+								page_list))
 						goto activate_locked;
 				}
 				if (!add_to_swap(page)) {
 					if (!PageTransHuge(page))
 						goto activate_locked_split;
 					/* Fallback to swap normal pages */
-					if (split_huge_page_to_list(page,
-								    page_list))
+					if (split_folio_to_list(folio,
+								page_list))
 						goto activate_locked;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 					count_vm_event(THP_SWPOUT_FALLBACK);
@@ -1733,7 +1733,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			}
 		} else if (unlikely(PageTransHuge(page))) {
 			/* Split file THP */
-			if (split_huge_page_to_list(page, page_list))
+			if (split_folio_to_list(folio, page_list))
 				goto keep_locked;
 		}
 
-- 
2.34.1




* [PATCH 38/75] mm: Add folio_is_zone_device() and folio_is_device_private()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (36 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 37/75] mm: Add split_folio_to_list() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:54   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 39/75] mm: Add folio_pgoff() Matthew Wilcox (Oracle)
                   ` (38 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These two wrappers are the equivalent of is_zone_device_page()
and is_device_private_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6a19cd97d5aa..028bd9336e82 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1096,6 +1096,11 @@ static inline bool is_zone_device_page(const struct page *page)
 }
 #endif
 
+static inline bool folio_is_zone_device(const struct folio *folio)
+{
+	return is_zone_device_page(&folio->page);
+}
+
 static inline bool is_zone_movable_page(const struct page *page)
 {
 	return page_zonenum(page) == ZONE_MOVABLE;
@@ -1142,6 +1147,11 @@ static inline bool is_device_private_page(const struct page *page)
 		page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool folio_is_device_private(const struct folio *folio)
+{
+	return is_device_private_page(&folio->page);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
 	return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
-- 
2.34.1




* [PATCH 39/75] mm: Add folio_pgoff()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (37 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 38/75] mm: Add folio_is_zone_device() and folio_is_device_private() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:55   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 40/75] mm: Add pvmw_set_page() and pvmw_set_folio() Matthew Wilcox (Oracle)
                   ` (37 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is the folio equivalent of page_to_pgoff().
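
For illustration, a sketch of the kind of arithmetic this feeds (a
hypothetical helper mirroring vma_address(), ignoring the bounds checks the
real function performs):

	/* First user virtual address at which @folio would appear in @vma. */
	static unsigned long example_folio_address(struct folio *folio,
			struct vm_area_struct *vma)
	{
		pgoff_t pgoff = folio_pgoff(folio);

		return vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
	}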

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index cdb3f118603a..dddd660da24f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -703,6 +703,17 @@ static inline loff_t folio_file_pos(struct folio *folio)
 	return page_file_offset(&folio->page);
 }
 
+/*
+ * Get the offset in PAGE_SIZE (even for hugetlb folios).
+ * (TODO: hugetlb folios should have ->index in PAGE_SIZE)
+ */
+static inline pgoff_t folio_pgoff(struct folio *folio)
+{
+	if (unlikely(folio_test_hugetlb(folio)))
+		return hugetlb_basepage_index(&folio->page);
+	return folio->index;
+}
+
 extern pgoff_t linear_hugepage_index(struct vm_area_struct *vma,
 				     unsigned long address);
 
-- 
2.34.1




* [PATCH 40/75] mm: Add pvmw_set_page() and pvmw_set_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (38 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 39/75] mm: Add folio_pgoff() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:55   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 41/75] hexagon: Add pmd_pfn() Matthew Wilcox (Oracle)
                   ` (36 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Instead of setting the page directly in struct page_vma_mapped_walk,
use this helper to allow us to transition to a PFN approach in the
next patch.
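
A sketch of the resulting pattern (hypothetical rmap-style callback;
pvmw_set_folio() is the variant a fully folio-based caller would use):

	struct page_vma_mapped_walk pvmw = {
		.vma = vma,
		.address = address,
	};

	pvmw_set_folio(&pvmw, folio);
	while (page_vma_mapped_walk(&pvmw)) {
		/* pvmw.pte (or pvmw.pmd) now identifies one mapping of
		 * the folio in this vma ... */
	}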

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h    | 12 ++++++++++++
 kernel/events/uprobes.c |  2 +-
 mm/damon/paddr.c        |  4 ++--
 mm/ksm.c                |  2 +-
 mm/migrate.c            |  2 +-
 mm/page_idle.c          |  2 +-
 mm/rmap.c               | 12 ++++++------
 7 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index e704b1a4c06c..e076aca3a203 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -213,6 +213,18 @@ struct page_vma_mapped_walk {
 	unsigned int flags;
 };
 
+static inline void pvmw_set_page(struct page_vma_mapped_walk *pvmw,
+		struct page *page)
+{
+	pvmw->page = page;
+}
+
+static inline void pvmw_set_folio(struct page_vma_mapped_walk *pvmw,
+		struct folio *folio)
+{
+	pvmw->page = &folio->page;
+}
+
 static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 {
 	/* HugeTLB pte is set to the relevant page table entry without pte_mapped. */
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 6357c3580d07..5f74671b0066 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -156,13 +156,13 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
-		.page = compound_head(old_page),
 		.vma = vma,
 		.address = addr,
 	};
 	int err;
 	struct mmu_notifier_range range;
 
+	pvmw_set_page(&pvmw, compound_head(old_page));
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, addr,
 				addr + PAGE_SIZE);
 
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 5e8244f65a1a..4e27d64abbb7 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -20,11 +20,11 @@ static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma,
 		unsigned long addr, void *arg)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = addr,
 	};
 
+	pvmw_set_page(&pvmw, page);
 	while (page_vma_mapped_walk(&pvmw)) {
 		addr = pvmw.address;
 		if (pvmw.pte)
@@ -94,11 +94,11 @@ static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma,
 {
 	struct damon_pa_access_chk_result *result = arg;
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = addr,
 	};
 
+	pvmw_set_page(&pvmw, page);
 	result->accessed = false;
 	result->page_sz = PAGE_SIZE;
 	while (page_vma_mapped_walk(&pvmw)) {
diff --git a/mm/ksm.c b/mm/ksm.c
index c20bd4d9a0d9..1639160c9e9a 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1035,13 +1035,13 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 	};
 	int swapped;
 	int err = -EFAULT;
 	struct mmu_notifier_range range;
 
+	pvmw_set_page(&pvmw, page);
 	pvmw.address = page_address_in_vma(page, vma);
 	if (pvmw.address == -EFAULT)
 		goto out;
diff --git a/mm/migrate.c b/mm/migrate.c
index c7da064b4781..07464fd45925 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -177,7 +177,6 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 				 unsigned long addr, void *old)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.page = old,
 		.vma = vma,
 		.address = addr,
 		.flags = PVMW_SYNC | PVMW_MIGRATION,
@@ -187,6 +186,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 	swp_entry_t entry;
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
+	pvmw_set_page(&pvmw, old);
 	while (page_vma_mapped_walk(&pvmw)) {
 		if (PageKsm(page))
 			new = page;
diff --git a/mm/page_idle.c b/mm/page_idle.c
index edead6a8a5f9..20d35d720872 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -49,12 +49,12 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
 					unsigned long addr, void *arg)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = addr,
 	};
 	bool referenced = false;
 
+	pvmw_set_page(&pvmw, page);
 	while (page_vma_mapped_walk(&pvmw)) {
 		addr = pvmw.address;
 		if (pvmw.pte) {
diff --git a/mm/rmap.c b/mm/rmap.c
index a531b64d53fa..fa8478372e94 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -803,12 +803,12 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 {
 	struct page_referenced_arg *pra = arg;
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = address,
 	};
 	int referenced = 0;
 
+	pvmw_set_page(&pvmw, page);
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;
 
@@ -932,7 +932,6 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 			    unsigned long address, void *arg)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = address,
 		.flags = PVMW_SYNC,
@@ -940,6 +939,7 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	int *cleaned = arg;
 
+	pvmw_set_page(&pvmw, page);
 	/*
 	 * We have to assume the worse case ie pmd for invalidation. Note that
 	 * the page can not be free from this function.
@@ -1423,7 +1423,6 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = address,
 	};
@@ -1433,6 +1432,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
+	pvmw_set_page(&pvmw, page);
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
 	 * in between its ptep_get_and_clear_full() and page_remove_rmap(),
@@ -1723,7 +1723,6 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = address,
 	};
@@ -1733,6 +1732,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
+	pvmw_set_page(&pvmw, page);
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
 	 * in between its ptep_get_and_clear_full() and page_remove_rmap(),
@@ -2003,11 +2003,11 @@ static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
 				 unsigned long address, void *unused)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = address,
 	};
 
+	pvmw_set_page(&pvmw, page);
 	/* An un-locked vma doesn't have any pages to lock, continue the scan */
 	if (!(vma->vm_flags & VM_LOCKED))
 		return true;
@@ -2078,7 +2078,6 @@ static bool page_make_device_exclusive_one(struct page *page,
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
 		.vma = vma,
 		.address = address,
 	};
@@ -2090,6 +2089,7 @@ static bool page_make_device_exclusive_one(struct page *page,
 	swp_entry_t entry;
 	pte_t swp_pte;
 
+	pvmw_set_page(&pvmw, page);
 	mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0, vma,
 				      vma->vm_mm, address, min(vma->vm_end,
 				      address + page_size(page)), args->owner);
-- 
2.34.1




* [PATCH 41/75] hexagon: Add pmd_pfn()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (39 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 40/75] mm: Add pvmw_set_page() and pvmw_set_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-06 18:13   ` Mike Rapoport
  2022-02-04 19:58 ` [PATCH 42/75] mm: Convert page_vma_mapped_walk to work on PFNs Matthew Wilcox (Oracle)
                   ` (35 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

I need to use this function in common code, so define it for hexagon.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/hexagon/include/asm/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index 18cd6ea9ab23..87e96463ccd6 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -235,10 +235,11 @@ static inline int pmd_bad(pmd_t pmd)
 	return 0;
 }
 
+#define pmd_pfn(pmd)	(pmd_val(pmd) >> PAGE_SHIFT)
 /*
  * pmd_page - converts a PMD entry to a page pointer
  */
-#define pmd_page(pmd)  (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
+#define pmd_page(pmd)  (pfn_to_page(pmd_pfn(pmd)))
 
 /**
  * pte_none - check if pte is mapped
-- 
2.34.1




* [PATCH 42/75] mm: Convert page_vma_mapped_walk to work on PFNs
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (40 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 41/75] hexagon: Add pmd_pfn() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 43/75] mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio Matthew Wilcox (Oracle)
                   ` (34 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

page_mapped_in_vma() really just wants to walk one page, but as the
code stands, if passed the head page of a compound page, it will
walk every page in the compound page.  Extract pfn/nr_pages/pgoff
from the struct page early, so they can be overridden by
page_mapped_in_vma().
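
One detail worth calling out: with the pfn and nr_pages in the walk state,
check_pte()'s "is this pte part of our target?" test becomes a single
unsigned comparison.  A sketch of the equivalent logic (the helper name is
made up for the example):

	/* True iff pvmw->pfn <= pfn < pvmw->pfn + pvmw->nr_pages.  If
	 * pfn < pvmw->pfn, the unsigned subtraction wraps to a huge value,
	 * which can never be below nr_pages, so the lower bound is implicit. */
	static bool example_pfn_in_range(unsigned long pfn,
			const struct page_vma_mapped_walk *pvmw)
	{
		return (pfn - pvmw->pfn) < pvmw->nr_pages;
	}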

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/hugetlb.h |  5 ++++
 include/linux/rmap.h    | 17 ++++++++----
 mm/internal.h           | 15 ++++++-----
 mm/migrate.c            |  2 +-
 mm/page_vma_mapped.c    | 58 ++++++++++++++++++-----------------------
 mm/rmap.c               |  8 +++---
 6 files changed, 56 insertions(+), 49 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d1897a69c540..6ba2f8e74fbb 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -970,6 +970,11 @@ static inline struct hstate *page_hstate(struct page *page)
 	return NULL;
 }
 
+static inline struct hstate *size_to_hstate(unsigned long size)
+{
+	return NULL;
+}
+
 static inline unsigned long huge_page_size(struct hstate *h)
 {
 	return PAGE_SIZE;
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index e076aca3a203..29ea97c5e96a 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -11,6 +11,7 @@
 #include <linux/rwsem.h>
 #include <linux/memcontrol.h>
 #include <linux/highmem.h>
+#include <linux/pagemap.h>
 
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
@@ -200,11 +201,13 @@ int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
 
 /* Avoid racy checks */
 #define PVMW_SYNC		(1 << 0)
-/* Look for migarion entries rather than present PTEs */
+/* Look for migration entries rather than present PTEs */
 #define PVMW_MIGRATION		(1 << 1)
 
 struct page_vma_mapped_walk {
-	struct page *page;
+	unsigned long pfn;
+	unsigned long nr_pages;
+	pgoff_t pgoff;
 	struct vm_area_struct *vma;
 	unsigned long address;
 	pmd_t *pmd;
@@ -216,19 +219,23 @@ struct page_vma_mapped_walk {
 static inline void pvmw_set_page(struct page_vma_mapped_walk *pvmw,
 		struct page *page)
 {
-	pvmw->page = page;
+	pvmw->pfn = page_to_pfn(page);
+	pvmw->nr_pages = compound_nr(page);
+	pvmw->pgoff = page_to_pgoff(page);
 }
 
 static inline void pvmw_set_folio(struct page_vma_mapped_walk *pvmw,
 		struct folio *folio)
 {
-	pvmw->page = &folio->page;
+	pvmw->pfn = folio_pfn(folio);
+	pvmw->nr_pages = folio_nr_pages(folio);
+	pvmw->pgoff = folio_pgoff(folio);
 }
 
 static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 {
 	/* HugeTLB pte is set to the relevant page table entry without pte_mapped. */
-	if (pvmw->pte && !PageHuge(pvmw->page))
+	if (pvmw->pte && !is_vm_hugetlb_page(pvmw->vma))
 		pte_unmap(pvmw->pte);
 	if (pvmw->ptl)
 		spin_unlock(pvmw->ptl);
diff --git a/mm/internal.h b/mm/internal.h
index b7a2195c12b1..7f1db0f1a8bc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -10,6 +10,7 @@
 #include <linux/fs.h>
 #include <linux/mm.h>
 #include <linux/pagemap.h>
+#include <linux/rmap.h>
 #include <linux/tracepoint-defs.h>
 
 struct folio_batch;
@@ -459,18 +460,20 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 }
 
 /*
- * Then at what user virtual address will none of the page be found in vma?
+ * Then at what user virtual address will none of the range be found in vma?
  * Assumes that vma_address() already returned a good starting address.
- * If page is a compound head, the entire compound page is considered.
  */
-static inline unsigned long
-vma_address_end(struct page *page, struct vm_area_struct *vma)
+static inline unsigned long vma_address_end(struct page_vma_mapped_walk *pvmw)
 {
+	struct vm_area_struct *vma = pvmw->vma;
 	pgoff_t pgoff;
 	unsigned long address;
 
-	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
-	pgoff = page_to_pgoff(page) + compound_nr(page);
+	/* Common case, plus ->pgoff is invalid for KSM */
+	if (pvmw->nr_pages == 1)
+		return pvmw->address + PAGE_SIZE;
+
+	pgoff = pvmw->pgoff + pvmw->nr_pages;
 	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 	/* Check for address beyond vma (or wrapped through 0?) */
 	if (address < vma->vm_start || address > vma->vm_end)
diff --git a/mm/migrate.c b/mm/migrate.c
index 07464fd45925..766dc67874a1 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -191,7 +191,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (PageKsm(page))
 			new = page;
 		else
-			new = page - pvmw.page->index +
+			new = page - pvmw.pgoff +
 				linear_page_index(vma, pvmw.address);
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index f7b331081791..1187f9c1ec5b 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -53,18 +53,6 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
 	return true;
 }
 
-static inline bool pfn_is_match(struct page *page, unsigned long pfn)
-{
-	unsigned long page_pfn = page_to_pfn(page);
-
-	/* normal page and hugetlbfs page */
-	if (!PageTransCompound(page) || PageHuge(page))
-		return page_pfn == pfn;
-
-	/* THP can be referenced by any subpage */
-	return pfn >= page_pfn && pfn - page_pfn < thp_nr_pages(page);
-}
-
 /**
  * check_pte - check if @pvmw->page is mapped at the @pvmw->pte
  * @pvmw: page_vma_mapped_walk struct, includes a pair pte and page for checking
@@ -116,7 +104,17 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
 		pfn = pte_pfn(*pvmw->pte);
 	}
 
-	return pfn_is_match(pvmw->page, pfn);
+	return (pfn - pvmw->pfn) < pvmw->nr_pages;
+}
+
+/* Returns true if the two ranges overlap.  Careful to not overflow. */
+static bool check_pmd(unsigned long pfn, struct page_vma_mapped_walk *pvmw)
+{
+	if ((pfn + HPAGE_PMD_NR - 1) < pvmw->pfn)
+		return false;
+	if (pfn > pvmw->pfn + pvmw->nr_pages - 1)
+		return false;
+	return true;
 }
 
 static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
@@ -127,7 +125,7 @@ static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
 }
 
 /**
- * page_vma_mapped_walk - check if @pvmw->page is mapped in @pvmw->vma at
+ * page_vma_mapped_walk - check if @pvmw->pfn is mapped in @pvmw->vma at
  * @pvmw->address
  * @pvmw: pointer to struct page_vma_mapped_walk. page, vma, address and flags
  * must be set. pmd, pte and ptl must be NULL.
@@ -152,8 +150,8 @@ static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
  */
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 {
-	struct mm_struct *mm = pvmw->vma->vm_mm;
-	struct page *page = pvmw->page;
+	struct vm_area_struct *vma = pvmw->vma;
+	struct mm_struct *mm = vma->vm_mm;
 	unsigned long end;
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -164,32 +162,26 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	if (pvmw->pmd && !pvmw->pte)
 		return not_found(pvmw);
 
-	if (unlikely(PageHuge(page))) {
+	if (unlikely(is_vm_hugetlb_page(vma))) {
+		unsigned long size = pvmw->nr_pages * PAGE_SIZE;
 		/* The only possible mapping was handled on last iteration */
 		if (pvmw->pte)
 			return not_found(pvmw);
 
 		/* when pud is not present, pte will be NULL */
-		pvmw->pte = huge_pte_offset(mm, pvmw->address, page_size(page));
+		pvmw->pte = huge_pte_offset(mm, pvmw->address, size);
 		if (!pvmw->pte)
 			return false;
 
-		pvmw->ptl = huge_pte_lockptr(page_hstate(page), mm, pvmw->pte);
+		pvmw->ptl = huge_pte_lockptr(size_to_hstate(size), mm,
+						pvmw->pte);
 		spin_lock(pvmw->ptl);
 		if (!check_pte(pvmw))
 			return not_found(pvmw);
 		return true;
 	}
 
-	/*
-	 * Seek to next pte only makes sense for THP.
-	 * But more important than that optimization, is to filter out
-	 * any PageKsm page: whose page->index misleads vma_address()
-	 * and vma_address_end() to disaster.
-	 */
-	end = PageTransCompound(page) ?
-		vma_address_end(page, pvmw->vma) :
-		pvmw->address + PAGE_SIZE;
+	end = vma_address_end(pvmw);
 	if (pvmw->pte)
 		goto next_pte;
 restart:
@@ -224,7 +216,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			if (likely(pmd_trans_huge(pmde))) {
 				if (pvmw->flags & PVMW_MIGRATION)
 					return not_found(pvmw);
-				if (pmd_page(pmde) != page)
+				if (!check_pmd(pmd_pfn(pmde), pvmw))
 					return not_found(pvmw);
 				return true;
 			}
@@ -236,7 +228,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 					return not_found(pvmw);
 				entry = pmd_to_swp_entry(pmde);
 				if (!is_migration_entry(entry) ||
-				    pfn_swap_entry_to_page(entry) != page)
+				    !check_pmd(swp_offset(entry), pvmw))
 					return not_found(pvmw);
 				return true;
 			}
@@ -250,7 +242,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			 * cleared *pmd but not decremented compound_mapcount().
 			 */
 			if ((pvmw->flags & PVMW_SYNC) &&
-			    PageTransCompound(page)) {
+			    transparent_hugepage_active(vma) &&
+			    (pvmw->nr_pages >= HPAGE_PMD_NR)) {
 				spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
 
 				spin_unlock(ptl);
@@ -307,7 +300,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.page = page,
+		.pfn = page_to_pfn(page),
+		.nr_pages = 1,
 		.vma = vma,
 		.flags = PVMW_SYNC,
 	};
diff --git a/mm/rmap.c b/mm/rmap.c
index fa8478372e94..d62a6fcef318 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -946,7 +946,7 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	 */
 	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
 				0, vma, vma->vm_mm, address,
-				vma_address_end(page, vma));
+				vma_address_end(&pvmw));
 	mmu_notifier_invalidate_range_start(&range);
 
 	while (page_vma_mapped_walk(&pvmw)) {
@@ -1453,8 +1453,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	 * Note that the page can not be free in this function as call of
 	 * try_to_unmap() must hold a reference on the page.
 	 */
-	range.end = PageKsm(page) ?
-			address + PAGE_SIZE : vma_address_end(page, vma);
+	range.end = vma_address_end(&pvmw);
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address, range.end);
 	if (PageHuge(page)) {
@@ -1757,8 +1756,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 	 * Note that the page can not be free in this function as call of
 	 * try_to_unmap() must hold a reference on the page.
 	 */
-	range.end = PageKsm(page) ?
-			address + PAGE_SIZE : vma_address_end(page, vma);
+	range.end = vma_address_end(&pvmw);
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address, range.end);
 	if (PageHuge(page)) {
-- 
2.34.1




* [PATCH 43/75] mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (41 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 42/75] mm: Convert page_vma_mapped_walk to work on PFNs Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:57   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 44/75] mm/rmap: Use a folio in page_mkclean_one() Matthew Wilcox (Oracle)
                   ` (33 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The PG_idle and PG_young bits are ignored if they're set on tail
pages, so ensure we're passing a folio around.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_idle.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/mm/page_idle.c b/mm/page_idle.c
index 20d35d720872..544814bd9e37 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -13,6 +13,8 @@
 #include <linux/page_ext.h>
 #include <linux/page_idle.h>
 
+#include "internal.h"
+
 #define BITMAP_CHUNK_SIZE	sizeof(u64)
 #define BITMAP_CHUNK_BITS	(BITMAP_CHUNK_SIZE * BITS_PER_BYTE)
 
@@ -48,6 +50,7 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
 					struct vm_area_struct *vma,
 					unsigned long addr, void *arg)
 {
+	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = addr,
@@ -74,19 +77,20 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
 	}
 
 	if (referenced) {
-		clear_page_idle(page);
+		folio_clear_idle(folio);
 		/*
 		 * We cleared the referenced bit in a mapping to this page. To
 		 * avoid interference with page reclaim, mark it young so that
 		 * page_referenced() will return > 0.
 		 */
-		set_page_young(page);
+		folio_set_young(folio);
 	}
 	return true;
 }
 
 static void page_idle_clear_pte_refs(struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	/*
 	 * Since rwc.arg is unused, rwc is effectively immutable, so we
 	 * can make it static const to save some cycles and stack.
@@ -97,18 +101,17 @@ static void page_idle_clear_pte_refs(struct page *page)
 	};
 	bool need_lock;
 
-	if (!page_mapped(page) ||
-	    !page_rmapping(page))
+	if (!folio_mapped(folio) || !folio_raw_mapping(folio))
 		return;
 
-	need_lock = !PageAnon(page) || PageKsm(page);
-	if (need_lock && !trylock_page(page))
+	need_lock = !folio_test_anon(folio) || folio_test_ksm(folio);
+	if (need_lock && !folio_trylock(folio))
 		return;
 
-	rmap_walk(page, (struct rmap_walk_control *)&rwc);
+	rmap_walk(&folio->page, (struct rmap_walk_control *)&rwc);
 
 	if (need_lock)
-		unlock_page(page);
+		folio_unlock(folio);
 }
 
 static ssize_t page_idle_bitmap_read(struct file *file, struct kobject *kobj,
-- 
2.34.1




* [PATCH 44/75] mm/rmap: Use a folio in page_mkclean_one()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (42 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 43/75] mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:57   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 45/75] mm/rmap: Turn page_referenced() into folio_referenced() Matthew Wilcox (Oracle)
                   ` (32 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

folio_mkclean() already passes down a head page, so convert it
back to a folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/rmap.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index d62a6fcef318..18ae6bd79efd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -931,6 +931,7 @@ int page_referenced(struct page *page,
 static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 			    unsigned long address, void *arg)
 {
+	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = address,
@@ -942,7 +943,7 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	pvmw_set_page(&pvmw, page);
 	/*
 	 * We have to assume the worse case ie pmd for invalidation. Note that
-	 * the page can not be free from this function.
+	 * the folio can not be freed from this function.
 	 */
 	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
 				0, vma, vma->vm_mm, address,
@@ -974,14 +975,14 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
 				continue;
 
-			flush_cache_page(vma, address, page_to_pfn(page));
+			flush_cache_page(vma, address, folio_pfn(folio));
 			entry = pmdp_invalidate(vma, address, pmd);
 			entry = pmd_wrprotect(entry);
 			entry = pmd_mkclean(entry);
 			set_pmd_at(vma->vm_mm, address, pmd, entry);
 			ret = 1;
 #else
-			/* unexpected pmd-mapped page? */
+			/* unexpected pmd-mapped folio? */
 			WARN_ON_ONCE(1);
 #endif
 		}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 45/75] mm/rmap: Turn page_referenced() into folio_referenced()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (43 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 44/75] mm/rmap: Use a folio in page_mkclean_one() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07  7:58   ` Christoph Hellwig
  2022-02-04 19:58 ` [PATCH 46/75] mm/mlock: Turn clear_page_mlock() into folio_end_mlock() Matthew Wilcox (Oracle)
                   ` (31 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Both its callers pass a page which was previously on an LRU list,
so they were passing a folio by definition.  Use the type system to enforce
that and remove a few calls to compound_head().
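
As a usage sketch (the helper name check_refs() is made up here; the real
caller is page_check_references() in the vmscan hunk below), a caller that
still holds a locked page converts at the boundary:

	static int check_refs(struct page *page, struct mem_cgroup *memcg)
	{
		struct folio *folio = page_folio(page);
		unsigned long vm_flags;

		/* 1 == the caller already holds the folio lock */
		return folio_referenced(folio, 1, memcg, &vm_flags);
	}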

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h |  4 ++--
 mm/mlock.c           |  4 ++--
 mm/page_idle.c       |  2 +-
 mm/rmap.c            | 46 ++++++++++++++++++++++----------------------
 mm/vmscan.c          | 20 +++++++++++--------
 5 files changed, 40 insertions(+), 36 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 29ea97c5e96a..00b772cdaaaa 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -189,7 +189,7 @@ static inline void page_dup_rmap(struct page *page, bool compound)
 /*
  * Called from mm/vmscan.c to handle paging out
  */
-int page_referenced(struct page *, int is_locked,
+int folio_referenced(struct folio *, int is_locked,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
 
 void try_to_migrate(struct page *page, enum ttu_flags flags);
@@ -302,7 +302,7 @@ void rmap_walk_locked(struct page *page, struct rmap_walk_control *rwc);
 #define anon_vma_prepare(vma)	(0)
 #define anon_vma_link(vma)	do {} while (0)
 
-static inline int page_referenced(struct page *page, int is_locked,
+static inline int folio_referenced(struct folio *folio, int is_locked,
 				  struct mem_cgroup *memcg,
 				  unsigned long *vm_flags)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index 8f584eddd305..24d0809cacba 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -134,7 +134,7 @@ static void __munlock_isolated_page(struct page *page)
  * Performs accounting when page isolation fails in munlock. There is nothing
  * else to do because it means some other task has already removed the page
  * from the LRU. putback_lru_page() will take care of removing the page from
- * the unevictable list, if necessary. vmscan [page_referenced()] will move
+ * the unevictable list, if necessary. vmscan [folio_referenced()] will move
  * the page back to the unevictable list if some other vma has it mlocked.
  */
 static void __munlock_isolation_failed(struct page *page)
@@ -163,7 +163,7 @@ static void __munlock_isolation_failed(struct page *page)
  * task has removed the page from the LRU, we won't be able to do that.
  * So we clear the PageMlocked as we might not get another chance.  If we
  * can't isolate the page, we leave it for putback_lru_page() and vmscan
- * [page_referenced()/try_to_unmap()] to deal with.
+ * [folio_referenced()/try_to_unmap()] to deal with.
  */
 unsigned int munlock_vma_page(struct page *page)
 {
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 544814bd9e37..35e53db430df 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -81,7 +81,7 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
 		/*
 		 * We cleared the referenced bit in a mapping to this page. To
 		 * avoid interference with page reclaim, mark it young so that
-		 * page_referenced() will return > 0.
+		 * folio_referenced() will return > 0.
 		 */
 		folio_set_young(folio);
 	}
diff --git a/mm/rmap.c b/mm/rmap.c
index 18ae6bd79efd..1cedcfd6105c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -801,6 +801,7 @@ struct page_referenced_arg {
 static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 			unsigned long address, void *arg)
 {
+	struct folio *folio = page_folio(page);
 	struct page_referenced_arg *pra = arg;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -824,10 +825,10 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 				/*
 				 * Don't treat a reference through
 				 * a sequentially read mapping as such.
-				 * If the page has been used in another mapping,
+				 * If the folio has been used in another mapping,
 				 * we will catch it; if this other mapping is
 				 * already gone, the unmap path will have set
-				 * PG_referenced or activated the page.
+				 * the referenced flag or activated the folio.
 				 */
 				if (likely(!(vma->vm_flags & VM_SEQ_READ)))
 					referenced++;
@@ -837,7 +838,7 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 						pvmw.pmd))
 				referenced++;
 		} else {
-			/* unexpected pmd-mapped page? */
+			/* unexpected pmd-mapped folio? */
 			WARN_ON_ONCE(1);
 		}
 
@@ -845,8 +846,8 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 	}
 
 	if (referenced)
-		clear_page_idle(page);
-	if (test_and_clear_page_young(page))
+		folio_clear_idle(folio);
+	if (folio_test_clear_young(folio))
 		referenced++;
 
 	if (referenced) {
@@ -872,23 +873,22 @@ static bool invalid_page_referenced_vma(struct vm_area_struct *vma, void *arg)
 }
 
 /**
- * page_referenced - test if the page was referenced
- * @page: the page to test
- * @is_locked: caller holds lock on the page
+ * folio_referenced() - Test if the folio was referenced.
+ * @folio: The folio to test.
+ * @is_locked: Caller holds lock on the folio.
  * @memcg: target memory cgroup
- * @vm_flags: collect encountered vma->vm_flags who actually referenced the page
+ * @vm_flags: A combination of all the vma->vm_flags which referenced the folio.
+ *
+ * Quick test_and_clear_referenced for all mappings of a folio.
  *
- * Quick test_and_clear_referenced for all mappings to a page,
- * returns the number of ptes which referenced the page.
+ * Return: The number of mappings which referenced the folio.
  */
-int page_referenced(struct page *page,
-		    int is_locked,
-		    struct mem_cgroup *memcg,
-		    unsigned long *vm_flags)
+int folio_referenced(struct folio *folio, int is_locked,
+		     struct mem_cgroup *memcg, unsigned long *vm_flags)
 {
 	int we_locked = 0;
 	struct page_referenced_arg pra = {
-		.mapcount = total_mapcount(page),
+		.mapcount = folio_mapcount(folio),
 		.memcg = memcg,
 	};
 	struct rmap_walk_control rwc = {
@@ -901,11 +901,11 @@ int page_referenced(struct page *page,
 	if (!pra.mapcount)
 		return 0;
 
-	if (!page_rmapping(page))
+	if (!folio_raw_mapping(folio))
 		return 0;
 
-	if (!is_locked && (!PageAnon(page) || PageKsm(page))) {
-		we_locked = trylock_page(page);
+	if (!is_locked && (!folio_test_anon(folio) || folio_test_ksm(folio))) {
+		we_locked = folio_trylock(folio);
 		if (!we_locked)
 			return 1;
 	}
@@ -919,11 +919,11 @@ int page_referenced(struct page *page,
 		rwc.invalid_vma = invalid_page_referenced_vma;
 	}
 
-	rmap_walk(page, &rwc);
+	rmap_walk(&folio->page, &rwc);
 	*vm_flags = pra.vm_flags;
 
 	if (we_locked)
-		unlock_page(page);
+		folio_unlock(folio);
 
 	return pra.referenced;
 }
@@ -1058,8 +1058,8 @@ void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma)
 	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
 	/*
 	 * Ensure that anon_vma and the PAGE_MAPPING_ANON bit are written
-	 * simultaneously, so a concurrent reader (eg page_referenced()'s
-	 * PageAnon()) will not see one without the other.
+	 * simultaneously, so a concurrent reader (eg folio_referenced()'s
+	 * folio_test_anon()) will not see one without the other.
 	 */
 	WRITE_ONCE(page->mapping, (struct address_space *) anon_vma);
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0d23ade9f6e2..1e751ba3b4a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1379,11 +1379,12 @@ enum page_references {
 static enum page_references page_check_references(struct page *page,
 						  struct scan_control *sc)
 {
+	struct folio *folio = page_folio(page);
 	int referenced_ptes, referenced_page;
 	unsigned long vm_flags;
 
-	referenced_ptes = page_referenced(page, 1, sc->target_mem_cgroup,
-					  &vm_flags);
+	referenced_ptes = folio_referenced(folio, 1, sc->target_mem_cgroup,
+					   &vm_flags);
 	referenced_page = TestClearPageReferenced(page);
 
 	/*
@@ -2483,7 +2484,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * If the pages are mostly unmapped, the processing is fast and it is
  * appropriate to hold lru_lock across the whole operation.  But if
- * the pages are mapped, the processing is slow (page_referenced()), so
+ * the pages are mapped, the processing is slow (folio_referenced()), so
  * we should drop lru_lock around each page.  It's impossible to balance
  * this, so instead we remove the pages from the LRU while processing them.
  * It is safe to rely on PG_active against the non-LRU pages in here because
@@ -2503,7 +2504,6 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_hold);	/* The pages which were snipped off */
 	LIST_HEAD(l_active);
 	LIST_HEAD(l_inactive);
-	struct page *page;
 	unsigned nr_deactivate, nr_activate;
 	unsigned nr_rotated = 0;
 	int file = is_file_lru(lru);
@@ -2525,9 +2525,13 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	spin_unlock_irq(&lruvec->lru_lock);
 
 	while (!list_empty(&l_hold)) {
+		struct folio *folio;
+		struct page *page;
+
 		cond_resched();
-		page = lru_to_page(&l_hold);
-		list_del(&page->lru);
+		folio = lru_to_folio(&l_hold);
+		list_del(&folio->lru);
+		page = &folio->page;
 
 		if (unlikely(!page_evictable(page))) {
 			putback_lru_page(page);
@@ -2542,8 +2546,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 			}
 		}
 
-		if (page_referenced(page, 0, sc->target_mem_cgroup,
-				    &vm_flags)) {
+		if (folio_referenced(folio, 0, sc->target_mem_cgroup,
+				     &vm_flags)) {
 			/*
 			 * Identify referenced, file-backed active pages and
 			 * give them one more trip around the active list. So
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 46/75] mm/mlock: Turn clear_page_mlock() into folio_end_mlock()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (44 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 45/75] mm/rmap: Turn page_referenced() into folio_referenced() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 47/75] mm/mlock: Turn mlock_vma_page() into mlock_vma_folio() Matthew Wilcox (Oracle)
                   ` (30 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add a clear_page_mlock() wrapper function.  It looks like all
callers were already passing a head page, but if they weren't,
this will fix an accounting bug.
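
The compatibility shim itself is tiny (this mirrors the mm/folio-compat.c hunk
below); because page_folio() always resolves to the head page, the NR_MLOCK
accounting is done in whole-folio units no matter which page was passed in:

	void clear_page_mlock(struct page *page)
	{
		/* head or tail page, account and clear for the whole folio */
		folio_end_mlock(page_folio(page));
	}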

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/folio-compat.c |  5 +++++
 mm/internal.h     | 15 +++------------
 mm/mlock.c        | 28 +++++++++++++++++-----------
 3 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 46fa179e32fb..bcb037d9cec3 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -164,3 +164,8 @@ void putback_lru_page(struct page *page)
 {
 	folio_putback_lru(page_folio(page));
 }
+
+void clear_page_mlock(struct page *page)
+{
+	folio_end_mlock(page_folio(page));
+}
diff --git a/mm/internal.h b/mm/internal.h
index 7f1db0f1a8bc..041c76a4c284 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -416,17 +416,8 @@ extern unsigned int munlock_vma_page(struct page *page);
 
 extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
 			      unsigned long len);
-
-/*
- * Clear the page's PageMlocked().  This can be useful in a situation where
- * we want to unconditionally remove a page from the pagecache -- e.g.,
- * on truncation or freeing.
- *
- * It is legal to call this function for any page, mlocked or not.
- * If called for a page that is still mapped by mlocked vmas, all we do
- * is revert to lazy LRU behaviour -- semantics are not broken.
- */
-extern void clear_page_mlock(struct page *page);
+void folio_end_mlock(struct folio *folio);
+void clear_page_mlock(struct page *page);
 
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
 
@@ -503,7 +494,7 @@ static inline struct file *maybe_unlock_mmap_for_io(struct vm_fault *vmf,
 }
 #else /* !CONFIG_MMU */
 static inline void unmap_mapping_folio(struct folio *folio) { }
-static inline void clear_page_mlock(struct page *page) { }
+static inline void folio_end_mlock(struct folio *folio) { }
 static inline void mlock_vma_page(struct page *page) { }
 static inline void vunmap_range_noflush(unsigned long start, unsigned long end)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index 24d0809cacba..ff067d64acc5 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -55,31 +55,37 @@ EXPORT_SYMBOL(can_do_mlock);
  */
 
 /*
- *  LRU accounting for clear_page_mlock()
+ * Clear the folio's PageMlocked().  This can be useful in a situation where
+ * we want to unconditionally remove a folio from the pagecache -- e.g.,
+ * on truncation or freeing.
+ *
+ * It is legal to call this function for any folio, mlocked or not.
+ * If called for a folio that is still mapped by mlocked vmas, all we do
+ * is revert to lazy LRU behaviour -- semantics are not broken.
  */
-void clear_page_mlock(struct page *page)
+void folio_end_mlock(struct folio *folio)
 {
-	int nr_pages;
+	long nr_pages;
 
-	if (!TestClearPageMlocked(page))
+	if (!folio_test_clear_mlocked(folio))
 		return;
 
-	nr_pages = thp_nr_pages(page);
-	mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
+	nr_pages = folio_nr_pages(folio);
+	zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
 	count_vm_events(UNEVICTABLE_PGCLEARED, nr_pages);
 	/*
-	 * The previous TestClearPageMlocked() corresponds to the smp_mb()
+	 * The previous folio_test_clear_mlocked() corresponds to the smp_mb()
 	 * in __pagevec_lru_add_fn().
 	 *
 	 * See __pagevec_lru_add_fn for more explanation.
 	 */
-	if (!isolate_lru_page(page)) {
-		putback_lru_page(page);
+	if (!folio_isolate_lru(folio)) {
+		folio_putback_lru(folio);
 	} else {
 		/*
-		 * We lost the race. the page already moved to evictable list.
+		 * We lost the race. The folio already moved to the evictable list.
 		 */
-		if (PageUnevictable(page))
+		if (folio_test_unevictable(folio))
 			count_vm_events(UNEVICTABLE_PGSTRANDED, nr_pages);
 	}
 }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 47/75] mm/mlock: Turn mlock_vma_page() into mlock_vma_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (45 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 46/75] mm/mlock: Turn clear_page_mlock() into folio_end_mlock() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-07 10:46   ` Mike Rapoport
  2022-02-04 19:58 ` [PATCH 48/75] mm/rmap: Turn page_mlock() into folio_mlock() Matthew Wilcox (Oracle)
                   ` (29 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add mlock_vma_page() back as a wrapper.  Saves a few calls to
compound_head() and an assertion that the page is not a tail page.
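
A minimal sketch of the wrapper (mirroring the folio-compat.c hunk below); the
old VM_BUG_ON_PAGE(PageTail(page), page) assertion becomes structurally
unnecessary because page_folio() can only return a head page:

	void mlock_vma_page(struct page *page)
	{
		mlock_vma_folio(page_folio(page));
	}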

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/folio-compat.c |  5 +++++
 mm/internal.h     |  3 ++-
 mm/mlock.c        | 18 +++++++++---------
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index bcb037d9cec3..9cb0867d5b38 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -169,3 +169,8 @@ void clear_page_mlock(struct page *page)
 {
 	folio_end_mlock(page_folio(page));
 }
+
+void mlock_vma_page(struct page *page)
+{
+	mlock_vma_folio(page_folio(page));
+}
diff --git a/mm/internal.h b/mm/internal.h
index 041c76a4c284..18b024aa7e59 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -411,7 +411,8 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
 /*
  * must be called with vma's mmap_lock held for read or write, and page locked.
  */
-extern void mlock_vma_page(struct page *page);
+void mlock_vma_page(struct page *page);
+void mlock_vma_folio(struct folio *folio);
 extern unsigned int munlock_vma_page(struct page *page);
 
 extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
diff --git a/mm/mlock.c b/mm/mlock.c
index ff067d64acc5..d998fd5c84bf 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -94,21 +94,21 @@ void folio_end_mlock(struct folio *folio)
  * Mark page as mlocked if not already.
  * If page on LRU, isolate and putback to move to unevictable list.
  */
-void mlock_vma_page(struct page *page)
+void mlock_vma_folio(struct folio *folio)
 {
 	/* Serialize with page migration */
-	BUG_ON(!PageLocked(page));
+	BUG_ON(!folio_test_locked(folio));
 
-	VM_BUG_ON_PAGE(PageTail(page), page);
-	VM_BUG_ON_PAGE(PageCompound(page) && PageDoubleMap(page), page);
+	VM_BUG_ON_FOLIO(folio_test_large(folio) && folio_test_double_map(folio),
+			folio);
 
-	if (!TestSetPageMlocked(page)) {
-		int nr_pages = thp_nr_pages(page);
+	if (!folio_test_set_mlocked(folio)) {
+		long nr_pages = folio_nr_pages(folio);
 
-		mod_zone_page_state(page_zone(page), NR_MLOCK, nr_pages);
+		zone_stat_mod_folio(folio, NR_MLOCK, nr_pages);
 		count_vm_events(UNEVICTABLE_PGMLOCKED, nr_pages);
-		if (!isolate_lru_page(page))
-			putback_lru_page(page);
+		if (!folio_isolate_lru(folio))
+			folio_putback_lru(folio);
 	}
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 48/75] mm/rmap: Turn page_mlock() into folio_mlock()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (46 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 47/75] mm/mlock: Turn mlock_vma_page() into mlock_vma_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 49/75] mm/mlock: Turn munlock_vma_page() into munlock_vma_folio() Matthew Wilcox (Oracle)
                   ` (28 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add back page_mlock() as a wrapper around folio_mlock().  Removes
a few hidden calls to compound_head().
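
For reference, the "hidden" calls are the head-page lookups buried in the
classic page-flag tests; a rough before/after sketch (not the exact
page-flags.h expansion), taken from the early-exit check in the hunk below:

	/* old: each PageFoo() test may walk to the head page internally */
	if (PageTransCompound(page) && PageAnon(page))
		return;

	/* new: the folio is already the head, so these are plain flag tests */
	if (folio_test_large(folio) && folio_test_anon(folio))
		return;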

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h |  1 +
 mm/folio-compat.c    |  6 ++++++
 mm/rmap.c            | 31 +++++++++++++++++--------------
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 00b772cdaaaa..31f3a299ef66 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -261,6 +261,7 @@ int folio_mkclean(struct folio *);
  * the page mlocked.
  */
 void page_mlock(struct page *page);
+void folio_mlock(struct folio *folio);
 
 void remove_migration_ptes(struct page *old, struct page *new, bool locked);
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 9cb0867d5b38..90f03187a5e3 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -7,6 +7,7 @@
 #include <linux/migrate.h>
 #include <linux/pagemap.h>
 #include <linux/swap.h>
+#include <linux/rmap.h>
 #include "internal.h"
 
 struct address_space *page_mapping(struct page *page)
@@ -174,3 +175,8 @@ void mlock_vma_page(struct page *page)
 {
 	mlock_vma_folio(page_folio(page));
 }
+
+void page_mlock(struct page *page)
+{
+	folio_mlock(page_folio(page));
+}
diff --git a/mm/rmap.c b/mm/rmap.c
index 1cedcfd6105c..a383e25fb196 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2001,6 +2001,7 @@ void try_to_migrate(struct page *page, enum ttu_flags flags)
 static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
 				 unsigned long address, void *unused)
 {
+	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = address,
@@ -2024,9 +2025,9 @@ static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
 			 * nor on an Anon THP (which may still be PTE-mapped
 			 * after DoubleMap was cleared).
 			 */
-			mlock_vma_page(page);
+			mlock_vma_folio(folio);
 			/*
-			 * No need to scan further once the page is marked
+			 * No need to scan further once the folio is marked
 			 * as mlocked.
 			 */
 			page_vma_mapped_walk_done(&pvmw);
@@ -2038,14 +2039,14 @@ static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
 }
 
 /**
- * page_mlock - try to mlock a page
- * @page: the page to be mlocked
+ * folio_mlock() - Try to mlock a folio.
+ * @folio: The folio to be mlocked.
  *
- * Called from munlock code. Checks all of the VMAs mapping the page and mlocks
- * the page if any are found. The page will be returned with PG_mlocked cleared
- * if it is not mapped by any locked vmas.
+ * Called from munlock code. Checks all of the VMAs mapping the folio
+ * and mlocks the folio if any are found. The folio will be returned
+ * with the mlocked flag clear if it is not mapped by any locked vmas.
  */
-void page_mlock(struct page *page)
+void folio_mlock(struct folio *folio)
 {
 	struct rmap_walk_control rwc = {
 		.rmap_one = page_mlock_one,
@@ -2054,14 +2055,16 @@ void page_mlock(struct page *page)
 
 	};
 
-	VM_BUG_ON_PAGE(!PageLocked(page) || PageLRU(page), page);
-	VM_BUG_ON_PAGE(PageCompound(page) && PageDoubleMap(page), page);
+	VM_BUG_ON_FOLIO(!folio_test_locked(folio) || folio_test_lru(folio),
+			folio);
+	VM_BUG_ON_FOLIO(folio_test_large(folio) && folio_test_double_map(folio),
+			folio);
 
 	/* Anon THP are only marked as mlocked when singly mapped */
-	if (PageTransCompound(page) && PageAnon(page))
+	if (folio_test_large(folio) && folio_test_anon(folio))
 		return;
 
-	rmap_walk(page, &rwc);
+	rmap_walk(&folio->page, &rwc);
 }
 
 #ifdef CONFIG_DEVICE_PRIVATE
@@ -2290,7 +2293,7 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page,
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the anon_vma struct it points to.
  *
- * When called from page_mlock(), the mmap_lock of the mm containing the vma
+ * When called from folio_mlock(), the mmap_lock of the mm containing the vma
  * where the page was found will be held for write.  So, we won't recheck
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
@@ -2343,7 +2346,7 @@ static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the address_space struct it points to.
  *
- * When called from page_mlock(), the mmap_lock of the mm containing the vma
+ * When called from folio_mlock(), the mmap_lock of the mm containing the vma
  * where the page was found will be held for write.  So, we won't recheck
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 49/75] mm/mlock: Turn munlock_vma_page() into munlock_vma_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (47 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 48/75] mm/rmap: Turn page_mlock() into folio_mlock() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 50/75] mm/huge_memory: Convert __split_huge_pmd() to take a folio Matthew Wilcox (Oracle)
                   ` (27 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add back munlock_vma_page() as a wrapper function.  Saves a few calls
to compound_head() and an assertion that the page is not a tail page.
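
The return type also widens to unsigned long and is documented as a page mask;
for example (a sketch only, assuming an order-9 PMD-sized folio with 4KiB
pages), the tail of munlock_vma_folio() in the hunk below works out to:

	nr_pages = folio_nr_pages(folio);	/* 512 for an order-9 folio */
	return nr_pages - 1;			/* page mask 511; 0 for an order-0 folio */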

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/folio-compat.c |  5 +++
 mm/internal.h     |  3 +-
 mm/mlock.c        | 86 +++++++++++++++++++++++------------------------
 3 files changed, 50 insertions(+), 44 deletions(-)

diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 90f03187a5e3..3804fd8c1f20 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -176,6 +176,11 @@ void mlock_vma_page(struct page *page)
 	mlock_vma_folio(page_folio(page));
 }
 
+unsigned long munlock_vma_page(struct page *page)
+{
+	return munlock_vma_folio(page_folio(page));
+}
+
 void page_mlock(struct page *page)
 {
 	folio_mlock(page_folio(page));
diff --git a/mm/internal.h b/mm/internal.h
index 18b024aa7e59..66645972cbd7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -413,7 +413,8 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
  */
 void mlock_vma_page(struct page *page);
 void mlock_vma_folio(struct folio *folio);
-extern unsigned int munlock_vma_page(struct page *page);
+unsigned long munlock_vma_page(struct page *page);
+unsigned long munlock_vma_folio(struct folio *folio);
 
 extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
 			      unsigned long len);
diff --git a/mm/mlock.c b/mm/mlock.c
index d998fd5c84bf..f188038ef48e 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -115,82 +115,81 @@ void mlock_vma_folio(struct folio *folio)
 /*
  * Finish munlock after successful page isolation
  *
- * Page must be locked. This is a wrapper for page_mlock()
- * and putback_lru_page() with munlock accounting.
+ * Folio must be locked. This is a wrapper for folio_mlock()
+ * and folio_putback_lru() with munlock accounting.
  */
-static void __munlock_isolated_page(struct page *page)
+static void __munlock_isolated_folio(struct folio *folio)
 {
 	/*
 	 * Optimization: if the page was mapped just once, that's our mapping
 	 * and we don't need to check all the other vmas.
 	 */
-	if (page_mapcount(page) > 1)
-		page_mlock(page);
+	/* XXX: should be folio_mapcount(), surely? */
+	if (page_mapcount(&folio->page) > 1)
+		folio_mlock(folio);
 
 	/* Did try_to_unlock() succeed or punt? */
-	if (!PageMlocked(page))
-		count_vm_events(UNEVICTABLE_PGMUNLOCKED, thp_nr_pages(page));
+	if (!folio_test_mlocked(folio))
+		count_vm_events(UNEVICTABLE_PGMUNLOCKED, folio_nr_pages(folio));
 
-	putback_lru_page(page);
+	folio_putback_lru(folio);
 }
 
 /*
- * Accounting for page isolation fail during munlock
+ * Accounting for folio isolation fail during munlock
  *
- * Performs accounting when page isolation fails in munlock. There is nothing
- * else to do because it means some other task has already removed the page
- * from the LRU. putback_lru_page() will take care of removing the page from
+ * Performs accounting when folio isolation fails in munlock. There is nothing
+ * else to do because it means some other task has already removed the folio
+ * from the LRU. folio_putback_lru() will take care of removing the folio from
  * the unevictable list, if necessary. vmscan [folio_referenced()] will move
- * the page back to the unevictable list if some other vma has it mlocked.
+ * the folio back to the unevictable list if some other vma has it mlocked.
  */
-static void __munlock_isolation_failed(struct page *page)
+static void __munlock_isolation_failed(struct folio *folio)
 {
-	int nr_pages = thp_nr_pages(page);
+	long nr_pages = folio_nr_pages(folio);
 
-	if (PageUnevictable(page))
+	if (folio_test_unevictable(folio))
 		__count_vm_events(UNEVICTABLE_PGSTRANDED, nr_pages);
 	else
 		__count_vm_events(UNEVICTABLE_PGMUNLOCKED, nr_pages);
 }
 
 /**
- * munlock_vma_page - munlock a vma page
- * @page: page to be unlocked, either a normal page or THP page head
+ * munlock_vma_folio() - munlock a vma folio.
+ * @folio: Folio to be unlocked.
  *
- * returns the size of the page as a page mask (0 for normal page,
- *         HPAGE_PMD_NR - 1 for THP head page)
- *
- * called from munlock()/munmap() path with page supposedly on the LRU.
- * When we munlock a page, because the vma where we found the page is being
+ * called from munlock()/munmap() path with folio supposedly on the LRU.
+ * When we munlock a folio, because the vma where we found the folio is being
  * munlock()ed or munmap()ed, we want to check whether other vmas hold the
- * page locked so that we can leave it on the unevictable lru list and not
- * bother vmscan with it.  However, to walk the page's rmap list in
- * page_mlock() we must isolate the page from the LRU.  If some other
- * task has removed the page from the LRU, we won't be able to do that.
- * So we clear the PageMlocked as we might not get another chance.  If we
- * can't isolate the page, we leave it for putback_lru_page() and vmscan
+ * folio locked so that we can leave it on the unevictable lru list and not
+ * bother vmscan with it.  However, to walk the folio's rmap list in
+ * folio_mlock() we must isolate the folio from the LRU.  If some other
+ * task has removed the folio from the LRU, we won't be able to do that.
+ * So we clear the folio mlocked flag as we might not get another chance.  If
+ * we can't isolate the folio, we leave it for folio_putback_lru() and vmscan
  * [folio_referenced()/try_to_unmap()] to deal with.
+ *
+ * Return: The size of the folio as a page mask (2^order - 1).
  */
-unsigned int munlock_vma_page(struct page *page)
+unsigned long munlock_vma_folio(struct folio *folio)
 {
-	int nr_pages;
+	long nr_pages;
 
-	/* For page_mlock() and to serialize with page migration */
-	BUG_ON(!PageLocked(page));
-	VM_BUG_ON_PAGE(PageTail(page), page);
+	/* For folio_mlock() and to serialize with page migration */
+	BUG_ON(!folio_test_locked(folio));
 
-	if (!TestClearPageMlocked(page)) {
-		/* Potentially, PTE-mapped THP: do not skip the rest PTEs */
+	if (!folio_test_clear_mlocked(folio)) {
+		/* Potentially, PTE-mapped folio: do not skip the other PTEs */
 		return 0;
 	}
 
-	nr_pages = thp_nr_pages(page);
-	mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
+	nr_pages = folio_nr_pages(folio);
+	zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
 
-	if (!isolate_lru_page(page))
-		__munlock_isolated_page(page);
+	if (!folio_isolate_lru(folio))
+		__munlock_isolated_folio(folio);
 	else
-		__munlock_isolation_failed(page);
+		__munlock_isolation_failed(folio);
 
 	return nr_pages - 1;
 }
@@ -289,7 +288,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 				del_page_from_lru_list(page, lruvec);
 				continue;
 			} else
-				__munlock_isolation_failed(page);
+				__munlock_isolation_failed(folio);
 		} else {
 			delta_munlocked++;
 		}
@@ -318,6 +317,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 		struct page *page = pvec->pages[i];
 
 		if (page) {
+			struct folio *folio = page_folio(page);
 			lock_page(page);
 			if (!__putback_lru_fast_prepare(page, &pvec_putback,
 					&pgrescued)) {
@@ -326,7 +326,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 				 * pin before unlock_page()
 				 */
 				get_page(page); /* for putback_lru_page() */
-				__munlock_isolated_page(page);
+				__munlock_isolated_folio(folio);
 				unlock_page(page);
 				put_page(page); /* from follow_page_mask() */
 			}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 50/75] mm/huge_memory: Convert __split_huge_pmd() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (48 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 49/75] mm/mlock: Turn munlock_vma_page() into munlock_vma_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 51/75] mm/rmap: Convert try_to_unmap() " Matthew Wilcox (Oracle)
                   ` (26 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Convert split_huge_pmd_address() at the same time since it only passes
the folio through, and its two callers already have a folio on hand.
Removes numerous calls to compound_head() and removes an assumption
that a page cannot be larger than a PMD.
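
The assumption being removed is visible in the pmd check (quoted from the
huge_memory.c hunk below): the pmd's page is now matched against the whole
folio rather than against a single struct page:

	/* was: if (page != pmd_page(*pmd)) goto out; */
	if (folio != page_folio(pmd_page(*pmd)))
		goto out;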

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/huge_mm.h |  8 +++----
 mm/huge_memory.c        | 50 ++++++++++++++++++++---------------------
 mm/rmap.c               |  6 +++--
 3 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 71c073d411ac..4368b314d9c8 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -194,7 +194,7 @@ static inline int split_huge_page(struct page *page)
 void deferred_split_huge_page(struct page *page);
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long address, bool freeze, struct page *page);
+		unsigned long address, bool freeze, struct folio *folio);
 
 #define split_huge_pmd(__vma, __pmd, __address)				\
 	do {								\
@@ -207,7 +207,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 
 
 void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
-		bool freeze, struct page *page);
+		bool freeze, struct folio *folio);
 
 void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
 		unsigned long address);
@@ -406,9 +406,9 @@ static inline void deferred_split_huge_page(struct page *page) {}
 	do { } while (0)
 
 static inline void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long address, bool freeze, struct page *page) {}
+		unsigned long address, bool freeze, struct folio *folio) {}
 static inline void split_huge_pmd_address(struct vm_area_struct *vma,
-		unsigned long address, bool freeze, struct page *page) {}
+		unsigned long address, bool freeze, struct folio *folio) {}
 
 #define split_huge_pud(__vma, __pmd, __address)	\
 	do { } while (0)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94e591d638eb..f934b93d08ca 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2143,11 +2143,11 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 }
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long address, bool freeze, struct page *page)
+		unsigned long address, bool freeze, struct folio *folio)
 {
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
-	bool do_unlock_page = false;
+	bool do_unlock_folio = false;
 	pmd_t _pmd;
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
@@ -2157,20 +2157,20 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	ptl = pmd_lock(vma->vm_mm, pmd);
 
 	/*
-	 * If caller asks to setup a migration entries, we need a page to check
-	 * pmd against. Otherwise we can end up replacing wrong page.
+	 * If caller asks to setup a migration entry, we need a folio to check
+	 * pmd against. Otherwise we can end up replacing wrong folio.
 	 */
-	VM_BUG_ON(freeze && !page);
-	if (page) {
-		VM_WARN_ON_ONCE(!PageLocked(page));
-		if (page != pmd_page(*pmd))
+	VM_BUG_ON(freeze && !folio);
+	if (folio) {
+		VM_WARN_ON_ONCE(!folio_test_locked(folio));
+		if (folio != page_folio(pmd_page(*pmd)))
 			goto out;
 	}
 
 repeat:
 	if (pmd_trans_huge(*pmd)) {
-		if (!page) {
-			page = pmd_page(*pmd);
+		if (!folio) {
+			folio = page_folio(pmd_page(*pmd));
 			/*
 			 * An anonymous page must be locked, to ensure that a
 			 * concurrent reuse_swap_page() sees stable mapcount;
@@ -2178,33 +2178,33 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			 * and page lock must not be taken when zap_pmd_range()
 			 * calls __split_huge_pmd() while i_mmap_lock is held.
 			 */
-			if (PageAnon(page)) {
-				if (unlikely(!trylock_page(page))) {
-					get_page(page);
+			if (folio_test_anon(folio)) {
+				if (unlikely(!folio_trylock(folio))) {
+					folio_get(folio);
 					_pmd = *pmd;
 					spin_unlock(ptl);
-					lock_page(page);
+					folio_lock(folio);
 					spin_lock(ptl);
 					if (unlikely(!pmd_same(*pmd, _pmd))) {
-						unlock_page(page);
-						put_page(page);
-						page = NULL;
+						folio_unlock(folio);
+						folio_put(folio);
+						folio = NULL;
 						goto repeat;
 					}
-					put_page(page);
+					folio_put(folio);
 				}
-				do_unlock_page = true;
+				do_unlock_folio = true;
 			}
 		}
-		if (PageMlocked(page))
-			clear_page_mlock(page);
+		if (folio_test_mlocked(folio))
+			folio_end_mlock(folio);
 	} else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
 		goto out;
 	__split_huge_pmd_locked(vma, pmd, range.start, freeze);
 out:
 	spin_unlock(ptl);
-	if (do_unlock_page)
-		unlock_page(page);
+	if (do_unlock_folio)
+		folio_unlock(folio);
 	/*
 	 * No need to double call mmu_notifier->invalidate_range() callback.
 	 * They are 3 cases to consider inside __split_huge_pmd_locked():
@@ -2222,7 +2222,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 }
 
 void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
-		bool freeze, struct page *page)
+		bool freeze, struct folio *folio)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -2243,7 +2243,7 @@ void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
 
 	pmd = pmd_offset(pud, address);
 
-	__split_huge_pmd(vma, pmd, address, freeze, page);
+	__split_huge_pmd(vma, pmd, address, freeze, folio);
 }
 
 static inline void split_huge_pmd_if_needed(struct vm_area_struct *vma, unsigned long address)
diff --git a/mm/rmap.c b/mm/rmap.c
index a383e25fb196..42a147746ff8 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1422,6 +1422,7 @@ void page_remove_rmap(struct page *page, bool compound)
 static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
+	struct folio *folio = page_folio(page);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -1444,7 +1445,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		pvmw.flags = PVMW_SYNC;
 
 	if (flags & TTU_SPLIT_HUGE_PMD)
-		split_huge_pmd_address(vma, address, false, page);
+		split_huge_pmd_address(vma, address, false, folio);
 
 	/*
 	 * For THP, we have to assume the worse case ie pmd for invalidation.
@@ -1721,6 +1722,7 @@ void try_to_unmap(struct page *page, enum ttu_flags flags)
 static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
+	struct folio *folio = page_folio(page);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -1747,7 +1749,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 	 * TTU_SPLIT_HUGE_PMD and it wants to freeze.
 	 */
 	if (flags & TTU_SPLIT_HUGE_PMD)
-		split_huge_pmd_address(vma, address, true, page);
+		split_huge_pmd_address(vma, address, true, folio);
 
 	/*
 	 * For THP, we have to assume the worse case ie pmd for invalidation.
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 51/75] mm/rmap: Convert try_to_unmap() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (49 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 50/75] mm/huge_memory: Convert __split_huge_pmd() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-09 14:24   ` Mauricio Faria de Oliveira
  2022-02-04 19:58 ` [PATCH 52/75] mm/rmap: Convert try_to_migrate() to folios Matthew Wilcox (Oracle)
                   ` (25 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Change all the callers and the worker function try_to_unmap_one().
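
Callers that only have a struct page convert at the call site; a minimal
sketch, mirroring the memory_hotplug.c hunk below:

	struct folio *folio = page_folio(page);

	if (folio_mapped(folio))
		try_to_unmap(folio, TTU_IGNORE_MLOCK);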
---
 include/linux/rmap.h |  4 +--
 mm/huge_memory.c     |  3 +-
 mm/memory-failure.c  |  7 ++--
 mm/memory_hotplug.c  | 13 ++++---
 mm/rmap.c            | 81 +++++++++++++++++++++++---------------------
 mm/vmscan.c          |  2 +-
 6 files changed, 60 insertions(+), 50 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 31f3a299ef66..66407434c3b5 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -193,7 +193,7 @@ int folio_referenced(struct folio *, int is_locked,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
 
 void try_to_migrate(struct page *page, enum ttu_flags flags);
-void try_to_unmap(struct page *, enum ttu_flags flags);
+void try_to_unmap(struct folio *, enum ttu_flags flags);
 
 int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, struct page **pages,
@@ -311,7 +311,7 @@ static inline int folio_referenced(struct folio *folio, int is_locked,
 	return 0;
 }
 
-static inline void try_to_unmap(struct page *page, enum ttu_flags flags)
+static inline void try_to_unmap(struct folio *folio, enum ttu_flags flags)
 {
 }
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f934b93d08ca..4ea22b7319fd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2283,6 +2283,7 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma,
 
 static void unmap_page(struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	enum ttu_flags ttu_flags = TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
 		TTU_SYNC;
 
@@ -2296,7 +2297,7 @@ static void unmap_page(struct page *page)
 	if (PageAnon(page))
 		try_to_migrate(page, ttu_flags);
 	else
-		try_to_unmap(page, ttu_flags | TTU_IGNORE_MLOCK);
+		try_to_unmap(folio, ttu_flags | TTU_IGNORE_MLOCK);
 
 	VM_WARN_ON_ONCE_PAGE(page_mapped(page), page);
 }
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 97a9ed8f87a9..1c7a71b5248e 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1347,6 +1347,7 @@ static int get_hwpoison_page(struct page *p, unsigned long flags)
 static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 				  int flags, struct page *hpage)
 {
+	struct folio *folio = page_folio(hpage);
 	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC;
 	struct address_space *mapping;
 	LIST_HEAD(tokill);
@@ -1412,7 +1413,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 		collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);
 
 	if (!PageHuge(hpage)) {
-		try_to_unmap(hpage, ttu);
+		try_to_unmap(folio, ttu);
 	} else {
 		if (!PageAnon(hpage)) {
 			/*
@@ -1424,12 +1425,12 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 			 */
 			mapping = hugetlb_page_mapping_lock_write(hpage);
 			if (mapping) {
-				try_to_unmap(hpage, ttu|TTU_RMAP_LOCKED);
+				try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
 				i_mmap_unlock_write(mapping);
 			} else
 				pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn);
 		} else {
-			try_to_unmap(hpage, ttu);
+			try_to_unmap(folio, ttu);
 		}
 	}
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2a9627dc784c..914057da53c7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1690,10 +1690,13 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 				      DEFAULT_RATELIMIT_BURST);
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
+		struct folio *folio;
+
 		if (!pfn_valid(pfn))
 			continue;
 		page = pfn_to_page(pfn);
-		head = compound_head(page);
+		folio = page_folio(page);
+		head = &folio->page;
 
 		if (PageHuge(page)) {
 			pfn = page_to_pfn(head) + compound_nr(head) - 1;
@@ -1710,10 +1713,10 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		 * the unmap as the catch all safety net).
 		 */
 		if (PageHWPoison(page)) {
-			if (WARN_ON(PageLRU(page)))
-				isolate_lru_page(page);
-			if (page_mapped(page))
-				try_to_unmap(page, TTU_IGNORE_MLOCK);
+			if (WARN_ON(folio_test_lru(folio)))
+				folio_isolate_lru(folio);
+			if (folio_mapped(folio))
+				try_to_unmap(folio, TTU_IGNORE_MLOCK);
 			continue;
 		}
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 42a147746ff8..c598fd667948 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1452,13 +1452,13 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	 * For hugetlb, it could be much worse if we need to do pud
 	 * invalidation in the case of pmd sharing.
 	 *
-	 * Note that the page can not be free in this function as call of
-	 * try_to_unmap() must hold a reference on the page.
+	 * Note that the folio can not be freed in this function as call of
+	 * try_to_unmap() must hold a reference on the folio.
 	 */
 	range.end = vma_address_end(&pvmw);
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address, range.end);
-	if (PageHuge(page)) {
+	if (folio_test_hugetlb(folio)) {
 		/*
 		 * If sharing is possible, start and end will be adjusted
 		 * accordingly.
@@ -1470,31 +1470,34 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 
 	while (page_vma_mapped_walk(&pvmw)) {
 		/*
-		 * If the page is mlock()d, we cannot swap it out.
+		 * If the folio is mlock()d, we cannot swap it out.
 		 */
 		if (!(flags & TTU_IGNORE_MLOCK) &&
 		    (vma->vm_flags & VM_LOCKED)) {
 			/*
-			 * PTE-mapped THP are never marked as mlocked: so do
-			 * not set it on a DoubleMap THP, nor on an Anon THP
+			 * PTE-mapped folios are never marked as mlocked: so do
+			 * not set it on a DoubleMap folio, nor on an Anon folio
 			 * (which may still be PTE-mapped after DoubleMap was
 			 * cleared).  But stop unmapping even in those cases.
 			 */
-			if (!PageTransCompound(page) || (PageHead(page) &&
-			     !PageDoubleMap(page) && !PageAnon(page)))
-				mlock_vma_page(page);
+			if (!folio_test_large(folio) ||
+			    (folio_test_large(folio) &&
+			     !folio_test_double_map(folio) &&
+			     !folio_test_anon(folio)))
+				mlock_vma_folio(folio);
 			page_vma_mapped_walk_done(&pvmw);
 			ret = false;
 			break;
 		}
 
 		/* Unexpected PMD-mapped THP? */
-		VM_BUG_ON_PAGE(!pvmw.pte, page);
+		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
+		subpage = folio_page(folio,
+					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		address = pvmw.address;
 
-		if (PageHuge(page) && !PageAnon(page)) {
+		if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
 			/*
 			 * To call huge_pmd_unshare, i_mmap_rwsem must be
 			 * held in write mode.  Caller needs to explicitly
@@ -1533,7 +1536,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		if (should_defer_flush(mm, flags)) {
 			/*
 			 * We clear the PTE but do not flush so potentially
-			 * a remote CPU could still be writing to the page.
+			 * a remote CPU could still be writing to the folio.
 			 * If the entry was previously clean then the
 			 * architecture must guarantee that a clear->dirty
 			 * transition on a cached TLB entry is written through
@@ -1546,22 +1549,22 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			pteval = ptep_clear_flush(vma, address, pvmw.pte);
 		}
 
-		/* Move the dirty bit to the page. Now the pte is gone. */
+		/* Set the dirty flag on the folio now the pte is gone. */
 		if (pte_dirty(pteval))
-			set_page_dirty(page);
+			folio_mark_dirty(folio);
 
 		/* Update high watermark before we lower rss */
 		update_hiwater_rss(mm);
 
-		if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
+		if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
-			if (PageHuge(page)) {
-				hugetlb_count_sub(compound_nr(page), mm);
+			if (folio_test_hugetlb(folio)) {
+				hugetlb_count_sub(folio_nr_pages(folio), mm);
 				set_huge_swap_pte_at(mm, address,
 						     pvmw.pte, pteval,
 						     vma_mmu_pagesize(vma));
 			} else {
-				dec_mm_counter(mm, mm_counter(page));
+				dec_mm_counter(mm, mm_counter(&folio->page));
 				set_pte_at(mm, address, pvmw.pte, pteval);
 			}
 
@@ -1576,18 +1579,19 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			 * migration) will not expect userfaults on already
 			 * copied pages.
 			 */
-			dec_mm_counter(mm, mm_counter(page));
+			dec_mm_counter(mm, mm_counter(&folio->page));
 			/* We have to invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,
 						      address + PAGE_SIZE);
-		} else if (PageAnon(page)) {
+		} else if (folio_test_anon(folio)) {
 			swp_entry_t entry = { .val = page_private(subpage) };
 			pte_t swp_pte;
 			/*
 			 * Store the swap location in the pte.
 			 * See handle_pte_fault() ...
 			 */
-			if (unlikely(PageSwapBacked(page) != PageSwapCache(page))) {
+			if (unlikely(folio_test_swapbacked(folio) !=
+					folio_test_swapcache(folio))) {
 				WARN_ON_ONCE(1);
 				ret = false;
 				/* We have to invalidate as we cleared the pte */
@@ -1598,8 +1602,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			}
 
 			/* MADV_FREE page check */
-			if (!PageSwapBacked(page)) {
-				if (!PageDirty(page)) {
+			if (!folio_test_swapbacked(folio)) {
+				if (!folio_test_dirty(folio)) {
 					/* Invalidate as we cleared the pte */
 					mmu_notifier_invalidate_range(mm,
 						address, address + PAGE_SIZE);
@@ -1608,11 +1612,11 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 				}
 
 				/*
-				 * If the page was redirtied, it cannot be
+				 * If the folio was redirtied, it cannot be
 				 * discarded. Remap the page to page table.
 				 */
 				set_pte_at(mm, address, pvmw.pte, pteval);
-				SetPageSwapBacked(page);
+				folio_set_swapbacked(folio);
 				ret = false;
 				page_vma_mapped_walk_done(&pvmw);
 				break;
@@ -1649,16 +1653,17 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 						      address + PAGE_SIZE);
 		} else {
 			/*
-			 * This is a locked file-backed page, thus it cannot
-			 * be removed from the page cache and replaced by a new
-			 * page before mmu_notifier_invalidate_range_end, so no
-			 * concurrent thread might update its page table to
-			 * point at new page while a device still is using this
-			 * page.
+			 * This is a locked file-backed folio,
+			 * so it cannot be removed from the page
+			 * cache and replaced by a new folio before
+			 * mmu_notifier_invalidate_range_end, so no
+			 * concurrent thread might update its page table
+			 * to point at a new folio while a device is
+			 * still using this folio.
 			 *
 			 * See Documentation/vm/mmu_notifier.rst
 			 */
-			dec_mm_counter(mm, mm_counter_file(page));
+			dec_mm_counter(mm, mm_counter_file(&folio->page));
 		}
 discard:
 		/*
@@ -1668,8 +1673,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/vm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, PageHuge(page));
-		put_page(page);
+		page_remove_rmap(subpage, folio_test_hugetlb(folio));
+		folio_put(folio);
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
@@ -1698,7 +1703,7 @@ static int page_not_mapped(struct page *page)
  * It is the caller's responsibility to check if the page is still
  * mapped when needed (use TTU_SYNC to prevent accounting races).
  */
-void try_to_unmap(struct page *page, enum ttu_flags flags)
+void try_to_unmap(struct folio *folio, enum ttu_flags flags)
 {
 	struct rmap_walk_control rwc = {
 		.rmap_one = try_to_unmap_one,
@@ -1708,9 +1713,9 @@ void try_to_unmap(struct page *page, enum ttu_flags flags)
 	};
 
 	if (flags & TTU_RMAP_LOCKED)
-		rmap_walk_locked(page, &rwc);
+		rmap_walk_locked(&folio->page, &rwc);
 	else
-		rmap_walk(page, &rwc);
+		rmap_walk(&folio->page, &rwc);
 }
 
 /*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1e751ba3b4a8..2e94e0b15a76 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1761,7 +1761,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			if (unlikely(PageTransHuge(page)))
 				flags |= TTU_SPLIT_HUGE_PMD;
 
-			try_to_unmap(page, flags);
+			try_to_unmap(folio, flags);
 			if (page_mapped(page)) {
 				stat->nr_unmap_fail += nr_pages;
 				if (!was_swapbacked && PageSwapBacked(page))
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 52/75] mm/rmap: Convert try_to_migrate() to folios
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (50 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 51/75] mm/rmap: Convert try_to_unmap() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-09 15:27   ` Zi Yan
  2022-02-04 19:58 ` [PATCH 53/75] mm/rmap: Convert make_device_exclusive_range() to use folios Matthew Wilcox (Oracle)
                   ` (24 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Convert the callers to pass a folio and the try_to_migrate_one()
worker to use a folio throughout.  Fixes an assumption that a
folio must be <= PMD size.
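
The dropped size assumption shows up where the mapped subpage is computed:
instead of assuming the pmd maps the page that was passed in, the exact
subpage is derived from the pfn (quoted from the rmap.c hunk below):

	subpage = folio_page(folio, pmd_pfn(*pvmw.pmd) - folio_pfn(folio));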

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h |  2 +-
 mm/huge_memory.c     |  4 ++--
 mm/migrate.c         | 12 ++++++----
 mm/rmap.c            | 57 +++++++++++++++++++++++---------------------
 4 files changed, 41 insertions(+), 34 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 66407434c3b5..502439f20d88 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -192,7 +192,7 @@ static inline void page_dup_rmap(struct page *page, bool compound)
 int folio_referenced(struct folio *, int is_locked,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
 
-void try_to_migrate(struct page *page, enum ttu_flags flags);
+void try_to_migrate(struct folio *folio, enum ttu_flags flags);
 void try_to_unmap(struct folio *, enum ttu_flags flags);
 
 int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4ea22b7319fd..21676a4afd07 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2294,8 +2294,8 @@ static void unmap_page(struct page *page)
 	 * pages can simply be left unmapped, then faulted back on demand.
 	 * If that is ever changed (perhaps for mlock), update remap_page().
 	 */
-	if (PageAnon(page))
-		try_to_migrate(page, ttu_flags);
+	if (folio_test_anon(folio))
+		try_to_migrate(folio, ttu_flags);
 	else
 		try_to_unmap(folio, ttu_flags | TTU_IGNORE_MLOCK);
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 766dc67874a1..5dcdd43d983d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -927,6 +927,7 @@ static int move_to_new_page(struct page *newpage, struct page *page,
 static int __unmap_and_move(struct page *page, struct page *newpage,
 				int force, enum migrate_mode mode)
 {
+	struct folio *folio = page_folio(page);
 	int rc = -EAGAIN;
 	bool page_was_mapped = false;
 	struct anon_vma *anon_vma = NULL;
@@ -1030,7 +1031,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		/* Establish migration ptes */
 		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma,
 				page);
-		try_to_migrate(page, 0);
+		try_to_migrate(folio, 0);
 		page_was_mapped = true;
 	}
 
@@ -1173,6 +1174,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 				enum migrate_mode mode, int reason,
 				struct list_head *ret)
 {
+	struct folio *src = page_folio(hpage);
 	int rc = -EAGAIN;
 	int page_was_mapped = 0;
 	struct page *new_hpage;
@@ -1249,7 +1251,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 			ttu |= TTU_RMAP_LOCKED;
 		}
 
-		try_to_migrate(hpage, ttu);
+		try_to_migrate(src, ttu);
 		page_was_mapped = 1;
 
 		if (mapping_locked)
@@ -2449,6 +2451,7 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
 
 	for (i = 0; i < npages; i++) {
 		struct page *page = migrate_pfn_to_page(migrate->src[i]);
+		struct folio *folio;
 
 		if (!page)
 			continue;
@@ -2472,8 +2475,9 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
 			put_page(page);
 		}
 
-		if (page_mapped(page))
-			try_to_migrate(page, 0);
+		folio = page_folio(page);
+		if (folio_mapped(folio))
+			try_to_migrate(folio, 0);
 
 		if (page_mapped(page) || !migrate_vma_check_page(page)) {
 			if (!is_zone_device_page(page)) {
diff --git a/mm/rmap.c b/mm/rmap.c
index c598fd667948..4cfac67e328c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1767,7 +1767,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 	range.end = vma_address_end(&pvmw);
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address, range.end);
-	if (PageHuge(page)) {
+	if (folio_test_hugetlb(folio)) {
 		/*
 		 * If sharing is possible, start and end will be adjusted
 		 * accordingly.
@@ -1781,21 +1781,24 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 		/* PMD-mapped THP migration entry */
 		if (!pvmw.pte) {
-			VM_BUG_ON_PAGE(PageHuge(page) ||
-				       !PageTransCompound(page), page);
+			subpage = folio_page(folio,
+				pmd_pfn(*pvmw.pmd) - folio_pfn(folio));
+			VM_BUG_ON_FOLIO(folio_test_hugetlb(folio) ||
+					!folio_test_pmd_mappable(folio), folio);
 
-			set_pmd_migration_entry(&pvmw, page);
+			set_pmd_migration_entry(&pvmw, subpage);
 			continue;
 		}
 #endif
 
 		/* Unexpected PMD-mapped THP? */
-		VM_BUG_ON_PAGE(!pvmw.pte, page);
+		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
+		subpage = folio_page(folio,
+				pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		address = pvmw.address;
 
-		if (PageHuge(page) && !PageAnon(page)) {
+		if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
 			/*
 			 * To call huge_pmd_unshare, i_mmap_rwsem must be
 			 * held in write mode.  Caller needs to explicitly
@@ -1833,15 +1836,15 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
 		pteval = ptep_clear_flush(vma, address, pvmw.pte);
 
-		/* Move the dirty bit to the page. Now the pte is gone. */
+		/* Set the dirty flag on the folio now the pte is gone. */
 		if (pte_dirty(pteval))
-			set_page_dirty(page);
+			folio_mark_dirty(folio);
 
 		/* Update high watermark before we lower rss */
 		update_hiwater_rss(mm);
 
-		if (is_zone_device_page(page)) {
-			unsigned long pfn = page_to_pfn(page);
+		if (folio_is_zone_device(folio)) {
+			unsigned long pfn = folio_pfn(folio);
 			swp_entry_t entry;
 			pte_t swp_pte;
 
@@ -1877,16 +1880,16 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 			 * changed when hugepage migrations to device private
 			 * memory are supported.
 			 */
-			subpage = page;
-		} else if (PageHWPoison(page)) {
+			subpage = &folio->page;
+		} else if (PageHWPoison(subpage)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
-			if (PageHuge(page)) {
-				hugetlb_count_sub(compound_nr(page), mm);
+			if (folio_test_hugetlb(folio)) {
+				hugetlb_count_sub(folio_nr_pages(folio), mm);
 				set_huge_swap_pte_at(mm, address,
 						     pvmw.pte, pteval,
 						     vma_mmu_pagesize(vma));
 			} else {
-				dec_mm_counter(mm, mm_counter(page));
+				dec_mm_counter(mm, mm_counter(&folio->page));
 				set_pte_at(mm, address, pvmw.pte, pteval);
 			}
 
@@ -1901,7 +1904,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 			 * migration) will not expect userfaults on already
 			 * copied pages.
 			 */
-			dec_mm_counter(mm, mm_counter(page));
+			dec_mm_counter(mm, mm_counter(&folio->page));
 			/* We have to invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,
 						      address + PAGE_SIZE);
@@ -1947,8 +1950,8 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/vm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, PageHuge(page));
-		put_page(page);
+		page_remove_rmap(subpage, folio_test_hugetlb(folio));
+		folio_put(folio);
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
@@ -1958,13 +1961,13 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 
 /**
  * try_to_migrate - try to replace all page table mappings with swap entries
- * @page: the page to replace page table entries for
+ * @folio: the folio to replace page table entries for
  * @flags: action and flags
  *
- * Tries to remove all the page table entries which are mapping this page and
- * replace them with special swap entries. Caller must hold the page lock.
+ * Tries to remove all the page table entries which are mapping this folio and
+ * replace them with special swap entries. Caller must hold the folio lock.
  */
-void try_to_migrate(struct page *page, enum ttu_flags flags)
+void try_to_migrate(struct folio *folio, enum ttu_flags flags)
 {
 	struct rmap_walk_control rwc = {
 		.rmap_one = try_to_migrate_one,
@@ -1981,7 +1984,7 @@ void try_to_migrate(struct page *page, enum ttu_flags flags)
 					TTU_SYNC)))
 		return;
 
-	if (is_zone_device_page(page) && !is_device_private_page(page))
+	if (folio_is_zone_device(folio) && !folio_is_device_private(folio))
 		return;
 
 	/*
@@ -1992,13 +1995,13 @@ void try_to_migrate(struct page *page, enum ttu_flags flags)
 	 * locking requirements of exec(), migration skips
 	 * temporary VMAs until after exec() completes.
 	 */
-	if (!PageKsm(page) && PageAnon(page))
+	if (!folio_test_ksm(folio) && folio_test_anon(folio))
 		rwc.invalid_vma = invalid_migration_vma;
 
 	if (flags & TTU_RMAP_LOCKED)
-		rmap_walk_locked(page, &rwc);
+		rmap_walk_locked(&folio->page, &rwc);
 	else
-		rmap_walk(page, &rwc);
+		rmap_walk(&folio->page, &rwc);
 }
 
 /*
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 53/75] mm/rmap: Convert make_device_exclusive_range() to use folios
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (51 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 52/75] mm/rmap: Convert try_to_migrate() to folios Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 54/75] mm/migrate: Convert remove_migration_ptes() to folios Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Move the PageTail check earlier so we can avoid even taking the page
lock on tail pages.  Otherwise, this is a straightforward use of
folios throughout.
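
A short sketch of the reordered check inside the per-page loop of
make_device_exclusive_range(), mirroring the hunk below:

	struct folio *folio = page_folio(pages[i]);

	/* Skip tail pages before taking the folio lock at all. */
	if (PageTail(pages[i]) || !folio_trylock(folio)) {
		folio_put(folio);
		pages[i] = NULL;
		continue;
	}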

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/rmap.c | 39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 4cfac67e328c..ffc1b2f0cf24 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2088,6 +2088,7 @@ struct make_exclusive_args {
 static bool page_make_device_exclusive_one(struct page *page,
 		struct vm_area_struct *vma, unsigned long address, void *priv)
 {
+	struct folio *folio = page_folio(page);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -2104,12 +2105,13 @@ static bool page_make_device_exclusive_one(struct page *page,
 	pvmw_set_page(&pvmw, page);
 	mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0, vma,
 				      vma->vm_mm, address, min(vma->vm_end,
-				      address + page_size(page)), args->owner);
+				      address + folio_size(folio)),
+				      args->owner);
 	mmu_notifier_invalidate_range_start(&range);
 
 	while (page_vma_mapped_walk(&pvmw)) {
 		/* Unexpected PMD-mapped THP? */
-		VM_BUG_ON_PAGE(!pvmw.pte, page);
+		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
 		if (!pte_present(*pvmw.pte)) {
 			ret = false;
@@ -2117,16 +2119,17 @@ static bool page_make_device_exclusive_one(struct page *page,
 			break;
 		}
 
-		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
+		subpage = folio_page(folio,
+				pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		address = pvmw.address;
 
 		/* Nuke the page table entry. */
 		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
 		pteval = ptep_clear_flush(vma, address, pvmw.pte);
 
-		/* Move the dirty bit to the page. Now the pte is gone. */
+		/* Set the dirty flag on the folio now the pte is gone. */
 		if (pte_dirty(pteval))
-			set_page_dirty(page);
+			folio_mark_dirty(folio);
 
 		/*
 		 * Check that our target page is still mapped at the expected
@@ -2181,8 +2184,8 @@ static bool page_make_device_exclusive_one(struct page *page,
  * Returns false if the page is still mapped, or if it could not be unmapped
  * from the expected address. Otherwise returns true (success).
  */
-static bool page_make_device_exclusive(struct page *page, struct mm_struct *mm,
-				unsigned long address, void *owner)
+static bool folio_make_device_exclusive(struct folio *folio,
+		struct mm_struct *mm, unsigned long address, void *owner)
 {
 	struct make_exclusive_args args = {
 		.mm = mm,
@@ -2198,16 +2201,15 @@ static bool page_make_device_exclusive(struct page *page, struct mm_struct *mm,
 	};
 
 	/*
-	 * Restrict to anonymous pages for now to avoid potential writeback
-	 * issues. Also tail pages shouldn't be passed to rmap_walk so skip
-	 * those.
+	 * Restrict to anonymous folios for now to avoid potential writeback
+	 * issues.
 	 */
-	if (!PageAnon(page) || PageTail(page))
+	if (!folio_test_anon(folio))
 		return false;
 
-	rmap_walk(page, &rwc);
+	rmap_walk(&folio->page, &rwc);
 
-	return args.valid && !page_mapcount(page);
+	return args.valid && !folio_mapcount(folio);
 }
 
 /**
@@ -2245,15 +2247,16 @@ int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
 		return npages;
 
 	for (i = 0; i < npages; i++, start += PAGE_SIZE) {
-		if (!trylock_page(pages[i])) {
-			put_page(pages[i]);
+		struct folio *folio = page_folio(pages[i]);
+		if (PageTail(pages[i]) || !folio_trylock(folio)) {
+			folio_put(folio);
 			pages[i] = NULL;
 			continue;
 		}
 
-		if (!page_make_device_exclusive(pages[i], mm, start, owner)) {
-			unlock_page(pages[i]);
-			put_page(pages[i]);
+		if (!folio_make_device_exclusive(folio, mm, start, owner)) {
+			folio_unlock(folio);
+			folio_put(folio);
 			pages[i] = NULL;
 		}
 	}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 54/75] mm/migrate: Convert remove_migration_ptes() to folios
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (52 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 53/75] mm/rmap: Convert make_device_exclusive_range() to use folios Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 55/75] mm/damon: Convert damon_pa_mkold() to use a folio Matthew Wilcox (Oracle)
                   ` (22 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Convert the implementation and all callers of remove_migration_ptes()
to pass folios for both the source and destination pages.
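
As a minimal sketch (illustrative; variable names follow the
__unmap_and_move() hunk below), a caller now resolves both folios up
front and passes them straight through:

	struct folio *folio = page_folio(page);
	struct folio *dst = page_folio(newpage);

	if (page_was_mapped)
		remove_migration_ptes(folio,
			rc == MIGRATEPAGE_SUCCESS ? dst : folio, false);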

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h |  2 +-
 mm/huge_memory.c     | 24 +++++++-------
 mm/migrate.c         | 74 +++++++++++++++++++++++++-------------------
 3 files changed, 56 insertions(+), 44 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 502439f20d88..85d17a38642c 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -263,7 +263,7 @@ int folio_mkclean(struct folio *);
 void page_mlock(struct page *page);
 void folio_mlock(struct folio *folio);
 
-void remove_migration_ptes(struct page *old, struct page *new, bool locked);
+void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
 
 /*
  * Called by memory-failure.c to kill processes.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 21676a4afd07..7a0f4aaf7838 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2302,18 +2302,19 @@ static void unmap_page(struct page *page)
 	VM_WARN_ON_ONCE_PAGE(page_mapped(page), page);
 }
 
-static void remap_page(struct page *page, unsigned int nr)
+static void remap_page(struct folio *folio, unsigned long nr)
 {
-	int i;
+	int i = 0;
 
 	/* If unmap_page() uses try_to_migrate() on file, remove this check */
-	if (!PageAnon(page))
+	if (!folio_test_anon(folio))
 		return;
-	if (PageTransHuge(page)) {
-		remove_migration_ptes(page, page, true);
-	} else {
-		for (i = 0; i < nr; i++)
-			remove_migration_ptes(page + i, page + i, true);
+	for (;;) {
+		remove_migration_ptes(folio, folio, true);
+		i += folio_nr_pages(folio);
+		if (i >= nr)
+			break;
+		folio = folio_next(folio);
 	}
 }
 
@@ -2470,7 +2471,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	}
 	local_irq_enable();
 
-	remap_page(head, nr);
+	remap_page(folio, nr);
 
 	if (PageSwapCache(head)) {
 		swp_entry_t entry = { .val = page_private(head) };
@@ -2579,7 +2580,8 @@ bool can_split_huge_page(struct page *page, int *pextra_pins)
  */
 int split_huge_page_to_list(struct page *page, struct list_head *list)
 {
-	struct page *head = compound_head(page);
+	struct folio *folio = page_folio(page);
+	struct page *head = &folio->page;
 	struct deferred_split *ds_queue = get_deferred_split_queue(head);
 	XA_STATE(xas, &head->mapping->i_pages, head->index);
 	struct anon_vma *anon_vma = NULL;
@@ -2696,7 +2698,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		if (mapping)
 			xas_unlock(&xas);
 		local_irq_enable();
-		remap_page(head, thp_nr_pages(head));
+		remap_page(folio, folio_nr_pages(folio));
 		ret = -EBUSY;
 	}
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 5dcdd43d983d..4daa8298c79a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -176,34 +176,36 @@ void putback_movable_pages(struct list_head *l)
 static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 				 unsigned long addr, void *old)
 {
+	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = addr,
 		.flags = PVMW_SYNC | PVMW_MIGRATION,
 	};
-	struct page *new;
-	pte_t pte;
-	swp_entry_t entry;
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
 	pvmw_set_page(&pvmw, old);
 	while (page_vma_mapped_walk(&pvmw)) {
-		if (PageKsm(page))
-			new = page;
-		else
-			new = page - pvmw.pgoff +
-				linear_page_index(vma, pvmw.address);
+		pte_t pte;
+		swp_entry_t entry;
+		struct page *new;
+		unsigned long idx = 0;
+
+		if (!folio_test_ksm(folio))
+			idx = linear_page_index(vma, pvmw.address) - pvmw.pgoff;
+		new = folio_page(folio, idx);
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 		/* PMD-mapped THP migration entry */
 		if (!pvmw.pte) {
-			VM_BUG_ON_PAGE(PageHuge(page) || !PageTransCompound(page), page);
+			VM_BUG_ON_FOLIO(folio_test_hugetlb(folio) ||
+					!folio_test_pmd_mappable(folio), folio);
 			remove_migration_pmd(&pvmw, new);
 			continue;
 		}
 #endif
 
-		get_page(new);
+		folio_get(folio);
 		pte = pte_mkold(mk_pte(new, READ_ONCE(vma->vm_page_prot)));
 		if (pte_swp_soft_dirty(*pvmw.pte))
 			pte = pte_mksoft_dirty(pte);
@@ -232,12 +234,12 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		}
 
 #ifdef CONFIG_HUGETLB_PAGE
-		if (PageHuge(new)) {
+		if (folio_test_hugetlb(folio)) {
 			unsigned int shift = huge_page_shift(hstate_vma(vma));
 
 			pte = pte_mkhuge(pte);
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
-			if (PageAnon(new))
+			if (folio_test_anon(folio))
 				hugepage_add_anon_rmap(new, vma, pvmw.address);
 			else
 				page_dup_rmap(new, true);
@@ -245,17 +247,17 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		} else
 #endif
 		{
-			if (PageAnon(new))
+			if (folio_test_anon(folio))
 				page_add_anon_rmap(new, vma, pvmw.address, false);
 			else
 				page_add_file_rmap(new, false);
 			set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 		}
-		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new))
-			mlock_vma_page(new);
+		if (vma->vm_flags & VM_LOCKED && !folio_test_large(folio))
+			mlock_vma_folio(folio);
 
-		if (PageTransHuge(page) && PageMlocked(page))
-			clear_page_mlock(page);
+		if (folio_test_large(folio) && folio_test_mlocked(folio))
+			folio_end_mlock(folio);
 
 		/* No need to invalidate - it was non-present before */
 		update_mmu_cache(vma, pvmw.address, pvmw.pte);
@@ -268,17 +270,17 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
-void remove_migration_ptes(struct page *old, struct page *new, bool locked)
+void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
 {
 	struct rmap_walk_control rwc = {
 		.rmap_one = remove_migration_pte,
-		.arg = old,
+		.arg = src,
 	};
 
 	if (locked)
-		rmap_walk_locked(new, &rwc);
+		rmap_walk_locked(&dst->page, &rwc);
 	else
-		rmap_walk(new, &rwc);
+		rmap_walk(&dst->page, &rwc);
 }
 
 /*
@@ -771,6 +773,7 @@ int buffer_migrate_page_norefs(struct address_space *mapping,
  */
 static int writeout(struct address_space *mapping, struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
 		.nr_to_write = 1,
@@ -796,7 +799,7 @@ static int writeout(struct address_space *mapping, struct page *page)
 	 * At this point we know that the migration attempt cannot
 	 * be successful.
 	 */
-	remove_migration_ptes(page, page, false);
+	remove_migration_ptes(folio, folio, false);
 
 	rc = mapping->a_ops->writepage(page, &wbc);
 
@@ -928,6 +931,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 				int force, enum migrate_mode mode)
 {
 	struct folio *folio = page_folio(page);
+	struct folio *dst = page_folio(newpage);
 	int rc = -EAGAIN;
 	bool page_was_mapped = false;
 	struct anon_vma *anon_vma = NULL;
@@ -1039,8 +1043,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		rc = move_to_new_page(newpage, page, mode);
 
 	if (page_was_mapped)
-		remove_migration_ptes(page,
-			rc == MIGRATEPAGE_SUCCESS ? newpage : page, false);
+		remove_migration_ptes(folio,
+			rc == MIGRATEPAGE_SUCCESS ? dst : folio, false);
 
 out_unlock_both:
 	unlock_page(newpage);
@@ -1174,7 +1178,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 				enum migrate_mode mode, int reason,
 				struct list_head *ret)
 {
-	struct folio *src = page_folio(hpage);
+	struct folio *dst, *src = page_folio(hpage);
 	int rc = -EAGAIN;
 	int page_was_mapped = 0;
 	struct page *new_hpage;
@@ -1202,6 +1206,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	new_hpage = get_new_page(hpage, private);
 	if (!new_hpage)
 		return -ENOMEM;
+	dst = page_folio(new_hpage);
 
 	if (!trylock_page(hpage)) {
 		if (!force)
@@ -1262,8 +1267,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 		rc = move_to_new_page(new_hpage, hpage, mode);
 
 	if (page_was_mapped)
-		remove_migration_ptes(hpage,
-			rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
+		remove_migration_ptes(src,
+			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
 
 unlock_put_anon:
 	unlock_page(new_hpage);
@@ -2494,15 +2499,17 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
 
 	for (i = 0; i < npages && restore; i++) {
 		struct page *page = migrate_pfn_to_page(migrate->src[i]);
+		struct folio *folio;
 
 		if (!page || (migrate->src[i] & MIGRATE_PFN_MIGRATE))
 			continue;
 
-		remove_migration_ptes(page, page, false);
+		folio = page_folio(page);
+		remove_migration_ptes(folio, folio, false);
 
 		migrate->src[i] = 0;
-		unlock_page(page);
-		put_page(page);
+		folio_unlock(folio);
+		folio_put(folio);
 		restore--;
 	}
 }
@@ -2851,6 +2858,7 @@ void migrate_vma_finalize(struct migrate_vma *migrate)
 	unsigned long i;
 
 	for (i = 0; i < npages; i++) {
+		struct folio *dst, *src;
 		struct page *newpage = migrate_pfn_to_page(migrate->dst[i]);
 		struct page *page = migrate_pfn_to_page(migrate->src[i]);
 
@@ -2870,8 +2878,10 @@ void migrate_vma_finalize(struct migrate_vma *migrate)
 			newpage = page;
 		}
 
-		remove_migration_ptes(page, newpage, false);
-		unlock_page(page);
+		src = page_folio(page);
+		dst = page_folio(newpage);
+		remove_migration_ptes(src, dst, false);
+		folio_unlock(src);
 
 		if (is_zone_device_page(page))
 			put_page(page);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 55/75] mm/damon: Convert damon_pa_mkold() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (53 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 54/75] mm/migrate: Convert remove_migration_ptes() to folios Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 56/75] mm/damon: Convert damon_pa_young() " Matthew Wilcox (Oracle)
                   ` (21 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Ensure that we're passing the entire folio to rmap_walk().
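
A hedged sketch of the resulting pattern in damon_pa_mkold()
(illustrative; the rmap_walk_control setup and the lock handling are
elided here and unchanged in spirit):

	struct page *page = damon_get_page(PHYS_PFN(paddr));
	struct folio *folio;

	if (!page)
		return;
	folio = page_folio(page);
	...
	rmap_walk(&folio->page, &rwc);	/* always passes the head page */
	folio_put(folio);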

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/damon/paddr.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 4e27d64abbb7..a92d8b146527 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -37,6 +37,7 @@ static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma,
 
 static void damon_pa_mkold(unsigned long paddr)
 {
+	struct folio *folio;
 	struct page *page = damon_get_page(PHYS_PFN(paddr));
 	struct rmap_walk_control rwc = {
 		.rmap_one = __damon_pa_mkold,
@@ -46,23 +47,24 @@ static void damon_pa_mkold(unsigned long paddr)
 
 	if (!page)
 		return;
+	folio = page_folio(page);
 
-	if (!page_mapped(page) || !page_rmapping(page)) {
-		set_page_idle(page);
+	if (!folio_mapped(folio) || !folio_raw_mapping(folio)) {
+		folio_set_idle(folio);
 		goto out;
 	}
 
-	need_lock = !PageAnon(page) || PageKsm(page);
-	if (need_lock && !trylock_page(page))
+	need_lock = !folio_test_anon(folio) || folio_test_ksm(folio);
+	if (need_lock && !folio_trylock(folio))
 		goto out;
 
-	rmap_walk(page, &rwc);
+	rmap_walk(&folio->page, &rwc);
 
 	if (need_lock)
-		unlock_page(page);
+		folio_unlock(folio);
 
 out:
-	put_page(page);
+	folio_put(folio);
 }
 
 static void __damon_pa_prepare_access_check(struct damon_ctx *ctx,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 56/75] mm/damon: Convert damon_pa_young() to use a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (54 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 55/75] mm/damon: Convert damon_pa_mkold() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 57/75] mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() Matthew Wilcox (Oracle)
                   ` (20 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Ensure that we're passing the entire folio to rmap_walk().
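
For illustration (mirroring the __damon_pa_young() hunk below), the
per-mapping access check now consults the folio's idle flag rather
than the page's:

	if (pvmw.pte)
		result->accessed = pte_young(*pvmw.pte) ||
			!folio_test_idle(folio) ||
			mmu_notifier_test_young(vma->vm_mm, addr);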

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/damon/paddr.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index a92d8b146527..05e85a131a49 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -94,6 +94,7 @@ struct damon_pa_access_chk_result {
 static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma,
 		unsigned long addr, void *arg)
 {
+	struct folio *folio = page_folio(page);
 	struct damon_pa_access_chk_result *result = arg;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -107,12 +108,12 @@ static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma,
 		addr = pvmw.address;
 		if (pvmw.pte) {
 			result->accessed = pte_young(*pvmw.pte) ||
-				!page_is_idle(page) ||
+				!folio_test_idle(folio) ||
 				mmu_notifier_test_young(vma->vm_mm, addr);
 		} else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 			result->accessed = pmd_young(*pvmw.pmd) ||
-				!page_is_idle(page) ||
+				!folio_test_idle(folio) ||
 				mmu_notifier_test_young(vma->vm_mm, addr);
 			result->page_sz = ((1UL) << HPAGE_PMD_SHIFT);
 #else
@@ -131,6 +132,7 @@ static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma,
 
 static bool damon_pa_young(unsigned long paddr, unsigned long *page_sz)
 {
+	struct folio *folio;
 	struct page *page = damon_get_page(PHYS_PFN(paddr));
 	struct damon_pa_access_chk_result result = {
 		.page_sz = PAGE_SIZE,
@@ -145,27 +147,28 @@ static bool damon_pa_young(unsigned long paddr, unsigned long *page_sz)
 
 	if (!page)
 		return false;
+	folio = page_folio(page);
 
-	if (!page_mapped(page) || !page_rmapping(page)) {
-		if (page_is_idle(page))
+	if (!folio_mapped(folio) || !folio_raw_mapping(folio)) {
+		if (folio_test_idle(folio))
 			result.accessed = false;
 		else
 			result.accessed = true;
-		put_page(page);
+		folio_put(folio);
 		goto out;
 	}
 
-	need_lock = !PageAnon(page) || PageKsm(page);
-	if (need_lock && !trylock_page(page)) {
-		put_page(page);
+	need_lock = !folio_test_anon(folio) || folio_test_ksm(folio);
+	if (need_lock && !folio_trylock(folio)) {
+		folio_put(folio);
 		return NULL;
 	}
 
-	rmap_walk(page, &rwc);
+	rmap_walk(&folio->page, &rwc);
 
 	if (need_lock)
-		unlock_page(page);
-	put_page(page);
+		folio_unlock(folio);
+	folio_put(folio);
 
 out:
 	*page_sz = result.page_sz;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 57/75] mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (55 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 56/75] mm/damon: Convert damon_pa_young() " Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 58/75] mm: Turn page_anon_vma() into folio_anon_vma() Matthew Wilcox (Oracle)
                   ` (19 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add back page_lock_anon_vma_read() as a wrapper around
folio_lock_anon_vma_read().  This saves a few calls to compound_head().
If any callers were passing a tail page before, this would have failed
to lock the anon VMA, because page->mapping is not valid for tail pages.
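
A minimal sketch of a converted caller (mirroring the memory-failure.c
hunk below):

	struct folio *folio = page_folio(page);
	struct anon_vma *av = folio_lock_anon_vma_read(folio);

	if (av == NULL)		/* not actually mapped anymore */
		return;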

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h |  1 +
 mm/folio-compat.c    |  5 +++++
 mm/memory-failure.c  |  3 ++-
 mm/rmap.c            | 12 ++++++------
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 85d17a38642c..71798112a575 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -269,6 +269,7 @@ void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
  * Called by memory-failure.c to kill processes.
  */
 struct anon_vma *page_lock_anon_vma_read(struct page *page);
+struct anon_vma *folio_lock_anon_vma_read(struct folio *folio);
 void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
 int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 3804fd8c1f20..e04fba5e45e5 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -185,3 +185,8 @@ void page_mlock(struct page *page)
 {
 	folio_mlock(page_folio(page));
 }
+
+struct anon_vma *page_lock_anon_vma_read(struct page *page)
+{
+	return folio_lock_anon_vma_read(page_folio(page));
+}
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1c7a71b5248e..ed1a47d9c35d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -487,12 +487,13 @@ static struct task_struct *task_early_kill(struct task_struct *tsk,
 static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 				int force_early)
 {
+	struct folio *folio = page_folio(page);
 	struct vm_area_struct *vma;
 	struct task_struct *tsk;
 	struct anon_vma *av;
 	pgoff_t pgoff;
 
-	av = page_lock_anon_vma_read(page);
+	av = folio_lock_anon_vma_read(folio);
 	if (av == NULL)	/* Not actually mapped anymore */
 		return;
 
diff --git a/mm/rmap.c b/mm/rmap.c
index ffc1b2f0cf24..ba65d5d3eb5a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -526,28 +526,28 @@ struct anon_vma *page_get_anon_vma(struct page *page)
  * atomic op -- the trylock. If we fail the trylock, we fall back to getting a
  * reference like with page_get_anon_vma() and then block on the mutex.
  */
-struct anon_vma *page_lock_anon_vma_read(struct page *page)
+struct anon_vma *folio_lock_anon_vma_read(struct folio *folio)
 {
 	struct anon_vma *anon_vma = NULL;
 	struct anon_vma *root_anon_vma;
 	unsigned long anon_mapping;
 
 	rcu_read_lock();
-	anon_mapping = (unsigned long)READ_ONCE(page->mapping);
+	anon_mapping = (unsigned long)READ_ONCE(folio->mapping);
 	if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
 		goto out;
-	if (!page_mapped(page))
+	if (!folio_mapped(folio))
 		goto out;
 
 	anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
 	root_anon_vma = READ_ONCE(anon_vma->root);
 	if (down_read_trylock(&root_anon_vma->rwsem)) {
 		/*
-		 * If the page is still mapped, then this anon_vma is still
+		 * If the folio is still mapped, then this anon_vma is still
 		 * its anon_vma, and holding the mutex ensures that it will
 		 * not go away, see anon_vma_free().
 		 */
-		if (!page_mapped(page)) {
+		if (!folio_mapped(folio)) {
 			up_read(&root_anon_vma->rwsem);
 			anon_vma = NULL;
 		}
@@ -560,7 +560,7 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page)
 		goto out;
 	}
 
-	if (!page_mapped(page)) {
+	if (!folio_mapped(folio)) {
 		rcu_read_unlock();
 		put_anon_vma(anon_vma);
 		return NULL;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 58/75] mm: Turn page_anon_vma() into folio_anon_vma()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (56 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 57/75] mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 59/75] mm/rmap: Convert rmap_walk() to take a folio Matthew Wilcox (Oracle)
                   ` (18 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Move the prototype from mm.h to mm/internal.h and convert all callers
to pass a folio.
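
As a short illustration (mirroring the ksm.c hunk below), callers now
derive the folio before asking for its anon_vma:

	struct folio *folio = page_folio(page);
	struct anon_vma *anon_vma = folio_anon_vma(folio);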

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h |  1 -
 mm/internal.h      |  1 +
 mm/ksm.c           |  3 ++-
 mm/rmap.c          | 19 ++++++++++++-------
 mm/util.c          |  3 +--
 5 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 028bd9336e82..74d9cda7cfd6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1765,7 +1765,6 @@ static inline void *folio_address(const struct folio *folio)
 }
 
 extern void *page_rmapping(struct page *page);
-extern struct anon_vma *page_anon_vma(struct page *page);
 extern pgoff_t __page_file_index(struct page *page);
 
 /*
diff --git a/mm/internal.h b/mm/internal.h
index 66645972cbd7..360256e4ee06 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -393,6 +393,7 @@ static inline bool is_data_mapping(vm_flags_t flags)
 void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
 		struct vm_area_struct *prev);
 void __vma_unlink_list(struct mm_struct *mm, struct vm_area_struct *vma);
+struct anon_vma *folio_anon_vma(struct folio *folio);
 
 #ifdef CONFIG_MMU
 void unmap_mapping_folio(struct folio *folio);
diff --git a/mm/ksm.c b/mm/ksm.c
index 1639160c9e9a..212186dbc89f 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2567,7 +2567,8 @@ void __ksm_exit(struct mm_struct *mm)
 struct page *ksm_might_need_to_copy(struct page *page,
 			struct vm_area_struct *vma, unsigned long address)
 {
-	struct anon_vma *anon_vma = page_anon_vma(page);
+	struct folio *folio = page_folio(page);
+	struct anon_vma *anon_vma = folio_anon_vma(folio);
 	struct page *new_page;
 
 	if (PageKsm(page)) {
diff --git a/mm/rmap.c b/mm/rmap.c
index ba65d5d3eb5a..8bbbbea483cf 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -737,8 +737,9 @@ static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
  */
 unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
 {
-	if (PageAnon(page)) {
-		struct anon_vma *page__anon_vma = page_anon_vma(page);
+	struct folio *folio = page_folio(page);
+	if (folio_test_anon(folio)) {
+		struct anon_vma *page__anon_vma = folio_anon_vma(folio);
 		/*
 		 * Note: swapoff's unuse_vma() is more efficient with this
 		 * check, and needs it to match anon_vma when KSM is active.
@@ -748,7 +749,7 @@ unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
 			return -EFAULT;
 	} else if (!vma->vm_file) {
 		return -EFAULT;
-	} else if (vma->vm_file->f_mapping != compound_head(page)->mapping) {
+	} else if (vma->vm_file->f_mapping != folio->mapping) {
 		return -EFAULT;
 	}
 
@@ -1109,6 +1110,7 @@ static void __page_set_anon_rmap(struct page *page,
 static void __page_check_anon_rmap(struct page *page,
 	struct vm_area_struct *vma, unsigned long address)
 {
+	struct folio *folio = page_folio(page);
 	/*
 	 * The page's anon-rmap details (mapping and index) are guaranteed to
 	 * be set up correctly at this point.
@@ -1120,7 +1122,8 @@ static void __page_check_anon_rmap(struct page *page,
 	 * are initially only visible via the pagetables, and the pte is locked
 	 * over the call to page_add_new_anon_rmap.
 	 */
-	VM_BUG_ON_PAGE(page_anon_vma(page)->root != vma->anon_vma->root, page);
+	VM_BUG_ON_FOLIO(folio_anon_vma(folio)->root != vma->anon_vma->root,
+			folio);
 	VM_BUG_ON_PAGE(page_to_pgoff(page) != linear_page_index(vma, address),
 		       page);
 }
@@ -2278,6 +2281,7 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 static struct anon_vma *rmap_walk_anon_lock(struct page *page,
 					struct rmap_walk_control *rwc)
 {
+	struct folio *folio = page_folio(page);
 	struct anon_vma *anon_vma;
 
 	if (rwc->anon_lock)
@@ -2289,7 +2293,7 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page,
 	 * are holding mmap_lock. Users without mmap_lock are required to
 	 * take a reference count to prevent the anon_vma disappearing
 	 */
-	anon_vma = page_anon_vma(page);
+	anon_vma = folio_anon_vma(folio);
 	if (!anon_vma)
 		return NULL;
 
@@ -2314,14 +2318,15 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page,
 static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
 		bool locked)
 {
+	struct folio *folio = page_folio(page);
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff_start, pgoff_end;
 	struct anon_vma_chain *avc;
 
 	if (locked) {
-		anon_vma = page_anon_vma(page);
+		anon_vma = folio_anon_vma(folio);
 		/* anon_vma disappear under us? */
-		VM_BUG_ON_PAGE(!anon_vma, page);
+		VM_BUG_ON_FOLIO(!anon_vma, folio);
 	} else {
 		anon_vma = rmap_walk_anon_lock(page, rwc);
 	}
diff --git a/mm/util.c b/mm/util.c
index b614f423aaa4..13fc88ac8e70 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -679,9 +679,8 @@ bool folio_mapped(struct folio *folio)
 }
 EXPORT_SYMBOL(folio_mapped);
 
-struct anon_vma *page_anon_vma(struct page *page)
+struct anon_vma *folio_anon_vma(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	unsigned long mapping = (unsigned long)folio->mapping;
 
 	if ((mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 59/75] mm/rmap: Convert rmap_walk() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (57 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 58/75] mm: Turn page_anon_vma() into folio_anon_vma() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 60/75] mm/rmap: Constify the rmap_walk_control argument Matthew Wilcox (Oracle)
                   ` (17 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This change ripples all the way through rmap: every function that
rmap_walk() calls, and every function that calls it, now deals in
folios.
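
A hedged sketch of the resulting calling convention (illustrative;
it mirrors try_to_unmap() in the hunk below, with the .arg member
elided):

	struct rmap_walk_control rwc = {
		.rmap_one	= try_to_unmap_one,	/* now takes a struct folio * */
		.done		= page_not_mapped,	/* now takes a struct folio * */
		.anon_lock	= folio_lock_anon_vma_read,
	};

	rmap_walk(folio, &rwc);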

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/ksm.h  |   4 +-
 include/linux/rmap.h |  11 ++--
 mm/damon/paddr.c     |  17 +++---
 mm/folio-compat.c    |   5 --
 mm/huge_memory.c     |   2 +-
 mm/ksm.c             |  12 ++--
 mm/migrate.c         |  12 ++--
 mm/page_idle.c       |   9 ++-
 mm/rmap.c            | 128 ++++++++++++++++++++-----------------------
 9 files changed, 91 insertions(+), 109 deletions(-)

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index a38a5bca1ba5..0b4f17418f64 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -51,7 +51,7 @@ static inline void ksm_exit(struct mm_struct *mm)
 struct page *ksm_might_need_to_copy(struct page *page,
 			struct vm_area_struct *vma, unsigned long address);
 
-void rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
+void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc);
 void folio_migrate_ksm(struct folio *newfolio, struct folio *folio);
 
 #else  /* !CONFIG_KSM */
@@ -78,7 +78,7 @@ static inline struct page *ksm_might_need_to_copy(struct page *page,
 	return page;
 }
 
-static inline void rmap_walk_ksm(struct page *page,
+static inline void rmap_walk_ksm(struct folio *folio,
 			struct rmap_walk_control *rwc)
 {
 }
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 71798112a575..4e4c4412b295 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -268,7 +268,6 @@ void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
 /*
  * Called by memory-failure.c to kill processes.
  */
-struct anon_vma *page_lock_anon_vma_read(struct page *page);
 struct anon_vma *folio_lock_anon_vma_read(struct folio *folio);
 void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
 int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
@@ -288,15 +287,15 @@ struct rmap_walk_control {
 	 * Return false if page table scanning in rmap_walk should be stopped.
 	 * Otherwise, return true.
 	 */
-	bool (*rmap_one)(struct page *page, struct vm_area_struct *vma,
+	bool (*rmap_one)(struct folio *folio, struct vm_area_struct *vma,
 					unsigned long addr, void *arg);
-	int (*done)(struct page *page);
-	struct anon_vma *(*anon_lock)(struct page *page);
+	int (*done)(struct folio *folio);
+	struct anon_vma *(*anon_lock)(struct folio *folio);
 	bool (*invalid_vma)(struct vm_area_struct *vma, void *arg);
 };
 
-void rmap_walk(struct page *page, struct rmap_walk_control *rwc);
-void rmap_walk_locked(struct page *page, struct rmap_walk_control *rwc);
+void rmap_walk(struct folio *folio, struct rmap_walk_control *rwc);
+void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc);
 
 #else	/* !CONFIG_MMU */
 
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 05e85a131a49..d336eafb74f8 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -16,7 +16,7 @@
 #include "../internal.h"
 #include "prmtv-common.h"
 
-static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma,
+static bool __damon_pa_mkold(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long addr, void *arg)
 {
 	struct page_vma_mapped_walk pvmw = {
@@ -24,7 +24,7 @@ static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma,
 		.address = addr,
 	};
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	while (page_vma_mapped_walk(&pvmw)) {
 		addr = pvmw.address;
 		if (pvmw.pte)
@@ -41,7 +41,7 @@ static void damon_pa_mkold(unsigned long paddr)
 	struct page *page = damon_get_page(PHYS_PFN(paddr));
 	struct rmap_walk_control rwc = {
 		.rmap_one = __damon_pa_mkold,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 	};
 	bool need_lock;
 
@@ -58,7 +58,7 @@ static void damon_pa_mkold(unsigned long paddr)
 	if (need_lock && !folio_trylock(folio))
 		goto out;
 
-	rmap_walk(&folio->page, &rwc);
+	rmap_walk(folio, &rwc);
 
 	if (need_lock)
 		folio_unlock(folio);
@@ -91,17 +91,16 @@ struct damon_pa_access_chk_result {
 	bool accessed;
 };
 
-static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma,
+static bool __damon_pa_young(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long addr, void *arg)
 {
-	struct folio *folio = page_folio(page);
 	struct damon_pa_access_chk_result *result = arg;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = addr,
 	};
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	result->accessed = false;
 	result->page_sz = PAGE_SIZE;
 	while (page_vma_mapped_walk(&pvmw)) {
@@ -141,7 +140,7 @@ static bool damon_pa_young(unsigned long paddr, unsigned long *page_sz)
 	struct rmap_walk_control rwc = {
 		.arg = &result,
 		.rmap_one = __damon_pa_young,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 	};
 	bool need_lock;
 
@@ -164,7 +163,7 @@ static bool damon_pa_young(unsigned long paddr, unsigned long *page_sz)
 		return NULL;
 	}
 
-	rmap_walk(&folio->page, &rwc);
+	rmap_walk(folio, &rwc);
 
 	if (need_lock)
 		folio_unlock(folio);
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index e04fba5e45e5..3804fd8c1f20 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -185,8 +185,3 @@ void page_mlock(struct page *page)
 {
 	folio_mlock(page_folio(page));
 }
-
-struct anon_vma *page_lock_anon_vma_read(struct page *page)
-{
-	return folio_lock_anon_vma_read(page_folio(page));
-}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7a0f4aaf7838..f711dabc9c62 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2601,7 +2601,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		 * The caller does not necessarily hold an mmap_lock that would
 		 * prevent the anon_vma disappearing so we first we take a
 		 * reference to it and then lock the anon_vma for write. This
-		 * is similar to page_lock_anon_vma_read except the write lock
+		 * is similar to folio_lock_anon_vma_read except the write lock
 		 * is taken to serialise against parallel split or collapse
 		 * operations.
 		 */
diff --git a/mm/ksm.c b/mm/ksm.c
index 212186dbc89f..0ec3d9035419 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2601,21 +2601,21 @@ struct page *ksm_might_need_to_copy(struct page *page,
 	return new_page;
 }
 
-void rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
+void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
 {
 	struct stable_node *stable_node;
 	struct rmap_item *rmap_item;
 	int search_new_forks = 0;
 
-	VM_BUG_ON_PAGE(!PageKsm(page), page);
+	VM_BUG_ON_FOLIO(!folio_test_ksm(folio), folio);
 
 	/*
 	 * Rely on the page lock to protect against concurrent modifications
 	 * to that page's node of the stable tree.
 	 */
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
-	stable_node = page_stable_node(page);
+	stable_node = folio_stable_node(folio);
 	if (!stable_node)
 		return;
 again:
@@ -2650,11 +2650,11 @@ void rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 			if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
 				continue;
 
-			if (!rwc->rmap_one(page, vma, addr, rwc->arg)) {
+			if (!rwc->rmap_one(folio, vma, addr, rwc->arg)) {
 				anon_vma_unlock_read(anon_vma);
 				return;
 			}
-			if (rwc->done && rwc->done(page)) {
+			if (rwc->done && rwc->done(folio)) {
 				anon_vma_unlock_read(anon_vma);
 				return;
 			}
diff --git a/mm/migrate.c b/mm/migrate.c
index 4daa8298c79a..e9f369a8ee15 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -173,18 +173,16 @@ void putback_movable_pages(struct list_head *l)
 /*
  * Restore a potential migration pte to a working pte entry
  */
-static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
-				 unsigned long addr, void *old)
+static bool remove_migration_pte(struct folio *folio,
+		struct vm_area_struct *vma, unsigned long addr, void *old)
 {
-	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = addr,
 		.flags = PVMW_SYNC | PVMW_MIGRATION,
 	};
 
-	VM_BUG_ON_PAGE(PageTail(page), page);
-	pvmw_set_page(&pvmw, old);
+	pvmw_set_folio(&pvmw, old);
 	while (page_vma_mapped_walk(&pvmw)) {
 		pte_t pte;
 		swp_entry_t entry;
@@ -278,9 +276,9 @@ void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
 	};
 
 	if (locked)
-		rmap_walk_locked(&dst->page, &rwc);
+		rmap_walk_locked(dst, &rwc);
 	else
-		rmap_walk(&dst->page, &rwc);
+		rmap_walk(dst, &rwc);
 }
 
 /*
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 35e53db430df..3563c3850795 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -46,18 +46,17 @@ static struct page *page_idle_get_page(unsigned long pfn)
 	return page;
 }
 
-static bool page_idle_clear_pte_refs_one(struct page *page,
+static bool page_idle_clear_pte_refs_one(struct folio *folio,
 					struct vm_area_struct *vma,
 					unsigned long addr, void *arg)
 {
-	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = addr,
 	};
 	bool referenced = false;
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	while (page_vma_mapped_walk(&pvmw)) {
 		addr = pvmw.address;
 		if (pvmw.pte) {
@@ -97,7 +96,7 @@ static void page_idle_clear_pte_refs(struct page *page)
 	 */
 	static const struct rmap_walk_control rwc = {
 		.rmap_one = page_idle_clear_pte_refs_one,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 	};
 	bool need_lock;
 
@@ -108,7 +107,7 @@ static void page_idle_clear_pte_refs(struct page *page)
 	if (need_lock && !folio_trylock(folio))
 		return;
 
-	rmap_walk(&folio->page, (struct rmap_walk_control *)&rwc);
+	rmap_walk(folio, (struct rmap_walk_control *)&rwc);
 
 	if (need_lock)
 		folio_unlock(folio);
diff --git a/mm/rmap.c b/mm/rmap.c
index 8bbbbea483cf..1ade44970ab1 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -107,15 +107,15 @@ static inline void anon_vma_free(struct anon_vma *anon_vma)
 	VM_BUG_ON(atomic_read(&anon_vma->refcount));
 
 	/*
-	 * Synchronize against page_lock_anon_vma_read() such that
+	 * Synchronize against folio_lock_anon_vma_read() such that
 	 * we can safely hold the lock without the anon_vma getting
 	 * freed.
 	 *
 	 * Relies on the full mb implied by the atomic_dec_and_test() from
 	 * put_anon_vma() against the acquire barrier implied by
-	 * down_read_trylock() from page_lock_anon_vma_read(). This orders:
+	 * down_read_trylock() from folio_lock_anon_vma_read(). This orders:
 	 *
-	 * page_lock_anon_vma_read()	VS	put_anon_vma()
+	 * folio_lock_anon_vma_read()	VS	put_anon_vma()
 	 *   down_read_trylock()		  atomic_dec_and_test()
 	 *   LOCK				  MB
 	 *   atomic_read()			  rwsem_is_locked()
@@ -168,7 +168,7 @@ static void anon_vma_chain_link(struct vm_area_struct *vma,
  * allocate a new one.
  *
  * Anon-vma allocations are very subtle, because we may have
- * optimistically looked up an anon_vma in page_lock_anon_vma_read()
+ * optimistically looked up an anon_vma in folio_lock_anon_vma_read()
  * and that may actually touch the rwsem even in the newly
  * allocated vma (it depends on RCU to make sure that the
  * anon_vma isn't actually destroyed).
@@ -799,10 +799,9 @@ struct page_referenced_arg {
 /*
  * arg: page_referenced_arg will be passed
  */
-static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
+static bool page_referenced_one(struct folio *folio, struct vm_area_struct *vma,
 			unsigned long address, void *arg)
 {
-	struct folio *folio = page_folio(page);
 	struct page_referenced_arg *pra = arg;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -810,7 +809,7 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 	};
 	int referenced = 0;
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;
 
@@ -895,7 +894,7 @@ int folio_referenced(struct folio *folio, int is_locked,
 	struct rmap_walk_control rwc = {
 		.rmap_one = page_referenced_one,
 		.arg = (void *)&pra,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 	};
 
 	*vm_flags = 0;
@@ -920,7 +919,7 @@ int folio_referenced(struct folio *folio, int is_locked,
 		rwc.invalid_vma = invalid_page_referenced_vma;
 	}
 
-	rmap_walk(&folio->page, &rwc);
+	rmap_walk(folio, &rwc);
 	*vm_flags = pra.vm_flags;
 
 	if (we_locked)
@@ -929,10 +928,9 @@ int folio_referenced(struct folio *folio, int is_locked,
 	return pra.referenced;
 }
 
-static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
+static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			    unsigned long address, void *arg)
 {
-	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = address,
@@ -941,7 +939,7 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	int *cleaned = arg;
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	/*
 	 * We have to assume the worse case ie pmd for invalidation. Note that
 	 * the folio can not be freed from this function.
@@ -1031,7 +1029,7 @@ int folio_mkclean(struct folio *folio)
 	if (!mapping)
 		return 0;
 
-	rmap_walk(&folio->page, &rwc);
+	rmap_walk(folio, &rwc);
 
 	return cleaned;
 }
@@ -1422,10 +1420,9 @@ void page_remove_rmap(struct page *page, bool compound)
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
-static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
+static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
-	struct folio *folio = page_folio(page);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -1437,7 +1434,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
 	 * in between its ptep_get_and_clear_full() and page_remove_rmap(),
@@ -1690,9 +1687,9 @@ static bool invalid_migration_vma(struct vm_area_struct *vma, void *arg)
 	return vma_is_temporary_stack(vma);
 }
 
-static int page_not_mapped(struct page *page)
+static int page_not_mapped(struct folio *folio)
 {
-	return !page_mapped(page);
+	return !folio_mapped(folio);
 }
 
 /**
@@ -1712,13 +1709,13 @@ void try_to_unmap(struct folio *folio, enum ttu_flags flags)
 		.rmap_one = try_to_unmap_one,
 		.arg = (void *)flags,
 		.done = page_not_mapped,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 	};
 
 	if (flags & TTU_RMAP_LOCKED)
-		rmap_walk_locked(&folio->page, &rwc);
+		rmap_walk_locked(folio, &rwc);
 	else
-		rmap_walk(&folio->page, &rwc);
+		rmap_walk(folio, &rwc);
 }
 
 /*
@@ -1727,10 +1724,9 @@ void try_to_unmap(struct folio *folio, enum ttu_flags flags)
  * If TTU_SPLIT_HUGE_PMD is specified any PMD mappings will be split into PTEs
  * containing migration entries.
  */
-static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
+static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
-	struct folio *folio = page_folio(page);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -1742,7 +1738,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
 	 * in between its ptep_get_and_clear_full() and page_remove_rmap(),
@@ -1976,7 +1972,7 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
 		.rmap_one = try_to_migrate_one,
 		.arg = (void *)flags,
 		.done = page_not_mapped,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 	};
 
 	/*
@@ -2002,25 +1998,24 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
 		rwc.invalid_vma = invalid_migration_vma;
 
 	if (flags & TTU_RMAP_LOCKED)
-		rmap_walk_locked(&folio->page, &rwc);
+		rmap_walk_locked(folio, &rwc);
 	else
-		rmap_walk(&folio->page, &rwc);
+		rmap_walk(folio, &rwc);
 }
 
 /*
  * Walks the vma's mapping a page and mlocks the page if any locked vma's are
  * found. Once one is found the page is locked and the scan can be terminated.
  */
-static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
+static bool page_mlock_one(struct folio *folio, struct vm_area_struct *vma,
 				 unsigned long address, void *unused)
 {
-	struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
 		.address = address,
 	};
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	/* An un-locked vma doesn't have any pages to lock, continue the scan */
 	if (!(vma->vm_flags & VM_LOCKED))
 		return true;
@@ -2064,7 +2059,7 @@ void folio_mlock(struct folio *folio)
 	struct rmap_walk_control rwc = {
 		.rmap_one = page_mlock_one,
 		.done = page_not_mapped,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 
 	};
 
@@ -2077,7 +2072,7 @@ void folio_mlock(struct folio *folio)
 	if (folio_test_large(folio) && folio_test_anon(folio))
 		return;
 
-	rmap_walk(&folio->page, &rwc);
+	rmap_walk(folio, &rwc);
 }
 
 #ifdef CONFIG_DEVICE_PRIVATE
@@ -2088,10 +2083,9 @@ struct make_exclusive_args {
 	bool valid;
 };
 
-static bool page_make_device_exclusive_one(struct page *page,
+static bool page_make_device_exclusive_one(struct folio *folio,
 		struct vm_area_struct *vma, unsigned long address, void *priv)
 {
-	struct folio *folio = page_folio(page);
 	struct mm_struct *mm = vma->vm_mm;
 	struct page_vma_mapped_walk pvmw = {
 		.vma = vma,
@@ -2105,7 +2099,7 @@ static bool page_make_device_exclusive_one(struct page *page,
 	swp_entry_t entry;
 	pte_t swp_pte;
 
-	pvmw_set_page(&pvmw, page);
+	pvmw_set_folio(&pvmw, folio);
 	mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0, vma,
 				      vma->vm_mm, address, min(vma->vm_end,
 				      address + folio_size(folio)),
@@ -2199,7 +2193,7 @@ static bool folio_make_device_exclusive(struct folio *folio,
 	struct rmap_walk_control rwc = {
 		.rmap_one = page_make_device_exclusive_one,
 		.done = page_not_mapped,
-		.anon_lock = page_lock_anon_vma_read,
+		.anon_lock = folio_lock_anon_vma_read,
 		.arg = &args,
 	};
 
@@ -2210,7 +2204,7 @@ static bool folio_make_device_exclusive(struct folio *folio,
 	if (!folio_test_anon(folio))
 		return false;
 
-	rmap_walk(&folio->page, &rwc);
+	rmap_walk(folio, &rwc);
 
 	return args.valid && !folio_mapcount(folio);
 }
@@ -2278,17 +2272,16 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 		anon_vma_free(root);
 }
 
-static struct anon_vma *rmap_walk_anon_lock(struct page *page,
+static struct anon_vma *rmap_walk_anon_lock(struct folio *folio,
 					struct rmap_walk_control *rwc)
 {
-	struct folio *folio = page_folio(page);
 	struct anon_vma *anon_vma;
 
 	if (rwc->anon_lock)
-		return rwc->anon_lock(page);
+		return rwc->anon_lock(folio);
 
 	/*
-	 * Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
+	 * Note: remove_migration_ptes() cannot use folio_lock_anon_vma_read()
 	 * because that depends on page_mapped(); but not all its usages
 	 * are holding mmap_lock. Users without mmap_lock are required to
 	 * take a reference count to prevent the anon_vma disappearing
@@ -2315,10 +2308,9 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page,
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
  */
-static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
+static void rmap_walk_anon(struct folio *folio, struct rmap_walk_control *rwc,
 		bool locked)
 {
-	struct folio *folio = page_folio(page);
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff_start, pgoff_end;
 	struct anon_vma_chain *avc;
@@ -2328,17 +2320,17 @@ static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
 		/* anon_vma disappear under us? */
 		VM_BUG_ON_FOLIO(!anon_vma, folio);
 	} else {
-		anon_vma = rmap_walk_anon_lock(page, rwc);
+		anon_vma = rmap_walk_anon_lock(folio, rwc);
 	}
 	if (!anon_vma)
 		return;
 
-	pgoff_start = page_to_pgoff(page);
-	pgoff_end = pgoff_start + thp_nr_pages(page) - 1;
+	pgoff_start = folio_pgoff(folio);
+	pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
 			pgoff_start, pgoff_end) {
 		struct vm_area_struct *vma = avc->vma;
-		unsigned long address = vma_address(page, vma);
+		unsigned long address = vma_address(&folio->page, vma);
 
 		VM_BUG_ON_VMA(address == -EFAULT, vma);
 		cond_resched();
@@ -2346,9 +2338,9 @@ static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
 		if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
 			continue;
 
-		if (!rwc->rmap_one(page, vma, address, rwc->arg))
+		if (!rwc->rmap_one(folio, vma, address, rwc->arg))
 			break;
-		if (rwc->done && rwc->done(page))
+		if (rwc->done && rwc->done(folio))
 			break;
 	}
 
@@ -2369,10 +2361,10 @@ static void rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc,
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
  */
-static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
+static void rmap_walk_file(struct folio *folio, struct rmap_walk_control *rwc,
 		bool locked)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct address_space *mapping = folio_mapping(folio);
 	pgoff_t pgoff_start, pgoff_end;
 	struct vm_area_struct *vma;
 
@@ -2382,18 +2374,18 @@ static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
 	 * structure at mapping cannot be freed and reused yet,
 	 * so we can safely take mapping->i_mmap_rwsem.
 	 */
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
 	if (!mapping)
 		return;
 
-	pgoff_start = page_to_pgoff(page);
-	pgoff_end = pgoff_start + thp_nr_pages(page) - 1;
+	pgoff_start = folio_pgoff(folio);
+	pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;
 	if (!locked)
 		i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap,
 			pgoff_start, pgoff_end) {
-		unsigned long address = vma_address(page, vma);
+		unsigned long address = vma_address(&folio->page, vma);
 
 		VM_BUG_ON_VMA(address == -EFAULT, vma);
 		cond_resched();
@@ -2401,9 +2393,9 @@ static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
 		if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
 			continue;
 
-		if (!rwc->rmap_one(page, vma, address, rwc->arg))
+		if (!rwc->rmap_one(folio, vma, address, rwc->arg))
 			goto done;
-		if (rwc->done && rwc->done(page))
+		if (rwc->done && rwc->done(folio))
 			goto done;
 	}
 
@@ -2412,25 +2404,25 @@ static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
 		i_mmap_unlock_read(mapping);
 }
 
-void rmap_walk(struct page *page, struct rmap_walk_control *rwc)
+void rmap_walk(struct folio *folio, struct rmap_walk_control *rwc)
 {
-	if (unlikely(PageKsm(page)))
-		rmap_walk_ksm(page, rwc);
-	else if (PageAnon(page))
-		rmap_walk_anon(page, rwc, false);
+	if (unlikely(folio_test_ksm(folio)))
+		rmap_walk_ksm(folio, rwc);
+	else if (folio_test_anon(folio))
+		rmap_walk_anon(folio, rwc, false);
 	else
-		rmap_walk_file(page, rwc, false);
+		rmap_walk_file(folio, rwc, false);
 }
 
 /* Like rmap_walk, but caller holds relevant rmap lock */
-void rmap_walk_locked(struct page *page, struct rmap_walk_control *rwc)
+void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc)
 {
 	/* no ksm support for now */
-	VM_BUG_ON_PAGE(PageKsm(page), page);
-	if (PageAnon(page))
-		rmap_walk_anon(page, rwc, true);
+	VM_BUG_ON_FOLIO(folio_test_ksm(folio), folio);
+	if (folio_test_anon(folio))
+		rmap_walk_anon(folio, rwc, true);
 	else
-		rmap_walk_file(page, rwc, true);
+		rmap_walk_file(folio, rwc, true);
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 60/75] mm/rmap: Constify the rmap_walk_control argument
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (58 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 59/75] mm/rmap: Convert rmap_walk() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 61/75] mm/vmscan: Free non-shmem folios without splitting them Matthew Wilcox (Oracle)
                   ` (16 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The rmap walking functions do not modify the rmap_walk_control, and
page_idle_clear_pte_refs() takes advantage of that to move construction
of the rmap_walk_control to compile time.  This lets us remove an
unclean cast.
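
As an illustration of what this enables, here is a minimal sketch (not
part of the patch; the function names are made up and <linux/rmap.h> is
assumed): a caller can now build its rmap_walk_control as a static
const object and pass it to rmap_walk() without the cast that
page_idle_clear_pte_refs() previously needed.

	static bool example_rmap_one(struct folio *folio,
			struct vm_area_struct *vma, unsigned long address,
			void *arg)
	{
		return true;	/* true means "keep walking" */
	}

	static void example_walk(struct folio *folio)
	{
		static const struct rmap_walk_control rwc = {
			.rmap_one = example_rmap_one,
			.anon_lock = folio_lock_anon_vma_read,
		};

		/* No (struct rmap_walk_control *) cast needed any more. */
		rmap_walk(folio, &rwc);
	}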

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/ksm.h  |  4 ++--
 include/linux/rmap.h |  4 ++--
 mm/ksm.c             |  2 +-
 mm/page_idle.c       |  2 +-
 mm/rmap.c            | 14 +++++++-------
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 0b4f17418f64..0630e545f4cb 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -51,7 +51,7 @@ static inline void ksm_exit(struct mm_struct *mm)
 struct page *ksm_might_need_to_copy(struct page *page,
 			struct vm_area_struct *vma, unsigned long address);
 
-void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc);
+void rmap_walk_ksm(struct folio *folio, const struct rmap_walk_control *rwc);
 void folio_migrate_ksm(struct folio *newfolio, struct folio *folio);
 
 #else  /* !CONFIG_KSM */
@@ -79,7 +79,7 @@ static inline struct page *ksm_might_need_to_copy(struct page *page,
 }
 
 static inline void rmap_walk_ksm(struct folio *folio,
-			struct rmap_walk_control *rwc)
+			const struct rmap_walk_control *rwc)
 {
 }
 
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 4e4c4412b295..96522944739e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -294,8 +294,8 @@ struct rmap_walk_control {
 	bool (*invalid_vma)(struct vm_area_struct *vma, void *arg);
 };
 
-void rmap_walk(struct folio *folio, struct rmap_walk_control *rwc);
-void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc);
+void rmap_walk(struct folio *folio, const struct rmap_walk_control *rwc);
+void rmap_walk_locked(struct folio *folio, const struct rmap_walk_control *rwc);
 
 #else	/* !CONFIG_MMU */
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 0ec3d9035419..e95c454303a2 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2601,7 +2601,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
 	return new_page;
 }
 
-void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
+void rmap_walk_ksm(struct folio *folio, const struct rmap_walk_control *rwc)
 {
 	struct stable_node *stable_node;
 	struct rmap_item *rmap_item;
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 3563c3850795..982f35d91b96 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -107,7 +107,7 @@ static void page_idle_clear_pte_refs(struct page *page)
 	if (need_lock && !folio_trylock(folio))
 		return;
 
-	rmap_walk(folio, (struct rmap_walk_control *)&rwc);
+	rmap_walk(folio, &rwc);
 
 	if (need_lock)
 		folio_unlock(folio);
diff --git a/mm/rmap.c b/mm/rmap.c
index 1ade44970ab1..1d22cb825931 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2273,7 +2273,7 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 }
 
 static struct anon_vma *rmap_walk_anon_lock(struct folio *folio,
-					struct rmap_walk_control *rwc)
+					const struct rmap_walk_control *rwc)
 {
 	struct anon_vma *anon_vma;
 
@@ -2308,8 +2308,8 @@ static struct anon_vma *rmap_walk_anon_lock(struct folio *folio,
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
  */
-static void rmap_walk_anon(struct folio *folio, struct rmap_walk_control *rwc,
-		bool locked)
+static void rmap_walk_anon(struct folio *folio,
+		const struct rmap_walk_control *rwc, bool locked)
 {
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff_start, pgoff_end;
@@ -2361,8 +2361,8 @@ static void rmap_walk_anon(struct folio *folio, struct rmap_walk_control *rwc,
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
  */
-static void rmap_walk_file(struct folio *folio, struct rmap_walk_control *rwc,
-		bool locked)
+static void rmap_walk_file(struct folio *folio,
+		const struct rmap_walk_control *rwc, bool locked)
 {
 	struct address_space *mapping = folio_mapping(folio);
 	pgoff_t pgoff_start, pgoff_end;
@@ -2404,7 +2404,7 @@ static void rmap_walk_file(struct folio *folio, struct rmap_walk_control *rwc,
 		i_mmap_unlock_read(mapping);
 }
 
-void rmap_walk(struct folio *folio, struct rmap_walk_control *rwc)
+void rmap_walk(struct folio *folio, const struct rmap_walk_control *rwc)
 {
 	if (unlikely(folio_test_ksm(folio)))
 		rmap_walk_ksm(folio, rwc);
@@ -2415,7 +2415,7 @@ void rmap_walk(struct folio *folio, struct rmap_walk_control *rwc)
 }
 
 /* Like rmap_walk, but caller holds relevant rmap lock */
-void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc)
+void rmap_walk_locked(struct folio *folio, const struct rmap_walk_control *rwc)
 {
 	/* no ksm support for now */
 	VM_BUG_ON_FOLIO(folio_test_ksm(folio), folio);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 61/75] mm/vmscan: Free non-shmem folios without splitting them
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (59 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 60/75] mm/rmap: Constify the rmap_walk_control argument Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 62/75] mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We have to allocate memory in order to split a file-backed folio, so
it's not a good idea to split one in the memory freeing path.  It also
doesn't work for XFS because pages have an extra reference count from
page_has_private() and split_huge_page() expects that reference to have
already been removed.  Unfortunately, we still have to split shmem THPs
because we can't handle swapping out an entire THP yet.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmscan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2e94e0b15a76..794cba8511f1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1732,8 +1732,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 				/* Adding to swap updated mapping */
 				mapping = page_mapping(page);
 			}
-		} else if (unlikely(PageTransHuge(page))) {
-			/* Split file THP */
+		} else if (PageSwapBacked(page) && PageTransHuge(page)) {
+			/* Split shmem THP */
 			if (split_folio_to_list(folio, page_list))
 				goto keep_locked;
 		}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 62/75] mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (60 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 61/75] mm/vmscan: Free non-shmem folios without splitting them Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 63/75] mm/vmscan: Account large folios correctly Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

A large folio which is smaller than a PMD does not need to do the extra
work in try_to_unmap() of trying to split a PMD entry.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmscan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 794cba8511f1..edcca2424eaa 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1758,7 +1758,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			enum ttu_flags flags = TTU_BATCH_FLUSH;
 			bool was_swapbacked = PageSwapBacked(page);
 
-			if (unlikely(PageTransHuge(page)))
+			if (PageTransHuge(page) &&
+					thp_order(page) >= HPAGE_PMD_ORDER)
 				flags |= TTU_SPLIT_HUGE_PMD;
 
 			try_to_unmap(folio, flags);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 63/75] mm/vmscan: Account large folios correctly
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (61 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 62/75] mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 64/75] mm/vmscan: Turn page_check_references() into folio_check_references() Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The statistics we gather should count the number of pages, not the
number of folios.  The logic in this function is somewhat convoluted,
but even if we split the folio, I think the accounting is now correct.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmscan.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index edcca2424eaa..5ceed53cb326 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1568,10 +1568,10 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		 */
 		folio_check_dirty_writeback(folio, &dirty, &writeback);
 		if (dirty || writeback)
-			stat->nr_dirty++;
+			stat->nr_dirty += nr_pages;
 
 		if (dirty && !writeback)
-			stat->nr_unqueued_dirty++;
+			stat->nr_unqueued_dirty += nr_pages;
 
 		/*
 		 * Treat this page as congested if the underlying BDI is or if
@@ -1583,7 +1583,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		if (((dirty || writeback) && mapping &&
 		     inode_write_congested(mapping->host)) ||
 		    (writeback && PageReclaim(page)))
-			stat->nr_congested++;
+			stat->nr_congested += nr_pages;
 
 		/*
 		 * If a page at the tail of the LRU is under writeback, there
@@ -1632,7 +1632,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			if (current_is_kswapd() &&
 			    PageReclaim(page) &&
 			    test_bit(PGDAT_WRITEBACK, &pgdat->flags)) {
-				stat->nr_immediate++;
+				stat->nr_immediate += nr_pages;
 				goto activate_locked;
 
 			/* Case 2 above */
@@ -1650,7 +1650,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 				 * and it's also appropriate in global reclaim.
 				 */
 				SetPageReclaim(page);
-				stat->nr_writeback++;
+				stat->nr_writeback += nr_pages;
 				goto activate_locked;
 
 			/* Case 3 above */
@@ -1816,7 +1816,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			case PAGE_ACTIVATE:
 				goto activate_locked;
 			case PAGE_SUCCESS:
-				stat->nr_pageout += thp_nr_pages(page);
+				stat->nr_pageout += nr_pages;
 
 				if (PageWriteback(page))
 					goto keep;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 64/75] mm/vmscan: Turn page_check_references() into folio_check_references()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (62 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 63/75] mm/vmscan: Account large folios correctly Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 65/75] mm/vmscan: Convert pageout() to take a folio Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This function only has one caller, and it already has a folio.  This
removes a number of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/vmscan.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5ceed53cb326..450dd9c3395f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1376,55 +1376,54 @@ enum page_references {
 	PAGEREF_ACTIVATE,
 };
 
-static enum page_references page_check_references(struct page *page,
+static enum page_references folio_check_references(struct folio *folio,
 						  struct scan_control *sc)
 {
-	struct folio *folio = page_folio(page);
-	int referenced_ptes, referenced_page;
+	int referenced_ptes, referenced_folio;
 	unsigned long vm_flags;
 
 	referenced_ptes = folio_referenced(folio, 1, sc->target_mem_cgroup,
 					   &vm_flags);
-	referenced_page = TestClearPageReferenced(page);
+	referenced_folio = folio_test_clear_referenced(folio);
 
 	/*
 	 * Mlock lost the isolation race with us.  Let try_to_unmap()
-	 * move the page to the unevictable list.
+	 * move the folio to the unevictable list.
 	 */
 	if (vm_flags & VM_LOCKED)
 		return PAGEREF_RECLAIM;
 
 	if (referenced_ptes) {
 		/*
-		 * All mapped pages start out with page table
+		 * All mapped folios start out with page table
 		 * references from the instantiating fault, so we need
-		 * to look twice if a mapped file page is used more
+		 * to look twice if a mapped file folio is used more
 		 * than once.
 		 *
 		 * Mark it and spare it for another trip around the
 		 * inactive list.  Another page table reference will
 		 * lead to its activation.
 		 *
-		 * Note: the mark is set for activated pages as well
-		 * so that recently deactivated but used pages are
+		 * Note: the mark is set for activated folios as well
+		 * so that recently deactivated but used folios are
 		 * quickly recovered.
 		 */
-		SetPageReferenced(page);
+		folio_set_referenced(folio);
 
-		if (referenced_page || referenced_ptes > 1)
+		if (referenced_folio || referenced_ptes > 1)
 			return PAGEREF_ACTIVATE;
 
 		/*
-		 * Activate file-backed executable pages after first usage.
+		 * Activate file-backed executable folios after first usage.
 		 */
-		if ((vm_flags & VM_EXEC) && !PageSwapBacked(page))
+		if ((vm_flags & VM_EXEC) && !folio_test_swapbacked(folio))
 			return PAGEREF_ACTIVATE;
 
 		return PAGEREF_KEEP;
 	}
 
-	/* Reclaim if clean, defer dirty pages to writeback */
-	if (referenced_page && !PageSwapBacked(page))
+	/* Reclaim if clean, defer dirty folios to writeback */
+	if (referenced_folio && !folio_test_swapbacked(folio))
 		return PAGEREF_RECLAIM_CLEAN;
 
 	return PAGEREF_RECLAIM;
@@ -1664,7 +1663,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		}
 
 		if (!ignore_references)
-			references = page_check_references(page, sc);
+			references = folio_check_references(folio, sc);
 
 		switch (references) {
 		case PAGEREF_ACTIVATE:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 65/75] mm/vmscan: Convert pageout() to take a folio
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (63 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 64/75] mm/vmscan: Turn page_check_references() into folio_check_references() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 66/75] mm: Turn can_split_huge_page() into can_split_folio() Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We always write out an entire folio at once.  This conversion removes
a few calls to compound_head() and gets the NR_VMSCAN_WRITE statistic
right when writing out a large folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/trace/events/vmscan.h | 10 +++---
 mm/vmscan.c                   | 64 +++++++++++++++++------------------
 2 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index ca2e9009a651..de136dbd623a 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -327,11 +327,11 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
 		__print_symbolic(__entry->lru, LRU_NAMES))
 );
 
-TRACE_EVENT(mm_vmscan_writepage,
+TRACE_EVENT(mm_vmscan_write_folio,
 
-	TP_PROTO(struct page *page),
+	TP_PROTO(struct folio *folio),
 
-	TP_ARGS(page),
+	TP_ARGS(folio),
 
 	TP_STRUCT__entry(
 		__field(unsigned long, pfn)
@@ -339,9 +339,9 @@ TRACE_EVENT(mm_vmscan_writepage,
 	),
 
 	TP_fast_assign(
-		__entry->pfn = page_to_pfn(page);
+		__entry->pfn = folio_pfn(folio);
 		__entry->reclaim_flags = trace_reclaim_flags(
-						page_is_file_lru(page));
+						folio_is_file_lru(folio));
 	),
 
 	TP_printk("page=%p pfn=0x%lx flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 450dd9c3395f..efe041c2859d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -978,15 +978,15 @@ void drop_slab(void)
 		drop_slab_node(nid);
 }
 
-static inline int is_page_cache_freeable(struct page *page)
+static inline int is_page_cache_freeable(struct folio *folio)
 {
 	/*
 	 * A freeable page cache page is referenced only by the caller
 	 * that isolated the page, the page cache and optional buffer
 	 * heads at page->private.
 	 */
-	int page_cache_pins = thp_nr_pages(page);
-	return page_count(page) - page_has_private(page) == 1 + page_cache_pins;
+	return folio_ref_count(folio) - folio_test_private(folio) ==
+		1 + folio_nr_pages(folio);
 }
 
 static int may_write_to_inode(struct inode *inode)
@@ -1001,24 +1001,24 @@ static int may_write_to_inode(struct inode *inode)
 }
 
 /*
- * We detected a synchronous write error writing a page out.  Probably
+ * We detected a synchronous write error writing a folio out.  Probably
  * -ENOSPC.  We need to propagate that into the address_space for a subsequent
  * fsync(), msync() or close().
  *
  * The tricky part is that after writepage we cannot touch the mapping: nothing
- * prevents it from being freed up.  But we have a ref on the page and once
- * that page is locked, the mapping is pinned.
+ * prevents it from being freed up.  But we have a ref on the folio and once
+ * that folio is locked, the mapping is pinned.
  *
- * We're allowed to run sleeping lock_page() here because we know the caller has
+ * We're allowed to run sleeping folio_lock() here because we know the caller has
  * __GFP_FS.
  */
 static void handle_write_error(struct address_space *mapping,
-				struct page *page, int error)
+				struct folio *folio, int error)
 {
-	lock_page(page);
-	if (page_mapping(page) == mapping)
+	folio_lock(folio);
+	if (folio_mapping(folio) == mapping)
 		mapping_set_error(mapping, error);
-	unlock_page(page);
+	folio_unlock(folio);
 }
 
 static bool skip_throttle_noprogress(pg_data_t *pgdat)
@@ -1163,35 +1163,35 @@ typedef enum {
  * pageout is called by shrink_page_list() for each dirty page.
  * Calls ->writepage().
  */
-static pageout_t pageout(struct page *page, struct address_space *mapping)
+static pageout_t pageout(struct folio *folio, struct address_space *mapping)
 {
 	/*
-	 * If the page is dirty, only perform writeback if that write
+	 * If the folio is dirty, only perform writeback if that write
 	 * will be non-blocking.  To prevent this allocation from being
 	 * stalled by pagecache activity.  But note that there may be
 	 * stalls if we need to run get_block().  We could test
 	 * PagePrivate for that.
 	 *
 	 * If this process is currently in __generic_file_write_iter() against
-	 * this page's queue, we can perform writeback even if that
+	 * this folio's queue, we can perform writeback even if that
 	 * will block.
 	 *
-	 * If the page is swapcache, write it back even if that would
+	 * If the folio is swapcache, write it back even if that would
 	 * block, for some throttling. This happens by accident, because
 	 * swap_backing_dev_info is bust: it doesn't reflect the
 	 * congestion state of the swapdevs.  Easy to fix, if needed.
 	 */
-	if (!is_page_cache_freeable(page))
+	if (!is_page_cache_freeable(folio))
 		return PAGE_KEEP;
 	if (!mapping) {
 		/*
-		 * Some data journaling orphaned pages can have
-		 * page->mapping == NULL while being dirty with clean buffers.
+		 * Some data journaling orphaned folios can have
+		 * folio->mapping == NULL while being dirty with clean buffers.
 		 */
-		if (page_has_private(page)) {
-			if (try_to_free_buffers(page)) {
-				ClearPageDirty(page);
-				pr_info("%s: orphaned page\n", __func__);
+		if (folio_test_private(folio)) {
+			if (try_to_free_buffers(&folio->page)) {
+				folio_clear_dirty(folio);
+				pr_info("%s: orphaned folio\n", __func__);
 				return PAGE_CLEAN;
 			}
 		}
@@ -1202,7 +1202,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping)
 	if (!may_write_to_inode(mapping->host))
 		return PAGE_KEEP;
 
-	if (clear_page_dirty_for_io(page)) {
+	if (folio_clear_dirty_for_io(folio)) {
 		int res;
 		struct writeback_control wbc = {
 			.sync_mode = WB_SYNC_NONE,
@@ -1212,21 +1212,21 @@ static pageout_t pageout(struct page *page, struct address_space *mapping)
 			.for_reclaim = 1,
 		};
 
-		SetPageReclaim(page);
-		res = mapping->a_ops->writepage(page, &wbc);
+		folio_set_reclaim(folio);
+		res = mapping->a_ops->writepage(&folio->page, &wbc);
 		if (res < 0)
-			handle_write_error(mapping, page, res);
+			handle_write_error(mapping, folio, res);
 		if (res == AOP_WRITEPAGE_ACTIVATE) {
-			ClearPageReclaim(page);
+			folio_clear_reclaim(folio);
 			return PAGE_ACTIVATE;
 		}
 
-		if (!PageWriteback(page)) {
+		if (!folio_test_writeback(folio)) {
 			/* synchronous write or broken a_ops? */
-			ClearPageReclaim(page);
+			folio_clear_reclaim(folio);
 		}
-		trace_mm_vmscan_writepage(page);
-		inc_node_page_state(page, NR_VMSCAN_WRITE);
+		trace_mm_vmscan_write_folio(folio);
+		node_stat_add_folio(folio, NR_VMSCAN_WRITE);
 		return PAGE_SUCCESS;
 	}
 
@@ -1809,7 +1809,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			 * starts and then write it out here.
 			 */
 			try_to_unmap_flush_dirty();
-			switch (pageout(page, mapping)) {
+			switch (pageout(folio, mapping)) {
 			case PAGE_KEEP:
 				goto keep_locked;
 			case PAGE_ACTIVATE:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 66/75] mm: Turn can_split_huge_page() into can_split_folio()
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (64 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 65/75] mm/vmscan: Convert pageout() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 67/75] mm/filemap: Allow large folios to be added to the page cache Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This function already required a head page to be passed, so this
just adds type-safety and removes a few implicit calls to
compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/huge_mm.h |  4 ++--
 mm/huge_memory.c        | 15 ++++++++-------
 mm/vmscan.c             |  6 +++---
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4368b314d9c8..e0348bca3d66 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -185,7 +185,7 @@ void prep_transhuge_page(struct page *page);
 void free_transhuge_page(struct page *page);
 bool is_transparent_hugepage(struct page *page);
 
-bool can_split_huge_page(struct page *page, int *pextra_pins);
+bool can_split_folio(struct folio *folio, int *pextra_pins);
 int split_huge_page_to_list(struct page *page, struct list_head *list);
 static inline int split_huge_page(struct page *page)
 {
@@ -387,7 +387,7 @@ static inline bool is_transparent_hugepage(struct page *page)
 #define thp_get_unmapped_area	NULL
 
 static inline bool
-can_split_huge_page(struct page *page, int *pextra_pins)
+can_split_folio(struct folio *folio, int *pextra_pins)
 {
 	BUILD_BUG();
 	return false;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f711dabc9c62..a80d0408ebf4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2545,18 +2545,19 @@ int page_trans_huge_mapcount(struct page *page)
 }
 
 /* Racy check whether the huge page can be split */
-bool can_split_huge_page(struct page *page, int *pextra_pins)
+bool can_split_folio(struct folio *folio, int *pextra_pins)
 {
 	int extra_pins;
 
 	/* Additional pins from page cache */
-	if (PageAnon(page))
-		extra_pins = PageSwapCache(page) ? thp_nr_pages(page) : 0;
+	if (folio_test_anon(folio))
+		extra_pins = folio_test_swapcache(folio) ?
+				folio_nr_pages(folio) : 0;
 	else
-		extra_pins = thp_nr_pages(page);
+		extra_pins = folio_nr_pages(folio);
 	if (pextra_pins)
 		*pextra_pins = extra_pins;
-	return total_mapcount(page) == page_count(page) - extra_pins - 1;
+	return folio_mapcount(folio) == folio_ref_count(folio) - extra_pins - 1;
 }
 
 /*
@@ -2648,7 +2649,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	 * Racy check if we can split the page, before unmap_page() will
 	 * split PMDs
 	 */
-	if (!can_split_huge_page(head, &extra_pins)) {
+	if (!can_split_folio(folio, &extra_pins)) {
 		ret = -EBUSY;
 		goto out_unlock;
 	}
@@ -2957,7 +2958,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 			goto next;
 
 		total++;
-		if (!can_split_huge_page(compound_head(page), NULL))
+		if (!can_split_folio(page_folio(page), NULL))
 			goto next;
 
 		if (!trylock_page(page))
diff --git a/mm/vmscan.c b/mm/vmscan.c
index efe041c2859d..6d2e4da77392 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1696,18 +1696,18 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			if (!PageSwapCache(page)) {
 				if (!(sc->gfp_mask & __GFP_IO))
 					goto keep_locked;
-				if (page_maybe_dma_pinned(page))
+				if (folio_maybe_dma_pinned(folio))
 					goto keep_locked;
 				if (PageTransHuge(page)) {
 					/* cannot split THP, skip it */
-					if (!can_split_huge_page(page, NULL))
+					if (!can_split_folio(folio, NULL))
 						goto activate_locked;
 					/*
 					 * Split pages without a PMD map right
 					 * away. Chances are some or all of the
 					 * tail pages can be freed without IO.
 					 */
-					if (!compound_mapcount(page) &&
+					if (!folio_entire_mapcount(folio) &&
 					    split_folio_to_list(folio,
 								page_list))
 						goto activate_locked;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 67/75] mm/filemap: Allow large folios to be added to the page cache
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (65 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 66/75] mm: Turn can_split_huge_page() into can_split_folio() Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 68/75] mm: Fix READ_ONLY_THP warning Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We return -EEXIST if there are any non-shadow entries in the page
cache in the range covered by the folio.  If there are multiple
shadow entries in the range, we set *shadowp to one of them (currently
the one at the highest index).  If that turns out to be the wrong
answer, we can implement something more complex.  This is mostly
modelled after the equivalent function in the shmem code.
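
The conflict handling described above boils down to roughly the
following sketch (simplified, not the exact hunk from the diff below;
locking and error unwinding are omitted):

	void *entry, *old = NULL;

	xas_for_each_conflict(&xas, entry) {
		old = entry;
		if (!xa_is_value(entry)) {
			/* A real page already occupies part of the range. */
			xas_set_err(&xas, -EEXIST);
			goto unlock;
		}
	}
	/* Only shadow entries were found; report the highest-index one. */
	if (old && shadowp)
		*shadowp = old;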

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 39 ++++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index ad8c39d90bf9..8f7ac3de9098 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -842,26 +842,27 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 {
 	XA_STATE(xas, &mapping->i_pages, index);
 	int huge = folio_test_hugetlb(folio);
-	int error;
 	bool charged = false;
+	long nr = 1;
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
 	mapping_set_update(&xas, mapping);
 
-	folio_get(folio);
-	folio->mapping = mapping;
-	folio->index = index;
-
 	if (!huge) {
-		error = mem_cgroup_charge(folio, NULL, gfp);
+		int error = mem_cgroup_charge(folio, NULL, gfp);
 		VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
 		if (error)
-			goto error;
+			return error;
 		charged = true;
+		xas_set_order(&xas, index, folio_order(folio));
+		nr = folio_nr_pages(folio);
 	}
 
 	gfp &= GFP_RECLAIM_MASK;
+	folio_ref_add(folio, nr);
+	folio->mapping = mapping;
+	folio->index = xas.xa_index;
 
 	do {
 		unsigned int order = xa_get_order(xas.xa, xas.xa_index);
@@ -885,6 +886,8 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 			/* entry may have been split before we acquired lock */
 			order = xa_get_order(xas.xa, xas.xa_index);
 			if (order > folio_order(folio)) {
+				/* How to handle large swap entries? */
+				BUG_ON(shmem_mapping(mapping));
 				xas_split(&xas, old, order);
 				xas_reset(&xas);
 			}
@@ -894,29 +897,31 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 		if (xas_error(&xas))
 			goto unlock;
 
-		mapping->nrpages++;
+		mapping->nrpages += nr;
 
 		/* hugetlb pages do not participate in page cache accounting */
-		if (!huge)
-			__lruvec_stat_add_folio(folio, NR_FILE_PAGES);
+		if (!huge) {
+			__lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr);
+			if (folio_test_pmd_mappable(folio))
+				__lruvec_stat_mod_folio(folio,
+						NR_FILE_THPS, nr);
+		}
 unlock:
 		xas_unlock_irq(&xas);
 	} while (xas_nomem(&xas, gfp));
 
-	if (xas_error(&xas)) {
-		error = xas_error(&xas);
-		if (charged)
-			mem_cgroup_uncharge(folio);
+	if (xas_error(&xas))
 		goto error;
-	}
 
 	trace_mm_filemap_add_to_page_cache(folio);
 	return 0;
 error:
+	if (charged)
+		mem_cgroup_uncharge(folio);
 	folio->mapping = NULL;
 	/* Leave page->index set: truncation relies upon it */
-	folio_put(folio);
-	return error;
+	folio_put_refs(folio, nr);
+	return xas_error(&xas);
 }
 ALLOW_ERROR_INJECTION(__filemap_add_folio, ERRNO);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 68/75] mm: Fix READ_ONLY_THP warning
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (66 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 67/75] mm/filemap: Allow large folios to be added to the page cache Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 69/75] mm: Make large folios depend on THP Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These counters only exist if CONFIG_READ_ONLY_THP_FOR_FS is defined,
but we do not need to warn if the filesystem natively supports large
folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index dddd660da24f..39115f75962c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -212,7 +212,7 @@ static inline void filemap_nr_thps_inc(struct address_space *mapping)
 	if (!mapping_large_folio_support(mapping))
 		atomic_inc(&mapping->nr_thps);
 #else
-	WARN_ON_ONCE(1);
+	WARN_ON_ONCE(mapping_large_folio_support(mapping) == 0);
 #endif
 }
 
@@ -222,7 +222,7 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
 	if (!mapping_large_folio_support(mapping))
 		atomic_dec(&mapping->nr_thps);
 #else
-	WARN_ON_ONCE(1);
+	WARN_ON_ONCE(mapping_large_folio_support(mapping) == 0);
 #endif
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 69/75] mm: Make large folios depend on THP
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (67 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 68/75] mm: Fix READ_ONLY_THP warning Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 70/75] mm: Support arbitrary THP sizes Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Some parts of the VM still depend on THP to handle large folios
correctly.  Until those are fixed, prevent creating large folios
if THP is disabled.
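
For context, a filesystem opts in to large folios by calling
mapping_set_large_folios() on an inode's mapping; a hypothetical sketch
(the filesystem and function names are made up):

	static void examplefs_setup_inode(struct inode *inode)
	{
		/* ... the usual inode initialisation ... */
		mapping_set_large_folios(inode->i_mapping);
	}

With this patch applied, mapping_large_folio_support() still returns
false on !CONFIG_TRANSPARENT_HUGEPAGE builds, so such an opt-in has no
effect there.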

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 39115f75962c..ccf02a7d4d65 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -192,9 +192,14 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
 	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
 }
 
+/*
+ * Large folio support currently depends on THP.  These dependencies are
+ * being worked on but are not yet fixed.
+ */
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
-	return test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
+		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 70/75] mm: Support arbitrary THP sizes
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (68 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 69/75] mm: Make large folios depend on THP Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 71/75] mm/readahead: Add large folio readahead Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

For code which has not yet been converted from THP to folios, use the
compound size of the page instead of assuming PTE or PMD size.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/huge_mm.h | 47 -----------------------------------------
 include/linux/mm.h      | 31 +++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 47 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e0348bca3d66..0734aff8fa19 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -250,30 +250,6 @@ static inline spinlock_t *pud_trans_huge_lock(pud_t *pud,
 		return NULL;
 }
 
-/**
- * thp_order - Order of a transparent huge page.
- * @page: Head page of a transparent huge page.
- */
-static inline unsigned int thp_order(struct page *page)
-{
-	VM_BUG_ON_PGFLAGS(PageTail(page), page);
-	if (PageHead(page))
-		return HPAGE_PMD_ORDER;
-	return 0;
-}
-
-/**
- * thp_nr_pages - The number of regular pages in this huge page.
- * @page: The head page of a huge page.
- */
-static inline int thp_nr_pages(struct page *page)
-{
-	VM_BUG_ON_PGFLAGS(PageTail(page), page);
-	if (PageHead(page))
-		return HPAGE_PMD_NR;
-	return 1;
-}
-
 /**
  * folio_test_pmd_mappable - Can we map this folio with a PMD?
  * @folio: The folio to test
@@ -336,18 +312,6 @@ static inline struct list_head *page_deferred_list(struct page *page)
 #define HPAGE_PUD_MASK ({ BUILD_BUG(); 0; })
 #define HPAGE_PUD_SIZE ({ BUILD_BUG(); 0; })
 
-static inline unsigned int thp_order(struct page *page)
-{
-	VM_BUG_ON_PGFLAGS(PageTail(page), page);
-	return 0;
-}
-
-static inline int thp_nr_pages(struct page *page)
-{
-	VM_BUG_ON_PGFLAGS(PageTail(page), page);
-	return 1;
-}
-
 static inline bool folio_test_pmd_mappable(struct folio *folio)
 {
 	return false;
@@ -489,15 +453,4 @@ static inline int split_folio_to_list(struct folio *folio,
 	return split_huge_page_to_list(&folio->page, list);
 }
 
-/**
- * thp_size - Size of a transparent huge page.
- * @page: Head page of a transparent huge page.
- *
- * Return: Number of bytes in this page.
- */
-static inline unsigned long thp_size(struct page *page)
-{
-	return PAGE_SIZE << thp_order(page);
-}
-
 #endif /* _LINUX_HUGE_MM_H */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 74d9cda7cfd6..0c2a0f4bda1b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -945,6 +945,37 @@ static inline unsigned int page_shift(struct page *page)
 	return PAGE_SHIFT + compound_order(page);
 }
 
+/**
+ * thp_order - Order of a transparent huge page.
+ * @page: Head page of a transparent huge page.
+ */
+static inline unsigned int thp_order(struct page *page)
+{
+	VM_BUG_ON_PGFLAGS(PageTail(page), page);
+	return compound_order(page);
+}
+
+/**
+ * thp_nr_pages - The number of regular pages in this huge page.
+ * @page: The head page of a huge page.
+ */
+static inline int thp_nr_pages(struct page *page)
+{
+	VM_BUG_ON_PGFLAGS(PageTail(page), page);
+	return compound_nr(page);
+}
+
+/**
+ * thp_size - Size of a transparent huge page.
+ * @page: Head page of a transparent huge page.
+ *
+ * Return: Number of bytes in this page.
+ */
+static inline unsigned long thp_size(struct page *page)
+{
+	return PAGE_SIZE << thp_order(page);
+}
+
 void free_compound_page(struct page *page);
 
 #ifdef CONFIG_MMU
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 71/75] mm/readahead: Add large folio readahead
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (69 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 70/75] mm: Support arbitrary THP sizes Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-06 13:10   ` Mark Hemment
  2022-02-04 19:58 ` [PATCH 72/75] mm/readahead: Align file mappings for non-DAX Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  76 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Allocate large folios in the readahead code when the filesystem supports
them and it seems worth doing.  The heuristic for choosing which folio
sizes to use will surely need some tuning, but this aggressive ramp-up
has been good for testing.
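
To make the ramp-up concrete (numbers assume 4kB base pages, x86-64's
HPAGE_PMD_ORDER of 9, and a readahead window large enough at each step),
each readahead uses an order two larger than the folio that triggered
it, capped at MAX_PAGECACHE_ORDER:

	order of triggering folio    order allocated next
	0 (4kB)                      2 (16kB)
	2 (16kB)                     4 (64kB)
	4 (64kB)                     6 (256kB)
	6 (256kB)                    8 (1MB)
	8 (1MB)                      9 (2MB, capped)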

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/readahead.c | 106 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 99 insertions(+), 7 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index cf0dcf89eb69..5100eaf5b0ee 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -148,7 +148,7 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages,
 
 	blk_finish_plug(&plug);
 
-	BUG_ON(!list_empty(pages));
+	BUG_ON(pages && !list_empty(pages));
 	BUG_ON(readahead_count(rac));
 
 out:
@@ -431,11 +431,103 @@ static int try_context_readahead(struct address_space *mapping,
 	return 1;
 }
 
+/*
+ * There are some parts of the kernel which assume that PMD entries
+ * are exactly HPAGE_PMD_ORDER.  Those should be fixed, but until then,
+ * limit the maximum allocation order to PMD size.  I'm not aware of any
+ * assumptions about maximum order if THP are disabled, but 8 seems like
+ * a good order (that's 1MB if you're using 4kB pages)
+ */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
+#else
+#define MAX_PAGECACHE_ORDER	8
+#endif
+
+static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
+		pgoff_t mark, unsigned int order, gfp_t gfp)
+{
+	int err;
+	struct folio *folio = filemap_alloc_folio(gfp, order);
+
+	if (!folio)
+		return -ENOMEM;
+	if (mark - index < (1UL << order))
+		folio_set_readahead(folio);
+	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
+	if (err)
+		folio_put(folio);
+	else
+		ractl->_nr_pages += 1UL << order;
+	return err;
+}
+
+static void page_cache_ra_order(struct readahead_control *ractl,
+		struct file_ra_state *ra, unsigned int new_order)
+{
+	struct address_space *mapping = ractl->mapping;
+	pgoff_t index = readahead_index(ractl);
+	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
+	pgoff_t mark = index + ra->size - ra->async_size;
+	int err = 0;
+	gfp_t gfp = readahead_gfp_mask(mapping);
+
+	if (!mapping_large_folio_support(mapping) || ra->size < 4)
+		goto fallback;
+
+	limit = min(limit, index + ra->size - 1);
+
+	if (new_order < MAX_PAGECACHE_ORDER) {
+		new_order += 2;
+		if (new_order > MAX_PAGECACHE_ORDER)
+			new_order = MAX_PAGECACHE_ORDER;
+		while ((1 << new_order) > ra->size)
+			new_order--;
+	}
+
+	while (index <= limit) {
+		unsigned int order = new_order;
+
+		/* Align with smaller pages if needed */
+		if (index & ((1UL << order) - 1)) {
+			order = __ffs(index);
+			if (order == 1)
+				order = 0;
+		}
+		/* Don't allocate pages past EOF */
+		while (index + (1UL << order) - 1 > limit) {
+			if (--order == 1)
+				order = 0;
+		}
+		err = ra_alloc_folio(ractl, index, mark, order, gfp);
+		if (err)
+			break;
+		index += 1UL << order;
+	}
+
+	if (index > limit) {
+		ra->size += index - limit - 1;
+		ra->async_size += index - limit - 1;
+	}
+
+	read_pages(ractl, NULL, false);
+
+	/*
+	 * If there were already pages in the page cache, then we may have
+	 * left some gaps.  Let the regular readahead code take care of this
+	 * situation.
+	 */
+	if (!err)
+		return;
+fallback:
+	do_page_cache_ra(ractl, ra->size, ra->async_size);
+}
+
 /*
  * A minimal readahead algorithm for trivial sequential/random reads.
  */
 static void ondemand_readahead(struct readahead_control *ractl,
-		bool hit_readahead_marker, unsigned long req_size)
+		struct folio *folio, unsigned long req_size)
 {
 	struct backing_dev_info *bdi = inode_to_bdi(ractl->mapping->host);
 	struct file_ra_state *ra = ractl->ra;
@@ -470,12 +562,12 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	}
 
 	/*
-	 * Hit a marked page without valid readahead state.
+	 * Hit a marked folio without valid readahead state.
 	 * E.g. interleaved reads.
 	 * Query the pagecache for async_size, which normally equals to
 	 * readahead size. Ramp it up and use it as the new readahead size.
 	 */
-	if (hit_readahead_marker) {
+	if (folio) {
 		pgoff_t start;
 
 		rcu_read_lock();
@@ -548,7 +640,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	}
 
 	ractl->_index = ra->start;
-	do_page_cache_ra(ractl, ra->size, ra->async_size);
+	page_cache_ra_order(ractl, ra, folio ? folio_order(folio) : 0);
 }
 
 void page_cache_sync_ra(struct readahead_control *ractl,
@@ -576,7 +668,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 	}
 
 	/* do read-ahead */
-	ondemand_readahead(ractl, false, req_count);
+	ondemand_readahead(ractl, NULL, req_count);
 }
 EXPORT_SYMBOL_GPL(page_cache_sync_ra);
 
@@ -605,7 +697,7 @@ void page_cache_async_ra(struct readahead_control *ractl,
 		return;
 
 	/* do read-ahead */
-	ondemand_readahead(ractl, true, req_count);
+	ondemand_readahead(ractl, folio, req_count);
 }
 EXPORT_SYMBOL_GPL(page_cache_async_ra);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 72/75] mm/readahead: Align file mappings for non-DAX
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (70 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 71/75] mm/readahead: Add large folio readahead Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 73/75] mm/readahead: Switch to page_cache_ra_order Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: William Kucharski, linux-kernel, Matthew Wilcox

From: William Kucharski <william.kucharski@oracle.com>

When we have the opportunity to use PMDs to map a file, we want to follow
the same rules as DAX.

Signed-off-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/huge_memory.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a80d0408ebf4..dd3e14700220 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -582,13 +582,10 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 	unsigned long ret;
 	loff_t off = (loff_t)pgoff << PAGE_SHIFT;
 
-	if (!IS_DAX(filp->f_mapping->host) || !IS_ENABLED(CONFIG_FS_DAX_PMD))
-		goto out;
-
 	ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE);
 	if (ret)
 		return ret;
-out:
+
 	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
 }
 EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 73/75] mm/readahead: Switch to page_cache_ra_order
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (71 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 72/75] mm/readahead: Align file mappings for non-DAX Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 74/75] mm/filemap: Support VM_HUGEPAGE for file mappings Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

do_page_cache_ra() was being exposed for the benefit of
do_sync_mmap_readahead().  Switch it over to page_cache_ra_order()
partly because it's a better interface but mostly for the benefit of
the next patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c   | 2 +-
 mm/internal.h  | 4 ++--
 mm/readahead.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 8f7ac3de9098..fe764225ae99 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3027,7 +3027,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	ra->size = ra->ra_pages;
 	ra->async_size = ra->ra_pages / 4;
 	ractl._index = ra->start;
-	do_page_cache_ra(&ractl, ra->size, ra->async_size);
+	page_cache_ra_order(&ractl, ra, 0);
 	return fpin;
 }
 
diff --git a/mm/internal.h b/mm/internal.h
index 360256e4ee06..4b401313b9f2 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -83,8 +83,8 @@ void unmap_page_range(struct mmu_gather *tlb,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details);
 
-void do_page_cache_ra(struct readahead_control *, unsigned long nr_to_read,
-		unsigned long lookahead_size);
+void page_cache_ra_order(struct readahead_control *, struct file_ra_state *,
+		unsigned int order);
 void force_page_cache_ra(struct readahead_control *, unsigned long nr);
 static inline void force_page_cache_readahead(struct address_space *mapping,
 		struct file *file, pgoff_t index, unsigned long nr_to_read)
diff --git a/mm/readahead.c b/mm/readahead.c
index 5100eaf5b0ee..a20391d6a71b 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -247,7 +247,7 @@ EXPORT_SYMBOL_GPL(page_cache_ra_unbounded);
  * behaviour which would occur if page allocations are causing VM writeback.
  * We really don't want to intermingle reads and writes like that.
  */
-void do_page_cache_ra(struct readahead_control *ractl,
+static void do_page_cache_ra(struct readahead_control *ractl,
 		unsigned long nr_to_read, unsigned long lookahead_size)
 {
 	struct inode *inode = ractl->mapping->host;
@@ -462,7 +462,7 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 	return err;
 }
 
-static void page_cache_ra_order(struct readahead_control *ractl,
+void page_cache_ra_order(struct readahead_control *ractl,
 		struct file_ra_state *ra, unsigned int new_order)
 {
 	struct address_space *mapping = ractl->mapping;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 74/75] mm/filemap: Support VM_HUGEPAGE for file mappings
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (72 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 73/75] mm/readahead: Switch to page_cache_ra_order Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-04 19:58 ` [PATCH 75/75] selftests/vm/transhuge-stress: Support file-backed PMD folios Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

If the VM_HUGEPAGE flag is set, attempt to allocate PMD-sized folios
during readahead, even if we have no history of readahead being
successful.
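
For reference, userspace sets the flag with madvise(MADV_HUGEPAGE); a
minimal sketch of a mapping that would take this new path (file path and
length are illustrative, error handling omitted):

	#include <fcntl.h>
	#include <stdlib.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		size_t len = 64UL << 20;	/* 64MB, illustrative */
		int fd = open("/mnt/test/file", O_RDONLY);
		char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

		madvise(p, len, MADV_HUGEPAGE);	/* sets VM_HUGEPAGE on the VMA */
		/* Faults now go through the new branch in
		 * do_sync_mmap_readahead() and try PMD-sized folios. */
		for (size_t i = 0; i < len; i += 4096)
			(void)p[i];
		close(fd);
		return 0;
	}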

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index fe764225ae99..7608ee030662 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2995,6 +2995,24 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	struct file *fpin = NULL;
 	unsigned int mmap_miss;
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	/* Use the readahead code, even if readahead is disabled */
+	if (vmf->vma->vm_flags & VM_HUGEPAGE) {
+		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
+		ractl._index &= ~((unsigned long)HPAGE_PMD_NR - 1);
+		ra->size = HPAGE_PMD_NR;
+		/*
+		 * Fetch two PMD folios, so we get the chance to actually
+		 * readahead, unless we've been told not to.
+		 */
+		if (!(vmf->vma->vm_flags & VM_RAND_READ))
+			ra->size *= 2;
+		ra->async_size = HPAGE_PMD_NR;
+		page_cache_ra_order(&ractl, ra, HPAGE_PMD_ORDER);
+		return fpin;
+	}
+#endif
+
 	/* If we don't want any read-ahead, don't bother */
 	if (vmf->vma->vm_flags & VM_RAND_READ)
 		return fpin;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH 75/75] selftests/vm/transhuge-stress: Support file-backed PMD folios
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (73 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 74/75] mm/filemap: Support VM_HUGEPAGE for file mappings Matthew Wilcox (Oracle)
@ 2022-02-04 19:58 ` Matthew Wilcox (Oracle)
  2022-02-13 22:31 ` [PATCH 00/75] MM folio patches for 5.18 John Hubbard
  2022-02-14  4:33 ` Matthew Wilcox
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-04 19:58 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add a -f <filename> option to test PMD folios on files
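
For example (path is illustrative): "./transhuge-stress -f /mnt/test/thpfile 256"
exercises 256MiB of file-backed mappings (madvised MADV_HUGEPAGE) instead of
anonymous memory.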

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 tools/testing/selftests/vm/transhuge-stress.c | 35 +++++++++++++------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/vm/transhuge-stress.c b/tools/testing/selftests/vm/transhuge-stress.c
index 5e4c036f6ad3..a03cb3fce1f6 100644
--- a/tools/testing/selftests/vm/transhuge-stress.c
+++ b/tools/testing/selftests/vm/transhuge-stress.c
@@ -26,15 +26,17 @@
 #define PAGEMAP_PFN(ent)	((ent) & ((1ull << 55) - 1))
 
 int pagemap_fd;
+int backing_fd = -1;
+int mmap_flags = MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE;
+#define PROT_RW (PROT_READ | PROT_WRITE)
 
 int64_t allocate_transhuge(void *ptr)
 {
 	uint64_t ent[2];
 
 	/* drop pmd */
-	if (mmap(ptr, HPAGE_SIZE, PROT_READ | PROT_WRITE,
-				MAP_FIXED | MAP_ANONYMOUS |
-				MAP_NORESERVE | MAP_PRIVATE, -1, 0) != ptr)
+	if (mmap(ptr, HPAGE_SIZE, PROT_RW, MAP_FIXED | mmap_flags,
+		 backing_fd, 0) != ptr)
 		errx(2, "mmap transhuge");
 
 	if (madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE))
@@ -60,6 +62,8 @@ int main(int argc, char **argv)
 	size_t ram, len;
 	void *ptr, *p;
 	struct timespec a, b;
+	int i = 0;
+	char *name = NULL;
 	double s;
 	uint8_t *map;
 	size_t map_len;
@@ -69,13 +73,23 @@ int main(int argc, char **argv)
 		ram = SIZE_MAX / 4;
 	else
 		ram *= sysconf(_SC_PAGESIZE);
+	len = ram;
+
+	while (++i < argc) {
+		if (!strcmp(argv[i], "-h"))
+			errx(1, "usage: %s [size in MiB]", argv[0]);
+		else if (!strcmp(argv[i], "-f"))
+			name = argv[++i];
+		else
+			len = atoll(argv[i]) << 20;
+	}
 
-	if (argc == 1)
-		len = ram;
-	else if (!strcmp(argv[1], "-h"))
-		errx(1, "usage: %s [size in MiB]", argv[0]);
-	else
-		len = atoll(argv[1]) << 20;
+	if (name) {
+		backing_fd = open(name, O_RDWR);
+		if (backing_fd == -1)
+			errx(2, "open %s", name);
+		mmap_flags = MAP_SHARED;
+	}
 
 	warnx("allocate %zd transhuge pages, using %zd MiB virtual memory"
 	      " and %zd MiB of ram", len >> HPAGE_SHIFT, len >> 20,
@@ -86,8 +100,7 @@ int main(int argc, char **argv)
 		err(2, "open pagemap");
 
 	len -= len % HPAGE_SIZE;
-	ptr = mmap(NULL, len + HPAGE_SIZE, PROT_READ | PROT_WRITE,
-			MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
+	ptr = mmap(NULL, len + HPAGE_SIZE, PROT_RW, mmap_flags, backing_fd, 0);
 	if (ptr == MAP_FAILED)
 		err(2, "initial mmap");
 	ptr += HPAGE_SIZE - (uintptr_t)ptr % HPAGE_SIZE;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH 01/75] mm/gup: Increment the page refcount before the pincount
  2022-02-04 19:57 ` [PATCH 01/75] mm/gup: Increment the page refcount before the pincount Matthew Wilcox (Oracle)
@ 2022-02-04 21:13   ` John Hubbard
  2022-02-04 21:28     ` Matthew Wilcox
  2022-02-07  7:45   ` Christoph Hellwig
  1 sibling, 1 reply; 115+ messages in thread
From: John Hubbard @ 2022-02-04 21:13 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm; +Cc: linux-kernel

On 2/4/22 11:57, Matthew Wilcox (Oracle) wrote:
> We should always increase the refcount before doing anything else to
> the page so that other page users see the elevated refcount first.

Absolutely agree in principle. Is there anything else to say, though,
such as why this matters here? Or is the change just being done for
"best practices"? (Which is still a very solid reason, of course.)
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   mm/gup.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index a9d4d724aef7..08020987dfc0 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -220,18 +220,18 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
>   		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
>   			return false;
>   
> -		if (hpage_pincount_available(page))
> -			hpage_pincount_add(page, 1);
> -		else
> -			refs = GUP_PIN_COUNTING_BIAS;
> -
>   		/*
>   		 * Similar to try_grab_compound_head(): even if using the
>   		 * hpage_pincount_add/_sub() routines, be sure to
>   		 * *also* increment the normal page refcount field at least
>   		 * once, so that the page really is pinned.
>   		 */
> -		page_ref_add(page, refs);

A fine point: this hunk removes the last use of "refs", which means that
this patch will lead to an unused variable warning. So I think it would
be best to remove the "int refs = 1;" line in this patch, rather than
waiting until patch #10.

With that change, please feel free to add:

Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
-- 
John Hubbard
NVIDIA

> +		if (hpage_pincount_available(page)) {
> +			page_ref_add(page, 1);
> +			hpage_pincount_add(page, 1);
> +		} else {
> +			page_ref_add(page, GUP_PIN_COUNTING_BIAS);
> +		}
>   
>   		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
>   	}


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 01/75] mm/gup: Increment the page refcount before the pincount
  2022-02-04 21:13   ` John Hubbard
@ 2022-02-04 21:28     ` Matthew Wilcox
  0 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-04 21:28 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 01:13:29PM -0800, John Hubbard wrote:
> On 2/4/22 11:57, Matthew Wilcox (Oracle) wrote:
> > We should always increase the refcount before doing anything else to
> > the page so that other page users see the elevated refcount first.
> 
> Absolutely agree in principle. Is there anything else to say, though,
> such as why this matters here? Or is the change just being done for
> "best practices"? (Which is still a very solid reason, of course.)

I'm not sure if any software examines both refcount and pincount to
determine whether it knows what all of the refcounts to a page mean,
but a human might.  If someone else calls dump_page() after we update
pincount and before we update refcount, we might make a bad decision
while debugging.  When the updates are the other way around, a spurious
extra refcount is to be expected and should not confuse anyone more than
they already are.
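
To spell that out against the new ordering in this patch, the window a
racing observer can hit is now

	page_ref_add(page, 1);		/* extra reference visible first */
	hpage_pincount_add(page, 1);	/* pin becomes visible second */

i.e. at worst a transient extra refcount, rather than a pincount with no
reference behind it.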

> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >   mm/gup.c | 12 ++++++------
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/gup.c b/mm/gup.c
> > index a9d4d724aef7..08020987dfc0 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -220,18 +220,18 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
> >   		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
> >   			return false;
> > -		if (hpage_pincount_available(page))
> > -			hpage_pincount_add(page, 1);
> > -		else
> > -			refs = GUP_PIN_COUNTING_BIAS;
> > -
> >   		/*
> >   		 * Similar to try_grab_compound_head(): even if using the
> >   		 * hpage_pincount_add/_sub() routines, be sure to
> >   		 * *also* increment the normal page refcount field at least
> >   		 * once, so that the page really is pinned.
> >   		 */
> > -		page_ref_add(page, refs);
> 
> A fine point: this hunk removes the last use of "refs", which means that
> this patch will lead to an unused variable warning. So I think it would
> be best to remove the "int refs = 1;" line in this patch, rather than
> waiting until patch #10.

Argh!  I noticed that I needed to do that, and then forgot to do it.
Thanks!

> With that change, please feel free to add:
> 
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>

Added, and git tree updated.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 10/75] mm/gup: Remove hpage_pincount_add()
  2022-02-04 19:57 ` [PATCH 10/75] mm/gup: Remove hpage_pincount_add() Matthew Wilcox (Oracle)
@ 2022-02-04 21:29   ` John Hubbard
  2022-02-07  7:46   ` Christoph Hellwig
  1 sibling, 0 replies; 115+ messages in thread
From: John Hubbard @ 2022-02-04 21:29 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm; +Cc: linux-kernel

On 2/4/22 11:57, Matthew Wilcox (Oracle) wrote:
> It's clearer to call atomic_add() in the callers; the assertions clearly
> can't fire there because they're part of the condition for calling
> atomic_add().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   mm/gup.c | 33 +++++++++++----------------------
>   1 file changed, 11 insertions(+), 22 deletions(-)

Looks nice.

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 923a0d44203c..60168a09d52a 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -29,14 +29,6 @@ struct follow_page_context {
>   	unsigned int page_mask;
>   };
>   
> -static void hpage_pincount_add(struct page *page, int refs)
> -{
> -	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
> -	VM_BUG_ON_PAGE(page != compound_head(page), page);
> -
> -	atomic_add(refs, compound_pincount_ptr(page));
> -}
> -
>   static void hpage_pincount_sub(struct page *page, int refs)
>   {
>   	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
> @@ -151,17 +143,17 @@ __maybe_unused struct page *try_grab_compound_head(struct page *page,
>   			return NULL;
>   
>   		/*
> -		 * When pinning a compound page of order > 1 (which is what
> -		 * hpage_pincount_available() checks for), use an exact count to
> -		 * track it, via hpage_pincount_add/_sub().
> +		 * When pinning a compound page of order > 1 (which is
> +		 * what hpage_pincount_available() checks for), use an
> +		 * exact count to track it.
>   		 *
> -		 * However, be sure to *also* increment the normal page refcount
> -		 * field at least once, so that the page really is pinned.
> -		 * That's why the refcount from the earlier
> +		 * However, be sure to *also* increment the normal page
> +		 * refcount field at least once, so that the page really
> +		 * is pinned.  That's why the refcount from the earlier
>   		 * try_get_compound_head() is left intact.
>   		 */
>   		if (hpage_pincount_available(page))
> -			hpage_pincount_add(page, refs);
> +			atomic_add(refs, compound_pincount_ptr(page));
>   		else
>   			page_ref_add(page, refs * (GUP_PIN_COUNTING_BIAS - 1));
>   
> @@ -216,22 +208,19 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
>   	if (flags & FOLL_GET)
>   		return try_get_page(page);
>   	else if (flags & FOLL_PIN) {
> -		int refs = 1;
> -
>   		page = compound_head(page);
>   
>   		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
>   			return false;
>   
>   		/*
> -		 * Similar to try_grab_compound_head(): even if using the
> -		 * hpage_pincount_add/_sub() routines, be sure to
> -		 * *also* increment the normal page refcount field at least
> -		 * once, so that the page really is pinned.
> +		 * Similar to try_grab_compound_head(): be sure to *also*
> +		 * increment the normal page refcount field at least once,
> +		 * so that the page really is pinned.
>   		 */
>   		if (hpage_pincount_available(page)) {
>   			page_ref_add(page, 1);
> -			hpage_pincount_add(page, 1);
> +			atomic_add(1, compound_pincount_ptr(page));
>   		} else {
>   			page_ref_add(page, GUP_PIN_COUNTING_BIAS);
>   		}



^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio
  2022-02-04 19:57 ` [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-06  2:12   ` John Hubbard
  2022-02-07  7:47   ` Christoph Hellwig
  1 sibling, 0 replies; 115+ messages in thread
From: John Hubbard @ 2022-02-06  2:12 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm; +Cc: linux-kernel

On 2/4/22 11:57, Matthew Wilcox (Oracle) wrote:
> Hoist the folio conversion and the folio_ref_count() check to the
> top of the function instead of using the one buried in try_get_page().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   mm/gup.c | 28 +++++++++++++---------------
>   1 file changed, 13 insertions(+), 15 deletions(-)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 4f1669db92f5..d18ce4da573f 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -174,15 +174,14 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
>   
>   /**
>    * try_grab_page() - elevate a page's refcount by a flag-dependent amount
> + * @page:    pointer to page to be grabbed
> + * @flags:   gup flags: these are the FOLL_* flag values.
>    *
>    * This might not do anything at all, depending on the flags argument.
>    *
>    * "grab" names in this file mean, "look at flags to decide whether to use
>    * FOLL_PIN or FOLL_GET behavior, when incrementing the page's refcount.
>    *
> - * @page:    pointer to page to be grabbed
> - * @flags:   gup flags: these are the FOLL_* flag values.
> - *
>    * Either FOLL_PIN or FOLL_GET (or neither) may be set, but not both at the same
>    * time. Cases: please see the try_grab_folio() documentation, with
>    * "refs=1".
> @@ -193,29 +192,28 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
>    */
>   bool __must_check try_grab_page(struct page *page, unsigned int flags)
>   {
> +	struct folio *folio = page_folio(page);
> +
>   	WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == (FOLL_GET | FOLL_PIN));
> +	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> +		return false;
>   
>   	if (flags & FOLL_GET)
> -		return try_get_page(page);
> +		folio_ref_inc(folio);
>   	else if (flags & FOLL_PIN) {
> -		page = compound_head(page);
> -
> -		if (WARN_ON_ONCE(page_ref_count(page) <= 0))
> -			return false;
> -
>   		/*
> -		 * Similar to try_grab_compound_head(): be sure to *also*
> +		 * Similar to try_grab_folio(): be sure to *also*
>   		 * increment the normal page refcount field at least once,
>   		 * so that the page really is pinned.
>   		 */
> -		if (PageHead(page)) {
> -			page_ref_add(page, 1);
> -			atomic_add(1, compound_pincount_ptr(page));
> +		if (folio_test_large(folio)) {
> +			folio_ref_add(folio, 1);
> +			atomic_add(1, folio_pincount_ptr(folio));
>   		} else {
> -			page_ref_add(page, GUP_PIN_COUNTING_BIAS);
> +			folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
>   		}
>   
> -		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
> +		node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1);
>   	}
>   
>   	return true;



^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 71/75] mm/readahead: Add large folio readahead
  2022-02-04 19:58 ` [PATCH 71/75] mm/readahead: Add large folio readahead Matthew Wilcox (Oracle)
@ 2022-02-06 13:10   ` Mark Hemment
  0 siblings, 0 replies; 115+ messages in thread
From: Mark Hemment @ 2022-02-06 13:10 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, 4 Feb 2022 at 20:00, Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Allocate large folios in the readahead code when the filesystem supports
> them and it seems worth doing.  The heuristic for choosing which folio
> sizes to use will surely need some tuning, but this aggressive ramp-up has been
> good for testing.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/readahead.c | 106 +++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 99 insertions(+), 7 deletions(-)

...
> +static void page_cache_ra_order(struct readahead_control *ractl,
> +               struct file_ra_state *ra, unsigned int new_order)
> +{
> +       struct address_space *mapping = ractl->mapping;
> +       pgoff_t index = readahead_index(ractl);
> +       pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;

Not sure if this can be called for an empty file, but do_page_cache_ra()
has an explicit check for i_size_read() == 0.
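
(Untested, just to illustrate: the equivalent bail-out here would be

	if (i_size_read(mapping->host) == 0)
		return;

near the top of the function.)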

> +       pgoff_t mark = index + ra->size - ra->async_size;
> +       int err = 0;
> +       gfp_t gfp = readahead_gfp_mask(mapping);
> +
> +       if (!mapping_large_folio_support(mapping) || ra->size < 4)
> +               goto fallback;
> +
> +       limit = min(limit, index + ra->size - 1);

Cheers,
Mark


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio
  2022-02-04 19:57 ` [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-06 14:52   ` Mark Hemment
  2022-02-11 20:20     ` Matthew Wilcox
  0 siblings, 1 reply; 115+ messages in thread
From: Mark Hemment @ 2022-02-06 14:52 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-mm, linux-kernel, Christoph Hellwig, John Hubbard,
	Jason Gunthorpe, William Kucharski

On Fri, 4 Feb 2022 at 20:21, Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> We still call try_grab_folio() once per PTE; a future patch could
> optimise to just adjust the reference count for each page within
> the folio.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
> ---
>  mm/gup.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 00227b2cb1cf..44281350db1a 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2252,7 +2252,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>         ptem = ptep = pte_offset_map(&pmd, addr);
>         do {
>                 pte_t pte = ptep_get_lockless(ptep);
> -               struct page *head, *page;
> +               struct page *page;
> +               struct folio *folio;
>
>                 /*
>                  * Similar to the PMD case below, NUMA hinting must take slow
> @@ -2279,22 +2280,20 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>                 VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
>                 page = pte_page(pte);
>
> -               head = try_grab_compound_head(page, 1, flags);
> -               if (!head)
> +               folio = try_grab_folio(page, 1, flags);
> +               if (!folio)
>                         goto pte_unmap;
>
>                 if (unlikely(page_is_secretmem(page))) {
> -                       put_compound_head(head, 1, flags);
> +                       gup_put_folio(folio, 1, flags);
>                         goto pte_unmap;
>                 }
>
>                 if (unlikely(pte_val(pte) != pte_val(*ptep))) {
> -                       put_compound_head(head, 1, flags);
> +                       gup_put_folio(folio, 1, flags);
>                         goto pte_unmap;
>                 }
>
> -               VM_BUG_ON_PAGE(compound_head(page) != head, page);
> -
>                 /*
>                  * We need to make the page accessible if and only if we are
>                  * going to access its content (the FOLL_PIN case).  Please
> @@ -2308,10 +2307,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>                                 goto pte_unmap;
>                         }

Very minor nit: just above this "goto pte_unmap;" is the following code block:

		if (flags & FOLL_PIN) {
			ret = arch_make_page_accessible(page);
			if (ret) {
				unpin_user_page(page);
				goto pte_unmap;
			}
		}

The other paths that goto pte_unmap after a successful try_grab_folio()
call gup_put_folio() rather than unpin_user_page().  No change in
functionality, but I'd suggest calling gup_put_folio() here too, for
consistency.
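
For illustration only (untested), the consistent form would be:

		if (flags & FOLL_PIN) {
			ret = arch_make_page_accessible(page);
			if (ret) {
				gup_put_folio(folio, 1, flags);
				goto pte_unmap;
			}
		}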

Cheers,
Mark


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 41/75] hexagon: Add pmd_pfn()
  2022-02-04 19:58 ` [PATCH 41/75] hexagon: Add pmd_pfn() Matthew Wilcox (Oracle)
@ 2022-02-06 18:13   ` Mike Rapoport
  2022-02-06 20:46     ` Matthew Wilcox
  0 siblings, 1 reply; 115+ messages in thread
From: Mike Rapoport @ 2022-02-06 18:13 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:18PM +0000, Matthew Wilcox (Oracle) wrote:
> I need to use this function in common code, so define it for hexagon.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  arch/hexagon/include/asm/pgtable.h | 3 ++-

Why hexagon out of all architectures?
What about m68k, nios2, nds32 etc?

>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
> index 18cd6ea9ab23..87e96463ccd6 100644
> --- a/arch/hexagon/include/asm/pgtable.h
> +++ b/arch/hexagon/include/asm/pgtable.h
> @@ -235,10 +235,11 @@ static inline int pmd_bad(pmd_t pmd)
>  	return 0;
>  }
>  
> +#define pmd_pfn(pmd)	(pmd_val(pmd) >> PAGE_SHIFT)

I'd put it in include/linux/pgtable.h inside #ifndef pmd_pfn
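
(i.e. something along the lines of

#ifndef pmd_pfn
#define pmd_pfn(pmd)	(pmd_val(pmd) >> PAGE_SHIFT)
#endif

as a generic fallback.)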

>  /*
>   * pmd_page - converts a PMD entry to a page pointer
>   */
> -#define pmd_page(pmd)  (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
> +#define pmd_page(pmd)  (pfn_to_page(pmd_pfn(pmd)))
>  
>  /**
>   * pte_none - check if pte is mapped
> -- 
> 2.34.1
> 
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 41/75] hexagon: Add pmd_pfn()
  2022-02-06 18:13   ` Mike Rapoport
@ 2022-02-06 20:46     ` Matthew Wilcox
  2022-02-06 21:33       ` Mike Rapoport
  0 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-06 20:46 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: linux-mm, linux-kernel

On Sun, Feb 06, 2022 at 08:13:11PM +0200, Mike Rapoport wrote:
> On Fri, Feb 04, 2022 at 07:58:18PM +0000, Matthew Wilcox (Oracle) wrote:
> > I need to use this function in common code, so define it for hexagon.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  arch/hexagon/include/asm/pgtable.h | 3 ++-
> 
> Why hexagon out of all architectures?
> What about m68k, nios2, nds32 etc?

Presumably they don't support CONFIG_TRANSPARENT_HUGEPAGE.
This code isn't compiled when THP are disabled; at least I haven't
had a buildbot complaint for any other architectures.

> > +#define pmd_pfn(pmd)	(pmd_val(pmd) >> PAGE_SHIFT)
> 
> I'd put it in include/linux/pgtable.h inside #ifndef pmd_pfn

That's completely upside down.  pmd_pfn() should be defined by each
architecture (because generic code can't know anything about the format
of PMDs); if anything pmd_page() should be defined in linux/pgtable.h
in terms of pmd_pfn.

I'm not signing up to do that work as part of this series.  That seems
like a distraction.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 41/75] hexagon: Add pmd_pfn()
  2022-02-06 20:46     ` Matthew Wilcox
@ 2022-02-06 21:33       ` Mike Rapoport
  2022-02-06 22:05         ` Matthew Wilcox
  0 siblings, 1 reply; 115+ messages in thread
From: Mike Rapoport @ 2022-02-06 21:33 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linux-kernel

On Sun, Feb 06, 2022 at 08:46:45PM +0000, Matthew Wilcox wrote:
> On Sun, Feb 06, 2022 at 08:13:11PM +0200, Mike Rapoport wrote:
> > On Fri, Feb 04, 2022 at 07:58:18PM +0000, Matthew Wilcox (Oracle) wrote:
> > > I need to use this function in common code, so define it for hexagon.
> > > 
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> > >  arch/hexagon/include/asm/pgtable.h | 3 ++-
> > 
> > Why hexagon out of all architectures?
> > What about m68k, nios2, nds32 etc?

> Presumably they don't support CONFIG_TRANSPARENT_HUGEPAGE.
> This code isn't compiled when THP are disabled; at least I haven't
> had a buildbot complaint for any other architectures.

m68k defconfig fails:

  CC      mm/page_vma_mapped.o
mm/page_vma_mapped.c: In function 'page_vma_mapped_walk':
mm/page_vma_mapped.c:219:20: error: implicit declaration of function 'pmd_pfn'; did you mean 'pmd_off'? [-Werror=implicit-function-declaration]
  219 |     if (!check_pmd(pmd_pfn(pmde), pvmw))
      |                    ^~~~~~~
      |                    pmd_off
> 
> > > +#define pmd_pfn(pmd)	(pmd_val(pmd) >> PAGE_SHIFT)
> > 
> > I'd put it in include/linux/pgtable.h inside #ifndef pmd_pfn
> 
> That's completely upside down.  pmd_pfn() should be defined by each
> architecture (because generic code can't know anything about the format
> of PMDs); if anything pmd_page() should be defined in linux/pgtable.h
> in terms of pmd_pfn.
> 
> I'm not signing up to do that work as part of this series.  That seems
> like a distraction.

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 41/75] hexagon: Add pmd_pfn()
  2022-02-06 21:33       ` Mike Rapoport
@ 2022-02-06 22:05         ` Matthew Wilcox
  2022-02-07 14:24           ` Mike Rapoport
  0 siblings, 1 reply; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-06 22:05 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: linux-mm, linux-kernel

On Sun, Feb 06, 2022 at 11:33:40PM +0200, Mike Rapoport wrote:
> On Sun, Feb 06, 2022 at 08:46:45PM +0000, Matthew Wilcox wrote:
> > On Sun, Feb 06, 2022 at 08:13:11PM +0200, Mike Rapoport wrote:
> > > On Fri, Feb 04, 2022 at 07:58:18PM +0000, Matthew Wilcox (Oracle) wrote:
> > > > I need to use this function in common code, so define it for hexagon.
> > > > 
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > ---
> > > >  arch/hexagon/include/asm/pgtable.h | 3 ++-
> > > 
> > > Why hexagon out of all architectures?
> > > What about m68k, nios2, nds32 etc?
> 
> > Presumably they don't support CONFIG_TRANSPARENT_HUGEPAGE.
> > This code isn't compiled when THP are disabled; at least I haven't
> > had a buildbot complaint for any other architectures.
> 
> m68k defconfig fails:
> 
>   CC      mm/page_vma_mapped.o
> mm/page_vma_mapped.c: In function 'page_vma_mapped_walk':
> mm/page_vma_mapped.c:219:20: error: implicit declaration of function 'pmd_pfn'; did you mean 'pmd_off'? [-Werror=implicit-function-declaration]
>   219 |     if (!check_pmd(pmd_pfn(pmde), pvmw))
>       |                    ^~~~~~~
>       |                    pmd_off

Ok, so this whole thing is protected by "if (pmd_trans_huge(pmd))"
which is defined to be 0 if !THP.  But obviously, gcc is trying to parse
it.  So I think we need something like:

#ifndef CONFIG_TRANSPARENT_HUGEPAGE
static inline unsigned long pmd_pfn(pmd_t pmd)
{
        BUG();
        return 0UL;
}
#endif

but that can have its own problem if architectures define pmd_pfn()
when they don't have CONFIG_TRANSPARENT_HUGEPAGE.  Which of course, they
can do.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 01/75] mm/gup: Increment the page refcount before the pincount
  2022-02-04 19:57 ` [PATCH 01/75] mm/gup: Increment the page refcount before the pincount Matthew Wilcox (Oracle)
  2022-02-04 21:13   ` John Hubbard
@ 2022-02-07  7:45   ` Christoph Hellwig
  1 sibling, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good, modulo the refs fixup:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 04/75] mm/gup: Change the calling convention for compound_range_next()
  2022-02-04 19:57 ` [PATCH 04/75] mm/gup: Change the calling convention for compound_range_next() Matthew Wilcox (Oracle)
@ 2022-02-07  7:45   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-mm, linux-kernel, John Hubbard, Jason Gunthorpe, William Kucharski

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 10/75] mm/gup: Remove hpage_pincount_add()
  2022-02-04 19:57 ` [PATCH 10/75] mm/gup: Remove hpage_pincount_add() Matthew Wilcox (Oracle)
  2022-02-04 21:29   ` John Hubbard
@ 2022-02-07  7:46   ` Christoph Hellwig
  1 sibling, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:46 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio
  2022-02-04 19:57 ` [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio Matthew Wilcox (Oracle)
  2022-02-06  2:12   ` John Hubbard
@ 2022-02-07  7:47   ` Christoph Hellwig
  1 sibling, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:47 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 29/75] mm/workingset: Convert workingset_eviction() to take a folio
  2022-02-04 19:58 ` [PATCH 29/75] mm/workingset: Convert workingset_eviction() to take " Matthew Wilcox (Oracle)
@ 2022-02-07  7:49   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:49 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 30/75] mm/memcg: Convert mem_cgroup_swapout() to take a folio
  2022-02-04 19:58 ` [PATCH 30/75] mm/memcg: Convert mem_cgroup_swapout() " Matthew Wilcox (Oracle)
@ 2022-02-07  7:49   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:49 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:07PM +0000, Matthew Wilcox (Oracle) wrote:
>  #ifdef CONFIG_MEMCG_SWAP
> -extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
> +extern void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry);

Might be worth dropping the extern while you're at it.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 31/75] mm: Add lru_to_folio()
  2022-02-04 19:58 ` [PATCH 31/75] mm: Add lru_to_folio() Matthew Wilcox (Oracle)
@ 2022-02-07  7:50   ` Christoph Hellwig
  2022-02-11 20:24     ` Matthew Wilcox
  0 siblings, 1 reply; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:50 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:08PM +0000, Matthew Wilcox (Oracle) wrote:
> Since page->lru occupies the same bytes as compound_head, any page
> on the LRU list must be a folio.

Any reason to not turn this into an inline function?
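
Assuming it follows the lru_to_page() pattern, the inline version would
presumably be something like:

static inline struct folio *lru_to_folio(struct list_head *head)
{
	return list_entry(head->prev, struct folio, lru);
}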


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 32/75] mm: Turn putback_lru_page() into folio_putback_lru()
  2022-02-04 19:58 ` [PATCH 32/75] mm: Turn putback_lru_page() into folio_putback_lru() Matthew Wilcox (Oracle)
@ 2022-02-07  7:50   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:50 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 33/75] mm/vmscan: Convert __remove_mapping() to take a folio
  2022-02-04 19:58 ` [PATCH 33/75] mm/vmscan: Convert __remove_mapping() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-07  7:51   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:51 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback()
  2022-02-04 19:58 ` [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback() Matthew Wilcox (Oracle)
@ 2022-02-07  7:51   ` Christoph Hellwig
  2022-02-12  1:49     ` Matthew Wilcox
  0 siblings, 1 reply; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:51 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

> -	mapping = page_mapping(page);
> +	mapping = folio_mapping(folio);
>  	if (mapping && mapping->a_ops->is_dirty_writeback)
> -		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
> +		mapping->a_ops->is_dirty_writeback(&folio->page, dirty, writeback);

This adds an overly long line.
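
(Wrapping it, e.g.

		mapping->a_ops->is_dirty_writeback(&folio->page,
				dirty, writeback);

would keep it under 80 columns.)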

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 35/75] mm: Turn head_compound_mapcount() into folio_entire_mapcount()
  2022-02-04 19:58 ` [PATCH 35/75] mm: Turn head_compound_mapcount() into folio_entire_mapcount() Matthew Wilcox (Oracle)
@ 2022-02-07  7:52   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:52 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 36/75] mm: Add folio_mapcount()
  2022-02-04 19:58 ` [PATCH 36/75] mm: Add folio_mapcount() Matthew Wilcox (Oracle)
@ 2022-02-07  7:53   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:53 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 37/75] mm: Add split_folio_to_list()
  2022-02-04 19:58 ` [PATCH 37/75] mm: Add split_folio_to_list() Matthew Wilcox (Oracle)
@ 2022-02-07  7:54   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:54 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 38/75] mm: Add folio_is_zone_device() and folio_is_device_private()
  2022-02-04 19:58 ` [PATCH 38/75] mm: Add folio_is_zone_device() and folio_is_device_private() Matthew Wilcox (Oracle)
@ 2022-02-07  7:54   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:54 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:15PM +0000, Matthew Wilcox (Oracle) wrote:
> These two wrappers are the equivalent of is_zone_device_page()
> and is_device_private_page().

This looks sane, but I just sent out a series this morning that moves
is_device_private_page() to memremap.h and removes the inclusion of
memremap.h in mm.h, so there will be a bit of a conflict.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 39/75] mm: Add folio_pgoff()
  2022-02-04 19:58 ` [PATCH 39/75] mm: Add folio_pgoff() Matthew Wilcox (Oracle)
@ 2022-02-07  7:55   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:55 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:16PM +0000, Matthew Wilcox (Oracle) wrote:
> This is the folio equivalent of page_to_pgoff().

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 40/75] mm: Add pvmw_set_page() and pvmw_set_folio()
  2022-02-04 19:58 ` [PATCH 40/75] mm: Add pvmw_set_page() and pvmw_set_folio() Matthew Wilcox (Oracle)
@ 2022-02-07  7:55   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:55 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:17PM +0000, Matthew Wilcox (Oracle) wrote:
> Instead of setting the page directly in struct page_vma_mapped_walk,
> use this helper to allow us to transition to a PFN approach in the
> next patch.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 43/75] mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio
  2022-02-04 19:58 ` [PATCH 43/75] mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-07  7:57   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:57 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 44/75] mm/rmap: Use a folio in page_mkclean_one()
  2022-02-04 19:58 ` [PATCH 44/75] mm/rmap: Use a folio in page_mkclean_one() Matthew Wilcox (Oracle)
@ 2022-02-07  7:57   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:57 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 45/75] mm/rmap: Turn page_referenced() into folio_referenced()
  2022-02-04 19:58 ` [PATCH 45/75] mm/rmap: Turn page_referenced() into folio_referenced() Matthew Wilcox (Oracle)
@ 2022-02-07  7:58   ` Christoph Hellwig
  0 siblings, 0 replies; 115+ messages in thread
From: Christoph Hellwig @ 2022-02-07  7:58 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 47/75] mm/mlock: Turn mlock_vma_page() into mlock_vma_folio()
  2022-02-04 19:58 ` [PATCH 47/75] mm/mlock: Turn mlock_vma_page() into mlock_vma_folio() Matthew Wilcox (Oracle)
@ 2022-02-07 10:46   ` Mike Rapoport
  0 siblings, 0 replies; 115+ messages in thread
From: Mike Rapoport @ 2022-02-07 10:46 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On Fri, Feb 04, 2022 at 07:58:24PM +0000, Matthew Wilcox (Oracle) wrote:
> Add mlock_vma_page() back as a wrapper.  Saves a few calls to
> compound_head() and an assertion that the page is not a tail page.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/folio-compat.c |  5 +++++
>  mm/internal.h     |  3 ++-
>  mm/mlock.c        | 18 +++++++++---------
>  3 files changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/folio-compat.c b/mm/folio-compat.c
> index bcb037d9cec3..9cb0867d5b38 100644
> --- a/mm/folio-compat.c
> +++ b/mm/folio-compat.c
> @@ -169,3 +169,8 @@ void clear_page_mlock(struct page *page)
>  {
>  	folio_end_mlock(page_folio(page));
>  }
> +
> +void mlock_vma_page(struct page *page)
> +{
> +	mlock_vma_folio(page_folio(page));
> +}

This will make no-mmu unhappy, e.g. arm allnoconfig:

  CC      mm/folio-compat.o
mm/folio-compat.c:174:6: error: redefinition of 'mlock_vma_page'
  174 | void mlock_vma_page(struct page *page)
      |      ^~~~~~~~~~~~~~
In file included from mm/folio-compat.c:11:
mm/internal.h:501:20: note: previous definition of 'mlock_vma_page' was here
  501 | static inline void mlock_vma_page(struct page *page) { }
      |                    ^~~~~~~~~~~~~~
mm/folio-compat.c: In function 'mlock_vma_page':
mm/folio-compat.c:176:2: error: implicit declaration of function 'mlock_vma_folio'; did you mean 'mlock_vma_page'? [-Werror=implicit-function-declaration]
  176 |  mlock_vma_folio(page_folio(page));
      |  ^~~~~~~~~~~~~~~
      |  mlock_vma_page

This also applies to munlock_vma_page() and page_mlock(). 
For some reason no build yelled about clear_page_mlock()...
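
(Not tested, just to illustrate one possible direction: building the
wrappers only for MMU kernels would leave the !CONFIG_MMU stubs in
internal.h as the sole definitions, e.g.

#ifdef CONFIG_MMU
void mlock_vma_page(struct page *page)
{
	mlock_vma_folio(page_folio(page));
}
#endif

and similarly for munlock_vma_page() and page_mlock().)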

> diff --git a/mm/internal.h b/mm/internal.h
> index 041c76a4c284..18b024aa7e59 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -411,7 +411,8 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
>  /*
>   * must be called with vma's mmap_lock held for read or write, and page locked.
>   */
> -extern void mlock_vma_page(struct page *page);
> +void mlock_vma_page(struct page *page);
> +void mlock_vma_folio(struct folio *folio);
>  extern unsigned int munlock_vma_page(struct page *page);
>  
>  extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
> diff --git a/mm/mlock.c b/mm/mlock.c
> index ff067d64acc5..d998fd5c84bf 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -94,21 +94,21 @@ void folio_end_mlock(struct folio *folio)
>   * Mark page as mlocked if not already.
>   * If page on LRU, isolate and putback to move to unevictable list.
>   */
> -void mlock_vma_page(struct page *page)
> +void mlock_vma_folio(struct folio *folio)
>  {
>  	/* Serialize with page migration */
> -	BUG_ON(!PageLocked(page));
> +	BUG_ON(!folio_test_locked(folio));
>  
> -	VM_BUG_ON_PAGE(PageTail(page), page);
> -	VM_BUG_ON_PAGE(PageCompound(page) && PageDoubleMap(page), page);
> +	VM_BUG_ON_FOLIO(folio_test_large(folio) && folio_test_double_map(folio),
> +			folio);
>  
> -	if (!TestSetPageMlocked(page)) {
> -		int nr_pages = thp_nr_pages(page);
> +	if (!folio_test_set_mlocked(folio)) {
> +		long nr_pages = folio_nr_pages(folio);
>  
> -		mod_zone_page_state(page_zone(page), NR_MLOCK, nr_pages);
> +		zone_stat_mod_folio(folio, NR_MLOCK, nr_pages);
>  		count_vm_events(UNEVICTABLE_PGMLOCKED, nr_pages);
> -		if (!isolate_lru_page(page))
> -			putback_lru_page(page);
> +		if (!folio_isolate_lru(folio))
> +			folio_putback_lru(folio);
>  	}
>  }
>  
> -- 
> 2.34.1
> 
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 41/75] hexagon: Add pmd_pfn()
  2022-02-06 22:05         ` Matthew Wilcox
@ 2022-02-07 14:24           ` Mike Rapoport
  0 siblings, 0 replies; 115+ messages in thread
From: Mike Rapoport @ 2022-02-07 14:24 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linux-kernel

On Sun, Feb 06, 2022 at 10:05:35PM +0000, Matthew Wilcox wrote:
> On Sun, Feb 06, 2022 at 11:33:40PM +0200, Mike Rapoport wrote:
> > On Sun, Feb 06, 2022 at 08:46:45PM +0000, Matthew Wilcox wrote:
> > > On Sun, Feb 06, 2022 at 08:13:11PM +0200, Mike Rapoport wrote:
> > > > On Fri, Feb 04, 2022 at 07:58:18PM +0000, Matthew Wilcox (Oracle) wrote:
> > > > > I need to use this function in common code, so define it for hexagon.
> > > > > 
> > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > > ---
> > > > >  arch/hexagon/include/asm/pgtable.h | 3 ++-
> > > > 
> > > > Why hexagon out of all architectures?
> > > > What about m68k, nios2, nds32 etc?
> > 
> > > Presumably they don't support CONFIG_TRANSPARENT_HUGEPAGE.
> > > This code isn't compiled when THP are disabled; at least I haven't
> > > had a buildbot complaint for any other architectures.
> > 
> > m68k defconfig fails:
> > 
> >   CC      mm/page_vma_mapped.o
> > mm/page_vma_mapped.c: In function 'page_vma_mapped_walk':
> > mm/page_vma_mapped.c:219:20: error: implicit declaration of function 'pmd_pfn'; did you mean 'pmd_off'? [-Werror=implicit-function-declaration]
> >   219 |     if (!check_pmd(pmd_pfn(pmde), pvmw))
> >       |                    ^~~~~~~
> >       |                    pmd_off
> 
> Ok, so this whole thing is protected by "if (pmd_trans_huge(pmd))"
> which is defined to be 0 if !THP.  But obviously, gcc is trying to parse
> it.  So I think we need something like:
> 
> #ifndef CONFIG_TRANSPARENT_HUGEPAGE
> static inline unsigned long pmd_pfn(pmd_t pmd)
> {
>         BUG();
>         return 0UL;
> }
> #endif
> 
> but that can have its own problem if architectures define pmd_pfn()
> when they don't have CONFIG_TRANSPARENT_HUGEPAGE.  Which of course, they
> can do.

I think the safest option would be to define pmd_pfn() where it's missing.
Then pmd_pfn(), pmd_page() and pmd_page_vaddr() can all be cleaned up on
top of that, but it won't block the folio work.

I've baked a patch that defines pmd_pfn() based on each architecture's
pmd_page_vaddr() or pmd_page() definition. With pmd_pfn() defined
everywhere, we won't need special care for the THP vs !THP cases, and a
PMD is always there even if it's sometimes called a PGD.

From e0befabdfb5a4aaeab38a998fa58fe3218e80ad8 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.ibm.com>
Date: Fri, 4 Feb 2022 13:49:20 -0500
Subject: [PATCH] arch: Add pmd_pfn() where it is missing

Willy needs to use this function in common code, so define it for
architectures and/or configurations that are missing it.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/alpha/include/asm/pgtable.h         | 1 +
 arch/arc/include/asm/hugepage.h          | 1 -
 arch/arc/include/asm/pgtable-levels.h    | 1 +
 arch/arm/include/asm/pgtable-2level.h    | 2 ++
 arch/csky/include/asm/pgtable.h          | 1 +
 arch/hexagon/include/asm/pgtable.h       | 5 +++++
 arch/ia64/include/asm/pgtable.h          | 1 +
 arch/m68k/include/asm/mcf_pgtable.h      | 1 +
 arch/m68k/include/asm/motorola_pgtable.h | 1 +
 arch/m68k/include/asm/sun3_pgtable.h     | 1 +
 arch/microblaze/include/asm/pgtable.h    | 3 +++
 arch/nds32/include/asm/pgtable.h         | 1 +
 arch/nios2/include/asm/pgtable.h         | 1 +
 arch/openrisc/include/asm/pgtable.h      | 1 +
 arch/parisc/include/asm/pgtable.h        | 1 +
 arch/sh/include/asm/pgtable_32.h         | 1 +
 arch/um/include/asm/pgtable.h            | 1 +
 arch/xtensa/include/asm/pgtable.h        | 1 +
 18 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index 02f0429f1068..170451fde043 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -233,6 +233,7 @@ pmd_page_vaddr(pmd_t pmd)
 	return ((pmd_val(pmd) & _PFN_MASK) >> (32-PAGE_SHIFT)) + PAGE_OFFSET;
 }
 
+#define pmd_pfn(pmd)	(pmd_val(pmd) >> 32)
 #define pmd_page(pmd)	(pfn_to_page(pmd_val(pmd) >> 32))
 #define pud_page(pud)	(pfn_to_page(pud_val(pud) >> 32))
 
diff --git a/arch/arc/include/asm/hugepage.h b/arch/arc/include/asm/hugepage.h
index 11b0ff26b97b..5001b796fb8d 100644
--- a/arch/arc/include/asm/hugepage.h
+++ b/arch/arc/include/asm/hugepage.h
@@ -31,7 +31,6 @@ static inline pmd_t pte_pmd(pte_t pte)
 
 #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
 #define pmd_young(pmd)		pte_young(pmd_pte(pmd))
-#define pmd_pfn(pmd)		pte_pfn(pmd_pte(pmd))
 #define pmd_dirty(pmd)		pte_dirty(pmd_pte(pmd))
 
 #define mk_pmd(page, prot)	pte_pmd(mk_pte(page, prot))
diff --git a/arch/arc/include/asm/pgtable-levels.h b/arch/arc/include/asm/pgtable-levels.h
index 8084ef2f6491..7848348719b2 100644
--- a/arch/arc/include/asm/pgtable-levels.h
+++ b/arch/arc/include/asm/pgtable-levels.h
@@ -161,6 +161,7 @@
 #define pmd_present(x)		(pmd_val(x))
 #define pmd_clear(xp)		do { pmd_val(*(xp)) = 0; } while (0)
 #define pmd_page_vaddr(pmd)	(pmd_val(pmd) & PAGE_MASK)
+#define pmd_pfn(pmd)		((pmd_val(pmd) & PAGE_MASK) >> PAGE_SHIFT)
 #define pmd_page(pmd)		virt_to_page(pmd_page_vaddr(pmd))
 #define set_pmd(pmdp, pmd)	(*(pmdp) = pmd)
 #define pmd_pgtable(pmd)	((pgtable_t) pmd_page_vaddr(pmd))
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 70fe69bdcce2..92abd4cd8ca2 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -208,6 +208,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 }
 #define pmd_offset pmd_offset
 
+#define pmd_pfn(pmd)		(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+
 #define pmd_large(pmd)		(pmd_val(pmd) & 2)
 #define pmd_leaf(pmd)		(pmd_val(pmd) & 2)
 #define pmd_bad(pmd)		(pmd_val(pmd) & 2)
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 151607ed5158..bbe245117777 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -30,6 +30,7 @@
 #define pgd_ERROR(e) \
 	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 
+#define pmd_pfn(pmd)	(pmd_phys(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)	(pfn_to_page(pmd_phys(pmd) >> PAGE_SHIFT))
 #define pte_clear(mm, addr, ptep)	set_pte((ptep), \
 	(((unsigned int) addr >= PAGE_OFFSET) ? __pte(_PAGE_GLOBAL) : __pte(0)))
diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index 18cd6ea9ab23..0610724d6a28 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -235,6 +235,11 @@ static inline int pmd_bad(pmd_t pmd)
 	return 0;
 }
 
+/*
+ * pmd_pfn - converts a PMD entry to a page frame number
+ */
+#define pmd_pfn(pmd)  (pmd_val(pmd) >> PAGE_SHIFT)
+
 /*
  * pmd_page - converts a PMD entry to a page pointer
  */
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 9584b2c5f394..7aa8f2330fb1 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -267,6 +267,7 @@ ia64_phys_addr_valid (unsigned long addr)
 #define pmd_present(pmd)		(pmd_val(pmd) != 0UL)
 #define pmd_clear(pmdp)			(pmd_val(*(pmdp)) = 0UL)
 #define pmd_page_vaddr(pmd)		((unsigned long) __va(pmd_val(pmd) & _PFN_MASK))
+#define pmd_pfn(pmd)			((pmd_val(pmd) & _PFN_MASK) >> PAGE_SHIFT)
 #define pmd_page(pmd)			virt_to_page((pmd_val(pmd) + PAGE_OFFSET))
 
 #define pud_none(pud)			(!pud_val(pud))
diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h
index 6f2b87d7a50d..94f38d76e278 100644
--- a/arch/m68k/include/asm/mcf_pgtable.h
+++ b/arch/m68k/include/asm/mcf_pgtable.h
@@ -322,6 +322,7 @@ extern pgd_t kernel_pg_dir[PTRS_PER_PGD];
 #define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)	(__pte((x).val))
 
+#define pmd_pfn(pmd)		(pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)		(pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
 
 #define pfn_pte(pfn, prot)	__pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
diff --git a/arch/m68k/include/asm/motorola_pgtable.h b/arch/m68k/include/asm/motorola_pgtable.h
index 022c3abc280d..7c9b56e2a750 100644
--- a/arch/m68k/include/asm/motorola_pgtable.h
+++ b/arch/m68k/include/asm/motorola_pgtable.h
@@ -147,6 +147,7 @@ static inline void pud_set(pud_t *pudp, pmd_t *pmdp)
 #define pmd_present(pmd)	(pmd_val(pmd) & _PAGE_TABLE)
 #define pmd_clear(pmdp)		({ pmd_val(*pmdp) = 0; })
 
+#define pmd_pfn(pmd)		((pmd_val(pmd) & _TABLE_MASK) >> PAGE_SHIFT)
 /*
  * m68k does not have huge pages (020/030 actually could), but generic code
  * expects pmd_page() to exists, only to then DCE it all. Provide a dummy to
diff --git a/arch/m68k/include/asm/sun3_pgtable.h b/arch/m68k/include/asm/sun3_pgtable.h
index 5b24283a0a42..5e4e753f0d24 100644
--- a/arch/m68k/include/asm/sun3_pgtable.h
+++ b/arch/m68k/include/asm/sun3_pgtable.h
@@ -130,6 +130,7 @@ static inline void pte_clear (struct mm_struct *mm, unsigned long addr, pte_t *p
 ({ pte_t __pte; pte_val(__pte) = pfn | pgprot_val(pgprot); __pte; })
 
 #define pte_page(pte)		virt_to_page(__pte_page(pte))
+#define pmd_pfn(pmd)		(pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)		virt_to_page(pmd_page_vaddr(pmd))
 
 
diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h
index c136a01e467e..0c72646370e1 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -399,6 +399,9 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 	return ((unsigned long) (pmd_val(pmd) & PAGE_MASK));
 }
 
+/* returns pfn of the pmd entry*/
+#define pmd_pfn(pmd)	(__pa(pmd_val(pmd)) >> PAGE_SHIFT)
+
 /* returns struct *page of the pmd entry*/
 #define pmd_page(pmd)	(pfn_to_page(__pa(pmd_val(pmd)) >> PAGE_SHIFT))
 
diff --git a/arch/nds32/include/asm/pgtable.h b/arch/nds32/include/asm/pgtable.h
index 419f984eef70..7ff144467b29 100644
--- a/arch/nds32/include/asm/pgtable.h
+++ b/arch/nds32/include/asm/pgtable.h
@@ -308,6 +308,7 @@ static inline pmd_t __mk_pmd(pte_t * ptep, unsigned long prot)
 	return pmd;
 }
 
+#define pmd_pfn(pmd)	     (pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)        virt_to_page(__va(pmd_val(pmd)))
 
 /*
diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index 4a995fa628ee..262d0609268c 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -235,6 +235,7 @@ static inline void pte_clear(struct mm_struct *mm,
  * and a page entry and page directory to the page they refer to.
  */
 #define pmd_phys(pmd)		virt_to_phys((void *)pmd_val(pmd))
+#define pmd_pfn(pmd)		(pmd_phys(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)		(pfn_to_page(pmd_phys(pmd) >> PAGE_SHIFT))
 
 static inline unsigned long pmd_page_vaddr(pmd_t pmd)
diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h
index cdd657f80bfa..c3abbf71e09f 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -361,6 +361,7 @@ static inline void pmd_set(pmd_t *pmdp, pte_t *ptep)
 	pmd_val(*pmdp) = _KERNPG_TABLE | (unsigned long) ptep;
 }
 
+#define pmd_pfn(pmd)		(pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd)		(pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
 
 static inline unsigned long pmd_page_vaddr(pmd_t pmd)
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index 3e7cf882639f..48d49f207f84 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -405,6 +405,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 	return ((unsigned long) __va(pmd_address(pmd)));
 }
 
+#define pmd_pfn(pmd)	(pmd_address(pmd) >> PAGE_SHIFT)
 #define __pmd_page(pmd) ((unsigned long) __va(pmd_address(pmd)))
 #define pmd_page(pmd)	virt_to_page((void *)__pmd_page(pmd))
 
diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index 41be43e99cff..d0240decacca 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -406,6 +406,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 	return (unsigned long)pmd_val(pmd);
 }
 
+#define pmd_pfn(pmd)		(__pa(pmd_val(pmd)) >> PAGE_SHIFT)
 #define pmd_page(pmd)		(virt_to_page(pmd_val(pmd)))
 
 #ifdef CONFIG_X2TLB
diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h
index b9e20bbe2f75..167e236d9bb8 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -109,6 +109,7 @@ extern unsigned long end_iomem;
 #define p4d_newpage(x)  (p4d_val(x) & _PAGE_NEWPAGE)
 #define p4d_mkuptodate(x) (p4d_val(x) &= ~_PAGE_NEWPAGE)
 
+#define pmd_pfn(pmd) (pmd_val(pmd) >> PAGE_SHIFT)
 #define pmd_page(pmd) phys_to_page(pmd_val(pmd) & PAGE_MASK)
 
 #define pte_page(x) pfn_to_page(pte_pfn(x))
diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h
index bd5aeb795567..8da562f5da73 100644
--- a/arch/xtensa/include/asm/pgtable.h
+++ b/arch/xtensa/include/asm/pgtable.h
@@ -241,6 +241,7 @@ static inline void paging_init(void) { }
  * The pmd contains the kernel virtual address of the pte page.
  */
 #define pmd_page_vaddr(pmd) ((unsigned long)(pmd_val(pmd) & PAGE_MASK))
+#define pmd_pfn(pmd) (__pa(pmd_val(pmd)) >> PAGE_SHIFT)
 #define pmd_page(pmd) virt_to_page(pmd_val(pmd))
 
 /*
-- 
2.25.1


-- 
Sincerely yours,
Mike.


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH 51/75] mm/rmap: Convert try_to_unmap() to take a folio
  2022-02-04 19:58 ` [PATCH 51/75] mm/rmap: Convert try_to_unmap() " Matthew Wilcox (Oracle)
@ 2022-02-09 14:24   ` Mauricio Faria de Oliveira
  2022-02-09 14:29     ` Matthew Wilcox
  0 siblings, 1 reply; 115+ messages in thread
From: Mauricio Faria de Oliveira @ 2022-02-09 14:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton; +Cc: linux-mm, linux-kernel

Hi Andrew and Matthew,

On Fri, Feb 4, 2022 at 5:00 PM Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Change both callers and the worker function try_to_unmap_one().
...
> diff --git a/mm/rmap.c b/mm/rmap.c
...
> @@ -1598,8 +1602,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>                         }
>
>                         /* MADV_FREE page check */
> -                       if (!PageSwapBacked(page)) {
> -                               if (!PageDirty(page)) {
> +                       if (!folio_test_swapbacked(folio)) {
> +                               if (!folio_test_dirty(folio)) {
>                                         /* Invalidate as we cleared the pte */
>                                         mmu_notifier_invalidate_range(mm,
>                                                 address, address + PAGE_SIZE);
> @@ -1608,11 +1612,11 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>                                 }
>
>                                 /*
> -                                * If the page was redirtied, it cannot be
> +                                * If the folio was redirtied, it cannot be
>                                  * discarded. Remap the page to page table.
>                                  */
>                                 set_pte_at(mm, address, pvmw.pte, pteval);
> -                               SetPageSwapBacked(page);
> +                               folio_set_swapbacked(folio);
>                                 ret = false;
>                                 page_vma_mapped_walk_done(&pvmw);
>                                 break;
...

This conflicts with patch [1], currently in mmotm, and I'll send another
version of it anyway.  Should that patch go on top of these folio
changes, or the other way around?

The latter would help with the stable backports that don't have folios
yet, but I can send backports there as well; not a problem.

Thanks,

[1] https://lkml.kernel.org/r/20220131230255.789059-1-mfo@canonical.com
[PATCH v3] mm: fix race between MADV_FREE reclaim and blkdev direct IO read

-- 
Mauricio Faria de Oliveira


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 51/75] mm/rmap: Convert try_to_unmap() to take a folio
  2022-02-09 14:24   ` Mauricio Faria de Oliveira
@ 2022-02-09 14:29     ` Matthew Wilcox
  0 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-09 14:29 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira; +Cc: Andrew Morton, linux-mm, linux-kernel

On Wed, Feb 09, 2022 at 11:24:34AM -0300, Mauricio Faria de Oliveira wrote:
> Hi Andrew and Matthew,
> 
> On Fri, Feb 4, 2022 at 5:00 PM Matthew Wilcox (Oracle)
> <willy@infradead.org> wrote:
> >
> > Change both callers and the worker function try_to_unmap_one().
> ...
> > diff --git a/mm/rmap.c b/mm/rmap.c
> ...
> > @@ -1598,8 +1602,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> >                         }
> >
> >                         /* MADV_FREE page check */
> > -                       if (!PageSwapBacked(page)) {
> > -                               if (!PageDirty(page)) {
> > +                       if (!folio_test_swapbacked(folio)) {
> > +                               if (!folio_test_dirty(folio)) {
> >                                         /* Invalidate as we cleared the pte */
> >                                         mmu_notifier_invalidate_range(mm,
> >                                                 address, address + PAGE_SIZE);
> > @@ -1608,11 +1612,11 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> >                                 }
> >
> >                                 /*
> > -                                * If the page was redirtied, it cannot be
> > +                                * If the folio was redirtied, it cannot be
> >                                  * discarded. Remap the page to page table.
> >                                  */
> >                                 set_pte_at(mm, address, pvmw.pte, pteval);
> > -                               SetPageSwapBacked(page);
> > +                               folio_set_swapbacked(folio);
> >                                 ret = false;
> >                                 page_vma_mapped_walk_done(&pvmw);
> >                                 break;
> ...
> 
> This conflicts with patch [1], currently in mmotm, and I'll send another
> version of it anyway.  Should that patch go on top of these folio
> changes, or the other way around?

Andrew and I need to resolve conflicts between this series and other
patches in his tree.  You don't need to worry about this.

> The latter would help with the stable backports that don't have folios
> yet, but I can send backports there as well; not a problem.
> 
> Thanks,
> 
> [1] https://lkml.kernel.org/r/20220131230255.789059-1-mfo@canonical.com
> [PATCH v3] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
> 
> -- 
> Mauricio Faria de Oliveira
> 


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 52/75] mm/rmap: Convert try_to_migrate() to folios
  2022-02-04 19:58 ` [PATCH 52/75] mm/rmap: Convert try_to_migrate() to folios Matthew Wilcox (Oracle)
@ 2022-02-09 15:27   ` Zi Yan
  0 siblings, 0 replies; 115+ messages in thread
From: Zi Yan @ 2022-02-09 15:27 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-kernel

On 4 Feb 2022, at 14:58, Matthew Wilcox (Oracle) wrote:

> Convert the callers to pass a folio and the try_to_migrate_one()
> worker to use a folio throughout.  Fixes an assumption that a
> folio must be <= PMD size.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/rmap.h |  2 +-
>  mm/huge_memory.c     |  4 ++--
>  mm/migrate.c         | 12 ++++++----
>  mm/rmap.c            | 57 +++++++++++++++++++++++---------------------
>  4 files changed, 41 insertions(+), 34 deletions(-)
>
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 66407434c3b5..502439f20d88 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -192,7 +192,7 @@ static inline void page_dup_rmap(struct page *page, bool compound)
>  int folio_referenced(struct folio *, int is_locked,
>  			struct mem_cgroup *memcg, unsigned long *vm_flags);
>
> -void try_to_migrate(struct page *page, enum ttu_flags flags);
> +void try_to_migrate(struct folio *folio, enum ttu_flags flags);
>  void try_to_unmap(struct folio *, enum ttu_flags flags);
>
>  int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 4ea22b7319fd..21676a4afd07 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2294,8 +2294,8 @@ static void unmap_page(struct page *page)
>  	 * pages can simply be left unmapped, then faulted back on demand.
>  	 * If that is ever changed (perhaps for mlock), update remap_page().
>  	 */
> -	if (PageAnon(page))
> -		try_to_migrate(page, ttu_flags);
> +	if (folio_test_anon(folio))
> +		try_to_migrate(folio, ttu_flags);
>  	else
>  		try_to_unmap(folio, ttu_flags | TTU_IGNORE_MLOCK);
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 766dc67874a1..5dcdd43d983d 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -927,6 +927,7 @@ static int move_to_new_page(struct page *newpage, struct page *page,
>  static int __unmap_and_move(struct page *page, struct page *newpage,
>  				int force, enum migrate_mode mode)
>  {
> +	struct folio *folio = page_folio(page);
>  	int rc = -EAGAIN;
>  	bool page_was_mapped = false;
>  	struct anon_vma *anon_vma = NULL;
> @@ -1030,7 +1031,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>  		/* Establish migration ptes */
>  		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma,
>  				page);
> -		try_to_migrate(page, 0);
> +		try_to_migrate(folio, 0);
>  		page_was_mapped = true;
>  	}
>
> @@ -1173,6 +1174,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>  				enum migrate_mode mode, int reason,
>  				struct list_head *ret)
>  {
> +	struct folio *src = page_folio(hpage);
>  	int rc = -EAGAIN;
>  	int page_was_mapped = 0;
>  	struct page *new_hpage;
> @@ -1249,7 +1251,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>  			ttu |= TTU_RMAP_LOCKED;
>  		}
>
> -		try_to_migrate(hpage, ttu);
> +		try_to_migrate(src, ttu);
>  		page_was_mapped = 1;
>
>  		if (mapping_locked)
> @@ -2449,6 +2451,7 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
>
>  	for (i = 0; i < npages; i++) {
>  		struct page *page = migrate_pfn_to_page(migrate->src[i]);
> +		struct folio *folio;
>
>  		if (!page)
>  			continue;
> @@ -2472,8 +2475,9 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
>  			put_page(page);
>  		}
>
> -		if (page_mapped(page))
> -			try_to_migrate(page, 0);
> +		folio = page_folio(page);
> +		if (folio_mapped(folio))
> +			try_to_migrate(folio, 0);
>
>  		if (page_mapped(page) || !migrate_vma_check_page(page)) {
>  			if (!is_zone_device_page(page)) {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index c598fd667948..4cfac67e328c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1767,7 +1767,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  	range.end = vma_address_end(&pvmw);
>  	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
>  				address, range.end);
> -	if (PageHuge(page)) {
> +	if (folio_test_hugetlb(folio)) {
>  		/*
>  		 * If sharing is possible, start and end will be adjusted
>  		 * accordingly.
> @@ -1781,21 +1781,24 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
>  		/* PMD-mapped THP migration entry */
>  		if (!pvmw.pte) {
> -			VM_BUG_ON_PAGE(PageHuge(page) ||
> -				       !PageTransCompound(page), page);
> +			subpage = folio_page(folio,
> +				pmd_pfn(*pvmw.pmd) - folio_pfn(folio));

Here you removed the assumption that a folio is always <= PMD size, right?

In the commit message, maybe the wording below is better:
"In the THP migration code, fix an assumption that a folio must be <= PMD size."

> +			VM_BUG_ON_FOLIO(folio_test_hugetlb(folio) ||
> +					!folio_test_pmd_mappable(folio), folio);
>
> -			set_pmd_migration_entry(&pvmw, page);
> +			set_pmd_migration_entry(&pvmw, subpage);
>  			continue;
>  		}
>  #endif
>
>  		/* Unexpected PMD-mapped THP? */
> -		VM_BUG_ON_PAGE(!pvmw.pte, page);
> +		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>
> -		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
> +		subpage = folio_page(folio,
> +				pte_pfn(*pvmw.pte) - folio_pfn(folio));
>  		address = pvmw.address;
>
> -		if (PageHuge(page) && !PageAnon(page)) {
> +		if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>  			/*
>  			 * To call huge_pmd_unshare, i_mmap_rwsem must be
>  			 * held in write mode.  Caller needs to explicitly
> @@ -1833,15 +1836,15 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
>  		pteval = ptep_clear_flush(vma, address, pvmw.pte);
>
> -		/* Move the dirty bit to the page. Now the pte is gone. */
> +		/* Set the dirty flag on the folio now the pte is gone. */
>  		if (pte_dirty(pteval))
> -			set_page_dirty(page);
> +			folio_mark_dirty(folio);
>
>  		/* Update high watermark before we lower rss */
>  		update_hiwater_rss(mm);
>
> -		if (is_zone_device_page(page)) {
> -			unsigned long pfn = page_to_pfn(page);
> +		if (folio_is_zone_device(folio)) {
> +			unsigned long pfn = folio_pfn(folio);
>  			swp_entry_t entry;
>  			pte_t swp_pte;
>
> @@ -1877,16 +1880,16 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  			 * changed when hugepage migrations to device private
>  			 * memory are supported.
>  			 */
> -			subpage = page;
> -		} else if (PageHWPoison(page)) {
> +			subpage = &folio->page;
> +		} else if (PageHWPoison(subpage)) {
>  			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
> -			if (PageHuge(page)) {
> -				hugetlb_count_sub(compound_nr(page), mm);
> +			if (folio_test_hugetlb(folio)) {
> +				hugetlb_count_sub(folio_nr_pages(folio), mm);
>  				set_huge_swap_pte_at(mm, address,
>  						     pvmw.pte, pteval,
>  						     vma_mmu_pagesize(vma));
>  			} else {
> -				dec_mm_counter(mm, mm_counter(page));
> +				dec_mm_counter(mm, mm_counter(&folio->page));
>  				set_pte_at(mm, address, pvmw.pte, pteval);
>  			}
>
> @@ -1901,7 +1904,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  			 * migration) will not expect userfaults on already
>  			 * copied pages.
>  			 */
> -			dec_mm_counter(mm, mm_counter(page));
> +			dec_mm_counter(mm, mm_counter(&folio->page));
>  			/* We have to invalidate as we cleared the pte */
>  			mmu_notifier_invalidate_range(mm, address,
>  						      address + PAGE_SIZE);
> @@ -1947,8 +1950,8 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  		 *
>  		 * See Documentation/vm/mmu_notifier.rst
>  		 */
> -		page_remove_rmap(subpage, PageHuge(page));
> -		put_page(page);
> +		page_remove_rmap(subpage, folio_test_hugetlb(folio));
> +		folio_put(folio);
>  	}
>
>  	mmu_notifier_invalidate_range_end(&range);
> @@ -1958,13 +1961,13 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>
>  /**
>   * try_to_migrate - try to replace all page table mappings with swap entries
> - * @page: the page to replace page table entries for
> + * @folio: the folio to replace page table entries for
>   * @flags: action and flags
>   *
> - * Tries to remove all the page table entries which are mapping this page and
> - * replace them with special swap entries. Caller must hold the page lock.
> + * Tries to remove all the page table entries which are mapping this folio and
> + * replace them with special swap entries. Caller must hold the folio lock.
>   */
> -void try_to_migrate(struct page *page, enum ttu_flags flags)
> +void try_to_migrate(struct folio *folio, enum ttu_flags flags)
>  {
>  	struct rmap_walk_control rwc = {
>  		.rmap_one = try_to_migrate_one,
> @@ -1981,7 +1984,7 @@ void try_to_migrate(struct page *page, enum ttu_flags flags)
>  					TTU_SYNC)))
>  		return;
>
> -	if (is_zone_device_page(page) && !is_device_private_page(page))
> +	if (folio_is_zone_device(folio) && !folio_is_device_private(folio))
>  		return;
>
>  	/*
> @@ -1992,13 +1995,13 @@ void try_to_migrate(struct page *page, enum ttu_flags flags)
>  	 * locking requirements of exec(), migration skips
>  	 * temporary VMAs until after exec() completes.
>  	 */
> -	if (!PageKsm(page) && PageAnon(page))
> +	if (!folio_test_ksm(folio) && folio_test_anon(folio))
>  		rwc.invalid_vma = invalid_migration_vma;
>
>  	if (flags & TTU_RMAP_LOCKED)
> -		rmap_walk_locked(page, &rwc);
> +		rmap_walk_locked(&folio->page, &rwc);
>  	else
> -		rmap_walk(page, &rwc);
> +		rmap_walk(&folio->page, &rwc);
>  }
>
>  /*
> -- 
> 2.34.1

Otherwise, LGTM.  Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio
  2022-02-06 14:52   ` Mark Hemment
@ 2022-02-11 20:20     ` Matthew Wilcox
  0 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-11 20:20 UTC (permalink / raw)
  To: Mark Hemment
  Cc: linux-mm, linux-kernel, Christoph Hellwig, John Hubbard,
	Jason Gunthorpe, William Kucharski

On Sun, Feb 06, 2022 at 02:52:46PM +0000, Mark Hemment wrote:
> On Fri, 4 Feb 2022 at 20:21, Matthew Wilcox (Oracle)
> <willy@infradead.org> wrote:
> Very minor nit: above the "goto pte_unmap;" there is this code block:
>            if (flags & FOLL_PIN) {
>                         ret = arch_make_page_accessible(page);
>                         if (ret) {
>                                 unpin_user_page(page);
>                                 goto pte_unmap;
>                         }
>                 }
> The other paths that goto pte_unmap after a successful try_grab_folio()
> call gup_put_folio() (rather than unpin_user_page()).  There is no
> change in functionality, but I suggest calling gup_put_folio() here too,
> for consistency.

Thanks!  Changed.
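
For reference, a minimal sketch of how that block in gup_pte_range() might
look after the change (assuming the local folio variable and the
gup_put_folio(folio, refs, flags) form added earlier in this series; this is
an illustration, not the committed code):

		if (flags & FOLL_PIN) {
			ret = arch_make_page_accessible(page);
			if (ret) {
				/* drop the reference taken by try_grab_folio() */
				gup_put_folio(folio, 1, flags);
				goto pte_unmap;
			}
		}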


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 31/75] mm: Add lru_to_folio()
  2022-02-07  7:50   ` Christoph Hellwig
@ 2022-02-11 20:24     ` Matthew Wilcox
  0 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-11 20:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-mm, linux-kernel

On Sun, Feb 06, 2022 at 11:50:06PM -0800, Christoph Hellwig wrote:
> On Fri, Feb 04, 2022 at 07:58:08PM +0000, Matthew Wilcox (Oracle) wrote:
> > Since page->lru occupies the same bytes as compound_head, any page
> > on the LRU list must be a folio.
> 
> Any reason to not turn this into an inline function?

Good idea.  Done.
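
For reference, a sketch of the inline-function form (modelled on the existing
lru_to_page() macro; an illustration rather than the exact code that landed):

static inline struct folio *lru_to_folio(struct list_head *head)
{
	/* ->lru aliases ->compound_head, so anything on an LRU list is a folio */
	return list_entry((head)->prev, struct folio, lru);
}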


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback()
  2022-02-07  7:51   ` Christoph Hellwig
@ 2022-02-12  1:49     ` Matthew Wilcox
  0 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-12  1:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-mm, linux-kernel

On Sun, Feb 06, 2022 at 11:51:54PM -0800, Christoph Hellwig wrote:
> > -	mapping = page_mapping(page);
> > +	mapping = folio_mapping(folio);
> >  	if (mapping && mapping->a_ops->is_dirty_writeback)
> > -		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
> > +		mapping->a_ops->is_dirty_writeback(&folio->page, dirty, writeback);
> 
> This adds an overly long line.

Yeah, I'm planning on taking care of it in 5.19 by switching
is_dirty_writeback() to take a folio.
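
A rough sketch of that follow-up (an assumption about the later change, not
part of this series): the address_space operation would take a folio, letting
the caller pass the folio directly and keep the line short:

	/* in struct address_space_operations */
	void (*is_dirty_writeback)(struct folio *, bool *dirty, bool *writeback);

	/* in folio_check_dirty_writeback() */
	mapping = folio_mapping(folio);
	if (mapping && mapping->a_ops->is_dirty_writeback)
		mapping->a_ops->is_dirty_writeback(folio, dirty, writeback);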

> Otherwise looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH 00/75] MM folio patches for 5.18
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (74 preceding siblings ...)
  2022-02-04 19:58 ` [PATCH 75/75] selftests/vm/transhuge-stress: Support file-backed PMD folios Matthew Wilcox (Oracle)
@ 2022-02-13 22:31 ` John Hubbard
  2022-02-14  4:33 ` Matthew Wilcox
  76 siblings, 0 replies; 115+ messages in thread
From: John Hubbard @ 2022-02-13 22:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm; +Cc: linux-kernel

On 2/4/22 11:57, Matthew Wilcox (Oracle) wrote:
> Whole series available through git, and shortly in linux-next:
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/for-next
> or git://git.infradead.org/users/willy/pagecache.git for-next

Hi Matthew,

I'm having trouble finding this series in linux-next, or in mmotm either.
Has the plan changed, or am I just Doing It Wrong? :)

Background as to why (you can skip this part unless you're wondering):

Locally, I've based a small but critical patch on top of this series. It
introduces a new routine:

     void pin_user_page(struct page *page);

...which is a prerequisite for converting Direct IO over to use
FOLL_PIN.

For that, I am on the fence about whether to request putting the first
part of my conversion patchset into 5.18, or 5.19. Ideally, I'd like to
keep it based on your series, because otherwise there are a couple of
warts in pin_user_page() that have to be fixed up later. But on the
other hand, it would be nice to get the prerequisites in place, because
many filesystems need small changes.

Here's the diffs for "mm/gup: introduce pin_user_page()", for reference:

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 73b7e4bd250b..c2bb8099a56b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1963,6 +1963,7 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
  long pin_user_pages(unsigned long start, unsigned long nr_pages,
  		    unsigned int gup_flags, struct page **pages,
  		    struct vm_area_struct **vmas);
+void pin_user_page(struct page *page);
  long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
  		    struct page **pages, unsigned int gup_flags);
  long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
diff --git a/mm/gup.c b/mm/gup.c
index 7150ea002002..7d57c3452192 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3014,6 +3014,40 @@ long pin_user_pages(unsigned long start, unsigned long nr_pages,
  }
  EXPORT_SYMBOL(pin_user_pages);

+/**
+ * pin_user_page() - apply a FOLL_PIN reference to a page
+ *
+ * @page: the page to be pinned.
+ *
+ * Similar to pin_user_pages(), in that the page's refcount is elevated using
+ * FOLL_PIN rules.
+ *
+ * IMPORTANT: That means that the caller must release the page via
+ * unpin_user_page().
+ *
+ */
+void pin_user_page(struct page *page)
+{
+	struct folio *folio = page_folio(page);
+
+	WARN_ON_ONCE(folio_ref_count(folio) <= 0);
+
+	/*
+	 * Similar to try_grab_page(): be sure to *also*
+	 * increment the normal page refcount field at least once,
+	 * so that the page really is pinned.
+	 */
+	if (folio_test_large(folio)) {
+		folio_ref_add(folio, 1);
+		atomic_add(1, folio_pincount_ptr(folio));
+	} else {
+		folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
+	}
+
+	node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1);
+}
+EXPORT_SYMBOL(pin_user_page);
+
  /*
   * pin_user_pages_unlocked() is the FOLL_PIN variant of
   * get_user_pages_unlocked(). Behavior is the same, except that this one sets
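
As a usage illustration only (hypothetical helpers, not part of the patch),
a direct IO path would pair pin_user_page() with the existing
unpin_user_page():

	/* hypothetical caller sketch, for illustration */
	static void dio_pin_page(struct page *page)
	{
		pin_user_page(page);	/* take a FOLL_PIN-style reference */
	}

	static void dio_unpin_page(struct page *page)
	{
		unpin_user_page(page);	/* pairs with pin_user_page(), e.g. on IO completion */
	}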


thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH 00/75] MM folio patches for 5.18
  2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
                   ` (75 preceding siblings ...)
  2022-02-13 22:31 ` [PATCH 00/75] MM folio patches for 5.18 John Hubbard
@ 2022-02-14  4:33 ` Matthew Wilcox
  76 siblings, 0 replies; 115+ messages in thread
From: Matthew Wilcox @ 2022-02-14  4:33 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel

On Fri, Feb 04, 2022 at 07:57:37PM +0000, Matthew Wilcox (Oracle) wrote:
> Whole series available through git, and shortly in linux-next:
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/for-next
> or git://git.infradead.org/users/willy/pagecache.git for-next

I've just pushed out a new version to infradead.  I'll probably forget
a few things, but the major differences are:

 - Incorporate various fixes from others including:
   - Implement pmd_pfn() for various arches from Mike Rapoport
   - lru_to_folio() now a function from Christoph Hellwig
   - Change an unpin_user_page() call to gup_put_folio() from Mark Hemment
 - Use DEFINE_PAGE_VMA_WALK() and DEFINE_FOLIO_VMA_WALK() instead of the
   pvmw_set_page()/folio() calls that were in this patch set (a rough
   sketch of the idea follows after this list).
 - A new set of ten patches around invalidate_inode_page().  I'll send
   them out as a fresh patchset tomorrow.
 - Add various Reviewed-by trailers.
 - Updated to -rc4.
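
A rough sketch of the DEFINE_PAGE_VMA_WALK() idea mentioned above (the field
names here are an assumption based on the PFN-based walk in patch 42, not a
quote of the committed macro):

#define DEFINE_PAGE_VMA_WALK(name, _page, _vma, _address, _flags)	\
	struct page_vma_mapped_walk name = {				\
		.pfn = page_to_pfn(_page),				\
		.nr_pages = compound_nr(_page),				\
		.pgoff = page_to_pgoff(_page),				\
		.vma = _vma,						\
		.address = _address,					\
		.flags = _flags,					\
	}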



^ permalink raw reply	[flat|nested] 115+ messages in thread

end of thread, other threads:[~2022-02-14  4:33 UTC | newest]

Thread overview: 115+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-04 19:57 [PATCH 00/75] MM folio patches for 5.18 Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 01/75] mm/gup: Increment the page refcount before the pincount Matthew Wilcox (Oracle)
2022-02-04 21:13   ` John Hubbard
2022-02-04 21:28     ` Matthew Wilcox
2022-02-07  7:45   ` Christoph Hellwig
2022-02-04 19:57 ` [PATCH 02/75] mm/gup: Remove for_each_compound_range() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 03/75] mm/gup: Remove for_each_compound_head() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 04/75] mm/gup: Change the calling convention for compound_range_next() Matthew Wilcox (Oracle)
2022-02-07  7:45   ` Christoph Hellwig
2022-02-04 19:57 ` [PATCH 05/75] mm/gup: Optimise compound_range_next() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 06/75] mm/gup: Change the calling convention for compound_next() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 07/75] mm/gup: Fix some contiguous memmap assumptions Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 08/75] mm/gup: Remove an assumption of a contiguous memmap Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 09/75] mm/gup: Handle page split race more efficiently Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 10/75] mm/gup: Remove hpage_pincount_add() Matthew Wilcox (Oracle)
2022-02-04 21:29   ` John Hubbard
2022-02-07  7:46   ` Christoph Hellwig
2022-02-04 19:57 ` [PATCH 11/75] mm/gup: Remove hpage_pincount_sub() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 12/75] mm: Make compound_pincount always available Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 13/75] mm: Add folio_pincount_ptr() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 14/75] mm: Turn page_maybe_dma_pinned() into folio_maybe_dma_pinned() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 15/75] mm/gup: Add try_get_folio() and try_grab_folio() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 16/75] mm/gup: Convert try_grab_page() to use a folio Matthew Wilcox (Oracle)
2022-02-06  2:12   ` John Hubbard
2022-02-07  7:47   ` Christoph Hellwig
2022-02-04 19:57 ` [PATCH 17/75] mm: Remove page_cache_add_speculative() and page_cache_get_speculative() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 18/75] mm/gup: Add gup_put_folio() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 19/75] mm/hugetlb: Use try_grab_folio() instead of try_grab_compound_head() Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 20/75] mm/gup: Convert gup_pte_range() to use a folio Matthew Wilcox (Oracle)
2022-02-06 14:52   ` Mark Hemment
2022-02-11 20:20     ` Matthew Wilcox
2022-02-04 19:57 ` [PATCH 21/75] mm/gup: Convert gup_hugepte() " Matthew Wilcox (Oracle)
2022-02-04 19:57 ` [PATCH 22/75] mm/gup: Convert gup_huge_pmd() " Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 23/75] mm/gup: Convert gup_huge_pud() " Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 24/75] mm/gup: Convert gup_huge_pgd() " Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 25/75] mm/gup: Turn compound_next() into gup_folio_next() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 26/75] mm/gup: Turn compound_range_next() into gup_folio_range_next() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 27/75] mm: Turn isolate_lru_page() into folio_isolate_lru() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 28/75] mm/gup: Convert check_and_migrate_movable_pages() to use a folio Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 29/75] mm/workingset: Convert workingset_eviction() to take " Matthew Wilcox (Oracle)
2022-02-07  7:49   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 30/75] mm/memcg: Convert mem_cgroup_swapout() " Matthew Wilcox (Oracle)
2022-02-07  7:49   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 31/75] mm: Add lru_to_folio() Matthew Wilcox (Oracle)
2022-02-07  7:50   ` Christoph Hellwig
2022-02-11 20:24     ` Matthew Wilcox
2022-02-04 19:58 ` [PATCH 32/75] mm: Turn putback_lru_page() into folio_putback_lru() Matthew Wilcox (Oracle)
2022-02-07  7:50   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 33/75] mm/vmscan: Convert __remove_mapping() to take a folio Matthew Wilcox (Oracle)
2022-02-07  7:51   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 34/75] mm/vmscan: Turn page_check_dirty_writeback() into folio_check_dirty_writeback() Matthew Wilcox (Oracle)
2022-02-07  7:51   ` Christoph Hellwig
2022-02-12  1:49     ` Matthew Wilcox
2022-02-04 19:58 ` [PATCH 35/75] mm: Turn head_compound_mapcount() into folio_entire_mapcount() Matthew Wilcox (Oracle)
2022-02-07  7:52   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 36/75] mm: Add folio_mapcount() Matthew Wilcox (Oracle)
2022-02-07  7:53   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 37/75] mm: Add split_folio_to_list() Matthew Wilcox (Oracle)
2022-02-07  7:54   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 38/75] mm: Add folio_is_zone_device() and folio_is_device_private() Matthew Wilcox (Oracle)
2022-02-07  7:54   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 39/75] mm: Add folio_pgoff() Matthew Wilcox (Oracle)
2022-02-07  7:55   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 40/75] mm: Add pvmw_set_page() and pvmw_set_folio() Matthew Wilcox (Oracle)
2022-02-07  7:55   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 41/75] hexagon: Add pmd_pfn() Matthew Wilcox (Oracle)
2022-02-06 18:13   ` Mike Rapoport
2022-02-06 20:46     ` Matthew Wilcox
2022-02-06 21:33       ` Mike Rapoport
2022-02-06 22:05         ` Matthew Wilcox
2022-02-07 14:24           ` Mike Rapoport
2022-02-04 19:58 ` [PATCH 42/75] mm: Convert page_vma_mapped_walk to work on PFNs Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 43/75] mm/page_idle: Convert page_idle_clear_pte_refs() to use a folio Matthew Wilcox (Oracle)
2022-02-07  7:57   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 44/75] mm/rmap: Use a folio in page_mkclean_one() Matthew Wilcox (Oracle)
2022-02-07  7:57   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 45/75] mm/rmap: Turn page_referenced() into folio_referenced() Matthew Wilcox (Oracle)
2022-02-07  7:58   ` Christoph Hellwig
2022-02-04 19:58 ` [PATCH 46/75] mm/mlock: Turn clear_page_mlock() into folio_end_mlock() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 47/75] mm/mlock: Turn mlock_vma_page() into mlock_vma_folio() Matthew Wilcox (Oracle)
2022-02-07 10:46   ` Mike Rapoport
2022-02-04 19:58 ` [PATCH 48/75] mm/rmap: Turn page_mlock() into folio_mlock() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 49/75] mm/mlock: Turn munlock_vma_page() into munlock_vma_folio() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 50/75] mm/huge_memory: Convert __split_huge_pmd() to take a folio Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 51/75] mm/rmap: Convert try_to_unmap() " Matthew Wilcox (Oracle)
2022-02-09 14:24   ` Mauricio Faria de Oliveira
2022-02-09 14:29     ` Matthew Wilcox
2022-02-04 19:58 ` [PATCH 52/75] mm/rmap: Convert try_to_migrate() to folios Matthew Wilcox (Oracle)
2022-02-09 15:27   ` Zi Yan
2022-02-04 19:58 ` [PATCH 53/75] mm/rmap: Convert make_device_exclusive_range() to use folios Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 54/75] mm/migrate: Convert remove_migration_ptes() to folios Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 55/75] mm/damon: Convert damon_pa_mkold() to use a folio Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 56/75] mm/damon: Convert damon_pa_young() " Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 57/75] mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 58/75] mm: Turn page_anon_vma() into folio_anon_vma() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 59/75] mm/rmap: Convert rmap_walk() to take a folio Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 60/75] mm/rmap: Constify the rmap_walk_control argument Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 61/75] mm/vmscan: Free non-shmem folios without splitting them Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 62/75] mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 63/75] mm/vmscan: Account large folios correctly Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 64/75] mm/vmscan: Turn page_check_references() into folio_check_references() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 65/75] mm/vmscan: Convert pageout() to take a folio Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 66/75] mm: Turn can_split_huge_page() into can_split_folio() Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 67/75] mm/filemap: Allow large folios to be added to the page cache Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 68/75] mm: Fix READ_ONLY_THP warning Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 69/75] mm: Make large folios depend on THP Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 70/75] mm: Support arbitrary THP sizes Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 71/75] mm/readahead: Add large folio readahead Matthew Wilcox (Oracle)
2022-02-06 13:10   ` Mark Hemment
2022-02-04 19:58 ` [PATCH 72/75] mm/readahead: Align file mappings for non-DAX Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 73/75] mm/readahead: Switch to page_cache_ra_order Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 74/75] mm/filemap: Support VM_HUGEPAGE for file mappings Matthew Wilcox (Oracle)
2022-02-04 19:58 ` [PATCH 75/75] selftests/vm/transhuge-stress: Support file-backed PMD folios Matthew Wilcox (Oracle)
2022-02-13 22:31 ` [PATCH 00/75] MM folio patches for 5.18 John Hubbard
2022-02-14  4:33 ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).