* [PATCH v2 00/16] Allocate and free frozen pages
@ 2022-08-09 17:18 Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
                   ` (16 more replies)
  0 siblings, 17 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Slab does not need to use the page refcount at all, and it can avoid
an atomic operation on page free.  Hugetlb wants to delay setting the
refcount until it has assembled a complete gigantic page.  We already
have the ability to freeze a page (safely reduce its reference count to
0), so this patchset adds APIs to allocate and free pages which are in
a frozen state.
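
To illustrate, here is roughly what a user of the new pair of APIs
looks like.  This is only a sketch with made-up wrapper names
(frozen_buf_alloc/frozen_buf_free); the real prototypes are mm-internal,
declared in mm/internal.h by later patches in this series:

	/* Illustrative only; these two wrappers are not part of the series */
	static void *frozen_buf_alloc(unsigned int order)
	{
		/* The page is returned with a refcount of zero ("frozen") */
		struct page *page = alloc_frozen_pages(GFP_KERNEL, order);

		return page ? page_address(page) : NULL;
	}

	static void frozen_buf_free(void *addr, unsigned int order)
	{
		/* No put_page() / atomic dec-and-test; straight back to buddy */
		free_frozen_pages(virt_to_page(addr), order);
	}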

This patchset is also a step towards the Glorious Future in which struct
page doesn't have a refcount; the users which need a refcount will have
one in their per-allocation memdesc.

Unlike v1, this patchset has been tested and survives a few hours of
xfstests.  Vlastimil fixed a bug where compaction needed to initialise
the page refcount itself.  As part of that debugging, I split the old 4/6
into ten patches; I've opted to leave it that way to aid anybody else
trying to bisect a bug in these patches in future.  I also dropped the
old patch 1/6 and replaced it with one that moves the call to page_zone()
a few lines earlier to reflect other changes that were made to page_alloc.

Matthew Wilcox (Oracle) (16):
  mm/page_alloc: Cache page_zone() result in free_unref_page()
  mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  mm/page_alloc: Move set_page_refcounted() to callers of
    post_alloc_hook()
  mm/page_alloc: Move set_page_refcounted() to callers of
    prep_new_page()
  mm/page_alloc: Move set_page_refcounted() to callers of
    get_page_from_freelist()
  mm/page_alloc: Move set_page_refcounted() to callers of
    __alloc_pages_cpuset_fallback()
  mm/page_alloc: Move set_page_refcounted() to callers of
    __alloc_pages_may_oom()
  mm/page_alloc: Move set_page_refcounted() to callers of
    __alloc_pages_direct_compact()
  mm/page_alloc: Move set_page_refcounted() to callers of
    __alloc_pages_direct_reclaim()
  mm/page_alloc: Move set_page_refcounted() to callers of
    __alloc_pages_slowpath()
  mm/page_alloc: Move set_page_refcounted() to end of __alloc_pages()
  mm/page_alloc: Add __alloc_frozen_pages()
  mm/mempolicy: Add alloc_frozen_pages()
  slab: Allocate frozen pages
  slub: Allocate frozen pages

 mm/compaction.c |  1 +
 mm/internal.h   | 15 ++++++++++--
 mm/mempolicy.c  | 61 ++++++++++++++++++++++++++++++-------------------
 mm/page_alloc.c | 53 +++++++++++++++++++++++++-----------------
 mm/slab.c       | 23 +++++++++----------
 mm/slub.c       | 26 ++++++++++-----------
 mm/swap.c       |  2 +-
 7 files changed, 109 insertions(+), 72 deletions(-)

-- 
2.35.1




* [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-10  1:56   ` Miaohe Lin
                     ` (2 more replies)
  2022-08-09 17:18 ` [PATCH v2 02/16] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  16 siblings, 3 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Save 17 bytes of text by calculating page_zone() once instead of twice.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e5486d47406e..2745865a57c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3483,16 +3483,16 @@ void free_unref_page(struct page *page, unsigned int order)
 	 * areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
+	zone = page_zone(page);
 	migratetype = get_pcppage_migratetype(page);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
+			free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
 	}
 
-	zone = page_zone(page);
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock_irqsave(zone->per_cpu_pageset, flags);
 	if (pcp) {
-- 
2.35.1




* [PATCH v2 02/16] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-10  6:36   ` Muchun Song
  2022-08-09 17:18 ` [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	David Hildenbrand, Miaohe Lin, William Kucharski

In preparation for making this function available outside page_alloc,
rename it to free_frozen_pages(), which fits better with the other
memory allocation/free functions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/page_alloc.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2745865a57c5..04260b5a7699 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -761,7 +761,7 @@ static inline bool pcp_allowed_order(unsigned int order)
 	return false;
 }
 
-static inline void free_the_page(struct page *page, unsigned int order)
+static inline void free_frozen_pages(struct page *page, unsigned int order)
 {
 	if (pcp_allowed_order(order))		/* Via pcp? */
 		free_unref_page(page, order);
@@ -787,7 +787,7 @@ static inline void free_the_page(struct page *page, unsigned int order)
 void free_compound_page(struct page *page)
 {
 	mem_cgroup_uncharge(page_folio(page));
-	free_the_page(page, compound_order(page));
+	free_frozen_pages(page, compound_order(page));
 }
 
 static void prep_compound_head(struct page *page, unsigned int order)
@@ -5597,10 +5597,10 @@ EXPORT_SYMBOL(get_zeroed_page);
 void __free_pages(struct page *page, unsigned int order)
 {
 	if (put_page_testzero(page))
-		free_the_page(page, order);
+		free_frozen_pages(page, order);
 	else if (!PageHead(page))
 		while (order-- > 0)
-			free_the_page(page + (1 << order), order);
+			free_frozen_pages(page + (1 << order), order);
 }
 EXPORT_SYMBOL(__free_pages);
 
@@ -5651,7 +5651,7 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
 	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
 
 	if (page_ref_sub_and_test(page, count))
-		free_the_page(page, compound_order(page));
+		free_frozen_pages(page, compound_order(page));
 }
 EXPORT_SYMBOL(__page_frag_cache_drain);
 
@@ -5692,7 +5692,7 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 			goto refill;
 
 		if (unlikely(nc->pfmemalloc)) {
-			free_the_page(page, compound_order(page));
+			free_frozen_pages(page, compound_order(page));
 			goto refill;
 		}
 
@@ -5724,7 +5724,7 @@ void page_frag_free(void *addr)
 	struct page *page = virt_to_head_page(addr);
 
 	if (unlikely(put_page_testzero(page)))
-		free_the_page(page, compound_order(page));
+		free_frozen_pages(page, compound_order(page));
 }
 EXPORT_SYMBOL(page_frag_free);
 
-- 
2.35.1




* [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 02/16] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-10  3:00   ` Miaohe Lin
  2022-08-10  6:37   ` Muchun Song
  2022-08-09 17:18 ` [PATCH v2 04/16] mm/page_alloc: Move set_page_refcounted() to callers of post_alloc_hook() Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  16 siblings, 2 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), David Hildenbrand, William Kucharski

This API makes more sense for slab to use and it works perfectly
well for swap.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/internal.h   |  4 ++--
 mm/page_alloc.c | 18 +++++++++---------
 mm/swap.c       |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 785409805ed7..08d0881223cf 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -362,8 +362,8 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
-extern void free_unref_page(struct page *page, unsigned int order);
-extern void free_unref_page_list(struct list_head *list);
+void free_frozen_pages(struct page *, unsigned int order);
+void free_unref_page_list(struct list_head *list);
 
 extern void zone_pcp_update(struct zone *zone, int cpu_online);
 extern void zone_pcp_reset(struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 04260b5a7699..30e7a5974d39 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -761,14 +761,6 @@ static inline bool pcp_allowed_order(unsigned int order)
 	return false;
 }
 
-static inline void free_frozen_pages(struct page *page, unsigned int order)
-{
-	if (pcp_allowed_order(order))		/* Via pcp? */
-		free_unref_page(page, order);
-	else
-		__free_pages_ok(page, order, FPI_NONE);
-}
-
 /*
  * Higher-order pages are called "compound pages".  They are structured thusly:
  *
@@ -3464,7 +3456,7 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 /*
  * Free a pcp page
  */
-void free_unref_page(struct page *page, unsigned int order)
+static void free_unref_page(struct page *page, unsigned int order)
 {
 	unsigned long flags;
 	unsigned long __maybe_unused UP_flags;
@@ -3504,6 +3496,14 @@ void free_unref_page(struct page *page, unsigned int order)
 	pcp_trylock_finish(UP_flags);
 }
 
+void free_frozen_pages(struct page *page, unsigned int order)
+{
+	if (pcp_allowed_order(order))		/* Via pcp? */
+		free_unref_page(page, order);
+	else
+		__free_pages_ok(page, order, FPI_NONE);
+}
+
 /*
  * Free a list of 0-order pages
  */
diff --git a/mm/swap.c b/mm/swap.c
index 6525011b715e..647f6f77193f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -102,7 +102,7 @@ static void __folio_put_small(struct folio *folio)
 {
 	__page_cache_release(folio);
 	mem_cgroup_uncharge(folio);
-	free_unref_page(&folio->page, 0);
+	free_frozen_pages(&folio->page, 0);
 }
 
 static void __folio_put_large(struct folio *folio)
-- 
2.35.1




* [PATCH v2 04/16] mm/page_alloc: Move set_page_refcounted() to callers of post_alloc_hook()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-10  3:30   ` Miaohe Lin
  2022-08-09 17:18 ` [PATCH v2 05/16] mm/page_alloc: Move set_page_refcounted() to callers of prep_new_page() Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising
the page refcount in post_alloc_hook().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/compaction.c | 1 +
 mm/page_alloc.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 640fa76228dd..63dc6abdb573 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -97,6 +97,7 @@ static void split_map_pages(struct list_head *list)
 		nr_pages = 1 << order;
 
 		post_alloc_hook(page, order, __GFP_MOVABLE);
+		set_page_refcounted(page);
 		if (order)
 			split_page(page, order);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 30e7a5974d39..d41b8c8f3135 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2465,7 +2465,6 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	int i;
 
 	set_page_private(page, 0);
-	set_page_refcounted(page);
 
 	arch_alloc_page(page, order);
 	debug_pagealloc_map_pages(page, 1 << order);
@@ -2536,6 +2535,7 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
 		set_page_pfmemalloc(page);
 	else
 		clear_page_pfmemalloc(page);
+	set_page_refcounted(page);
 }
 
 /*
-- 
2.35.1




* [PATCH v2 05/16] mm/page_alloc: Move set_page_refcounted() to callers of prep_new_page()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 04/16] mm/page_alloc: Move set_page_refcounted() to callers of post_alloc_hook() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 06/16] mm/page_alloc: Move set_page_refcounted() to callers of get_page_from_freelist() Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in prep_new_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d41b8c8f3135..9bc53001f56c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2535,7 +2535,6 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
 		set_page_pfmemalloc(page);
 	else
 		clear_page_pfmemalloc(page);
-	set_page_refcounted(page);
 }
 
 /*
@@ -4281,6 +4280,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				gfp_mask, alloc_flags, ac->migratetype);
 		if (page) {
 			prep_new_page(page, order, gfp_mask, alloc_flags);
+			set_page_refcounted(page);
 
 			/*
 			 * If this is a high-order atomic allocation then check
@@ -4504,8 +4504,10 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	count_vm_event(COMPACTSTALL);
 
 	/* Prep a captured page if available */
-	if (page)
+	if (page) {
 		prep_new_page(page, order, gfp_mask, alloc_flags);
+		set_page_refcounted(page);
+	}
 
 	/* Try get a page from the freelist if available */
 	if (!page)
@@ -5440,6 +5442,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		nr_account++;
 
 		prep_new_page(page, 0, gfp, 0);
+		set_page_refcounted(page);
 		if (page_list)
 			list_add(&page->lru, page_list);
 		else
-- 
2.35.1




* [PATCH v2 06/16] mm/page_alloc: Move set_page_refcounted() to callers of get_page_from_freelist()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 05/16] mm/page_alloc: Move set_page_refcounted() to callers of prep_new_page() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 07/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_cpuset_fallback() Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in get_page_from_freelist().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9bc53001f56c..8c9102ab7a87 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4280,7 +4280,6 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				gfp_mask, alloc_flags, ac->migratetype);
 		if (page) {
 			prep_new_page(page, order, gfp_mask, alloc_flags);
-			set_page_refcounted(page);
 
 			/*
 			 * If this is a high-order atomic allocation then check
@@ -4374,6 +4373,8 @@ __alloc_pages_cpuset_fallback(gfp_t gfp_mask, unsigned int order,
 		page = get_page_from_freelist(gfp_mask, order,
 				alloc_flags, ac);
 
+	if (page)
+		set_page_refcounted(page);
 	return page;
 }
 
@@ -4412,8 +4413,10 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL) &
 				      ~__GFP_DIRECT_RECLAIM, order,
 				      ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
-	if (page)
+	if (page) {
+		set_page_refcounted(page);
 		goto out;
+	}
 
 	/* Coredumps can quickly deplete all memory reserves */
 	if (current->flags & PF_DUMPCORE)
@@ -4504,10 +4507,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	count_vm_event(COMPACTSTALL);
 
 	/* Prep a captured page if available */
-	if (page) {
+	if (page)
 		prep_new_page(page, order, gfp_mask, alloc_flags);
-		set_page_refcounted(page);
-	}
 
 	/* Try get a page from the freelist if available */
 	if (!page)
@@ -4516,6 +4517,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	if (page) {
 		struct zone *zone = page_zone(page);
 
+		set_page_refcounted(page);
 		zone->compact_blockskip_flush = false;
 		compaction_defer_reset(zone, order, true);
 		count_vm_event(COMPACTSUCCESS);
@@ -4765,6 +4767,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 		drained = true;
 		goto retry;
 	}
+	set_page_refcounted(page);
 out:
 	psi_memstall_leave(&pflags);
 
@@ -5058,8 +5061,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * that first
 	 */
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
-	if (page)
+	if (page) {
+		set_page_refcounted(page);
 		goto got_pg;
+	}
 
 	/*
 	 * For costly allocations, try direct compaction first, as it's likely
@@ -5138,8 +5143,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Attempt with potentially adjusted zonelist and alloc_flags */
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
-	if (page)
+	if (page) {
+		set_page_refcounted(page);
 		goto got_pg;
+	}
 
 	/* Caller is not willing to reclaim, we can't balance anything */
 	if (!can_direct_reclaim)
@@ -5516,8 +5523,10 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 
 	/* First allocation attempt */
 	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
-	if (likely(page))
+	if (likely(page)) {
+		set_page_refcounted(page);
 		goto out;
+	}
 
 	alloc_gfp = gfp;
 	ac.spread_dirty_pages = false;
-- 
2.35.1




* [PATCH v2 07/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_cpuset_fallback()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 06/16] mm/page_alloc: Move set_page_refcounted() to callers of get_page_from_freelist() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 08/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_may_oom() Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in __alloc_pages_cpuset_fallback().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8c9102ab7a87..0287b3be92e5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4373,8 +4373,6 @@ __alloc_pages_cpuset_fallback(gfp_t gfp_mask, unsigned int order,
 		page = get_page_from_freelist(gfp_mask, order,
 				alloc_flags, ac);
 
-	if (page)
-		set_page_refcounted(page);
 	return page;
 }
 
@@ -4461,6 +4459,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		if (gfp_mask & __GFP_NOFAIL)
 			page = __alloc_pages_cpuset_fallback(gfp_mask, order,
 					ALLOC_NO_WATERMARKS, ac);
+		if (page)
+			set_page_refcounted(page);
 	}
 out:
 	mutex_unlock(&oom_lock);
@@ -5256,8 +5256,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		 * the situation worse
 		 */
 		page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
-		if (page)
+		if (page) {
+			set_page_refcounted(page);
 			goto got_pg;
+		}
 
 		cond_resched();
 		goto retry;
-- 
2.35.1




* [PATCH v2 08/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_may_oom()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 07/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_cpuset_fallback() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 09/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_compact() Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in __alloc_pages_may_oom().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0287b3be92e5..8222cc6ce7dd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4411,10 +4411,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL) &
 				      ~__GFP_DIRECT_RECLAIM, order,
 				      ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
-	if (page) {
-		set_page_refcounted(page);
+	if (page)
 		goto out;
-	}
 
 	/* Coredumps can quickly deplete all memory reserves */
 	if (current->flags & PF_DUMPCORE)
@@ -4459,8 +4457,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		if (gfp_mask & __GFP_NOFAIL)
 			page = __alloc_pages_cpuset_fallback(gfp_mask, order,
 					ALLOC_NO_WATERMARKS, ac);
-		if (page)
-			set_page_refcounted(page);
 	}
 out:
 	mutex_unlock(&oom_lock);
@@ -5202,8 +5198,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Reclaim has failed us, start killing things */
 	page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
-	if (page)
+	if (page) {
+		set_page_refcounted(page);
 		goto got_pg;
+	}
 
 	/* Avoid allocations with no watermarks from looping endlessly */
 	if (tsk_is_oom_victim(current) &&
-- 
2.35.1




* [PATCH v2 09/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_compact()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 08/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_may_oom() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 10/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_reclaim() Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in __alloc_pages_direct_compact().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8222cc6ce7dd..555409f04d49 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4513,7 +4513,6 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	if (page) {
 		struct zone *zone = page_zone(page);
 
-		set_page_refcounted(page);
 		zone->compact_blockskip_flush = false;
 		compaction_defer_reset(zone, order, true);
 		count_vm_event(COMPACTSUCCESS);
@@ -5079,8 +5078,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 						alloc_flags, ac,
 						INIT_COMPACT_PRIORITY,
 						&compact_result);
-		if (page)
+		if (page) {
+			set_page_refcounted(page);
 			goto got_pg;
+		}
 
 		/*
 		 * Checks for costly allocations with __GFP_NORETRY, which
@@ -5161,8 +5162,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
 					compact_priority, &compact_result);
-	if (page)
+	if (page) {
+		set_page_refcounted(page);
 		goto got_pg;
+	}
 
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
-- 
2.35.1




* [PATCH v2 10/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_reclaim()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 09/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_compact() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 11/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_slowpath() Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in __alloc_pages_direct_reclaim().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 555409f04d49..7c306231b336 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4762,7 +4762,6 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 		drained = true;
 		goto retry;
 	}
-	set_page_refcounted(page);
 out:
 	psi_memstall_leave(&pflags);
 
@@ -5156,8 +5155,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
 							&did_some_progress);
-	if (page)
+	if (page) {
+		set_page_refcounted(page);
 		goto got_pg;
+	}
 
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
-- 
2.35.1




* [PATCH v2 11/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_slowpath()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 10/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_reclaim() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 12/16] mm/page_alloc: Move set_page_refcounted() to end of __alloc_pages() Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for allocating frozen pages, stop initialising the page
refcount in __alloc_pages_slowpath().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 30 +++++++++---------------------
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7c306231b336..26f8ed480ebb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5055,10 +5055,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * that first
 	 */
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
-	if (page) {
-		set_page_refcounted(page);
+	if (page)
 		goto got_pg;
-	}
 
 	/*
 	 * For costly allocations, try direct compaction first, as it's likely
@@ -5077,10 +5075,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 						alloc_flags, ac,
 						INIT_COMPACT_PRIORITY,
 						&compact_result);
-		if (page) {
-			set_page_refcounted(page);
+		if (page)
 			goto got_pg;
-		}
 
 		/*
 		 * Checks for costly allocations with __GFP_NORETRY, which
@@ -5139,10 +5135,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Attempt with potentially adjusted zonelist and alloc_flags */
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
-	if (page) {
-		set_page_refcounted(page);
+	if (page)
 		goto got_pg;
-	}
 
 	/* Caller is not willing to reclaim, we can't balance anything */
 	if (!can_direct_reclaim)
@@ -5155,18 +5149,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
 							&did_some_progress);
-	if (page) {
-		set_page_refcounted(page);
+	if (page)
 		goto got_pg;
-	}
 
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
 					compact_priority, &compact_result);
-	if (page) {
-		set_page_refcounted(page);
+	if (page)
 		goto got_pg;
-	}
 
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
@@ -5202,10 +5192,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Reclaim has failed us, start killing things */
 	page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
-	if (page) {
-		set_page_refcounted(page);
+	if (page)
 		goto got_pg;
-	}
 
 	/* Avoid allocations with no watermarks from looping endlessly */
 	if (tsk_is_oom_victim(current) &&
@@ -5258,10 +5246,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		 * the situation worse
 		 */
 		page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
-		if (page) {
-			set_page_refcounted(page);
+		if (page)
 			goto got_pg;
-		}
 
 		cond_resched();
 		goto retry;
@@ -5542,6 +5528,8 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 	ac.nodemask = nodemask;
 
 	page = __alloc_pages_slowpath(alloc_gfp, order, &ac);
+	if (page)
+		set_page_refcounted(page);
 
 out:
 	if (memcg_kmem_enabled() && (gfp & __GFP_ACCOUNT) && page &&
-- 
2.35.1




* [PATCH v2 12/16] mm/page_alloc: Move set_page_refcounted() to end of __alloc_pages()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 11/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_slowpath() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 13/16] mm/page_alloc: Add __alloc_frozen_pages() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Remove some code duplication by calling set_page_refcounted() at the
end of __alloc_pages() instead of after each call that can allocate
a page.  That means that we free a frozen page if we've exceeded the
allowed memcg memory.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 26f8ed480ebb..f1b7fc657c74 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5513,10 +5513,8 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 
 	/* First allocation attempt */
 	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
-	if (likely(page)) {
-		set_page_refcounted(page);
+	if (likely(page))
 		goto out;
-	}
 
 	alloc_gfp = gfp;
 	ac.spread_dirty_pages = false;
@@ -5528,15 +5526,15 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 	ac.nodemask = nodemask;
 
 	page = __alloc_pages_slowpath(alloc_gfp, order, &ac);
-	if (page)
-		set_page_refcounted(page);
 
 out:
 	if (memcg_kmem_enabled() && (gfp & __GFP_ACCOUNT) && page &&
 	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
-		__free_pages(page, order);
+		free_frozen_pages(page, order);
 		page = NULL;
 	}
+	if (page)
+		set_page_refcounted(page);
 
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 
-- 
2.35.1




* [PATCH v2 13/16] mm/page_alloc: Add __alloc_frozen_pages()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 12/16] mm/page_alloc: Move set_page_refcounted() to end of __alloc_pages() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 14/16] mm/mempolicy: Add alloc_frozen_pages() Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Defer the initialisation of the page refcount to the new __alloc_pages()
wrapper and turn the old __alloc_pages() into __alloc_frozen_pages().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/internal.h   |  2 ++
 mm/page_alloc.c | 17 +++++++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 08d0881223cf..7e6079216a17 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -362,6 +362,8 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
+struct page *__alloc_frozen_pages(gfp_t, unsigned int order, int nid,
+		nodemask_t *);
 void free_frozen_pages(struct page *, unsigned int order);
 void free_unref_page_list(struct list_head *list);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f1b7fc657c74..359a92113152 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5476,8 +5476,8 @@ EXPORT_SYMBOL_GPL(__alloc_pages_bulk);
 /*
  * This is the 'heart' of the zoned buddy allocator.
  */
-struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
-							nodemask_t *nodemask)
+struct page *__alloc_frozen_pages(gfp_t gfp, unsigned int order,
+		int preferred_nid, nodemask_t *nodemask)
 {
 	struct page *page;
 	unsigned int alloc_flags = ALLOC_WMARK_LOW;
@@ -5533,13 +5533,22 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 		free_frozen_pages(page, order);
 		page = NULL;
 	}
-	if (page)
-		set_page_refcounted(page);
 
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 
 	return page;
 }
+
+struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
+							nodemask_t *nodemask)
+{
+	struct page *page;
+
+	page = __alloc_frozen_pages(gfp, order, preferred_nid, nodemask);
+	if (page)
+		set_page_refcounted(page);
+	return page;
+}
 EXPORT_SYMBOL(__alloc_pages);
 
 struct folio *__folio_alloc(gfp_t gfp, unsigned int order, int preferred_nid,
-- 
2.35.1




* [PATCH v2 14/16] mm/mempolicy: Add alloc_frozen_pages()
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 13/16] mm/page_alloc: Add __alloc_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-09 17:18 ` [PATCH v2 15/16] slab: Allocate frozen pages Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), William Kucharski

Provide an interface to allocate pages from the page allocator without
incrementing their refcount.  This saves an atomic operation on free,
which may be beneficial to some users (eg slab).
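
To make the saving concrete: after patch 2, __free_pages() reads

	void __free_pages(struct page *page, unsigned int order)
	{
		if (put_page_testzero(page))	/* atomic dec-and-test */
			free_frozen_pages(page, order);
		else if (!PageHead(page))
			while (order-- > 0)
				free_frozen_pages(page + (1 << order), order);
	}

so a caller which knows its pages are frozen (refcount already zero)
can call free_frozen_pages() directly and skip the put_page_testzero().
(Snippet reproduced above purely as illustration; it is not part of
this patch.)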

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/internal.h  |  9 ++++++++
 mm/mempolicy.c | 61 +++++++++++++++++++++++++++++++-------------------
 2 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 7e6079216a17..6f02bc32b406 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -367,6 +367,15 @@ struct page *__alloc_frozen_pages(gfp_t, unsigned int order, int nid,
 void free_frozen_pages(struct page *, unsigned int order);
 void free_unref_page_list(struct list_head *list);
 
+#ifdef CONFIG_NUMA
+struct page *alloc_frozen_pages(gfp_t, unsigned int order);
+#else
+static inline struct page *alloc_frozen_pages(gfp_t gfp, unsigned int order)
+{
+	return __alloc_frozen_pages(gfp, order, numa_node_id(), NULL);
+}
+#endif
+
 extern void zone_pcp_update(struct zone *zone, int cpu_online);
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b73d3248d976..09ecc499d5fc 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2100,7 +2100,7 @@ static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
 {
 	struct page *page;
 
-	page = __alloc_pages(gfp, order, nid, NULL);
+	page = __alloc_frozen_pages(gfp, order, nid, NULL);
 	/* skip NUMA_INTERLEAVE_HIT counter update if numa stats is disabled */
 	if (!static_branch_likely(&vm_numa_stat_key))
 		return page;
@@ -2126,9 +2126,9 @@ static struct page *alloc_pages_preferred_many(gfp_t gfp, unsigned int order,
 	 */
 	preferred_gfp = gfp | __GFP_NOWARN;
 	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
-	page = __alloc_pages(preferred_gfp, order, nid, &pol->nodes);
+	page = __alloc_frozen_pages(preferred_gfp, order, nid, &pol->nodes);
 	if (!page)
-		page = __alloc_pages(gfp, order, nid, NULL);
+		page = __alloc_frozen_pages(gfp, order, nid, NULL);
 
 	return page;
 }
@@ -2167,8 +2167,11 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
 		mpol_cond_put(pol);
 		gfp |= __GFP_COMP;
 		page = alloc_page_interleave(gfp, order, nid);
-		if (page && order > 1)
-			prep_transhuge_page(page);
+		if (page) {
+			set_page_refcounted(page);
+			if (order > 1)
+				prep_transhuge_page(page);
+		}
 		folio = (struct folio *)page;
 		goto out;
 	}
@@ -2180,8 +2183,11 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
 		gfp |= __GFP_COMP;
 		page = alloc_pages_preferred_many(gfp, order, node, pol);
 		mpol_cond_put(pol);
-		if (page && order > 1)
-			prep_transhuge_page(page);
+		if (page) {
+			set_page_refcounted(page);
+			if (order > 1)
+				prep_transhuge_page(page);
+		}
 		folio = (struct folio *)page;
 		goto out;
 	}
@@ -2235,21 +2241,7 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL(vma_alloc_folio);
 
-/**
- * alloc_pages - Allocate pages.
- * @gfp: GFP flags.
- * @order: Power of two of number of pages to allocate.
- *
- * Allocate 1 << @order contiguous pages.  The physical address of the
- * first page is naturally aligned (eg an order-3 allocation will be aligned
- * to a multiple of 8 * PAGE_SIZE bytes).  The NUMA policy of the current
- * process is honoured when in process context.
- *
- * Context: Can be called from any context, providing the appropriate GFP
- * flags are used.
- * Return: The page on success or NULL if allocation fails.
- */
-struct page *alloc_pages(gfp_t gfp, unsigned order)
+struct page *alloc_frozen_pages(gfp_t gfp, unsigned order)
 {
 	struct mempolicy *pol = &default_policy;
 	struct page *page;
@@ -2267,12 +2259,35 @@ struct page *alloc_pages(gfp_t gfp, unsigned order)
 		page = alloc_pages_preferred_many(gfp, order,
 				  policy_node(gfp, pol, numa_node_id()), pol);
 	else
-		page = __alloc_pages(gfp, order,
+		page = __alloc_frozen_pages(gfp, order,
 				policy_node(gfp, pol, numa_node_id()),
 				policy_nodemask(gfp, pol));
 
 	return page;
 }
+
+/**
+ * alloc_pages - Allocate pages.
+ * @gfp: GFP flags.
+ * @order: Power of two of number of pages to allocate.
+ *
+ * Allocate 1 << @order contiguous pages.  The physical address of the
+ * first page is naturally aligned (eg an order-3 allocation will be aligned
+ * to a multiple of 8 * PAGE_SIZE bytes).  The NUMA policy of the current
+ * process is honoured when in process context.
+ *
+ * Context: Can be called from any context, providing the appropriate GFP
+ * flags are used.
+ * Return: The page on success or NULL if allocation fails.
+ */
+struct page *alloc_pages(gfp_t gfp, unsigned order)
+{
+	struct page *page = alloc_frozen_pages(gfp, order);
+
+	if (page)
+		set_page_refcounted(page);
+	return page;
+}
 EXPORT_SYMBOL(alloc_pages);
 
 struct folio *folio_alloc(gfp_t gfp, unsigned order)
-- 
2.35.1




* [PATCH v2 15/16] slab: Allocate frozen pages
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 14/16] mm/mempolicy: Add alloc_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-10 12:31   ` Vlastimil Babka
  2022-08-09 17:18 ` [PATCH v2 16/16] slub: " Matthew Wilcox (Oracle)
  2022-08-11  0:19 ` [PATCH v2 00/16] Allocate and free " Shakeel Butt
  16 siblings, 1 reply; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), William Kucharski

Since slab does not use the page refcount, it can allocate and
free frozen pages, saving one atomic operation per free.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/slab.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 10e96137b44f..e7603d23c6c9 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1355,23 +1355,23 @@ slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nodeid)
 static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 								int nodeid)
 {
-	struct folio *folio;
+	struct page *page;
 	struct slab *slab;
 
 	flags |= cachep->allocflags;
 
-	folio = (struct folio *) __alloc_pages_node(nodeid, flags, cachep->gfporder);
-	if (!folio) {
+	page = __alloc_frozen_pages(flags, cachep->gfporder, nodeid, NULL);
+	if (!page) {
 		slab_out_of_memory(cachep, flags, nodeid);
 		return NULL;
 	}
 
-	slab = folio_slab(folio);
+	__SetPageSlab(page);
+	slab = (struct slab *)page;
 
 	account_slab(slab, cachep->gfporder, cachep, flags);
-	__folio_set_slab(folio);
 	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
-	if (sk_memalloc_socks() && page_is_pfmemalloc(folio_page(folio, 0)))
+	if (sk_memalloc_socks() && page_is_pfmemalloc(page))
 		slab_set_pfmemalloc(slab);
 
 	return slab;
@@ -1383,18 +1383,17 @@ static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 static void kmem_freepages(struct kmem_cache *cachep, struct slab *slab)
 {
 	int order = cachep->gfporder;
-	struct folio *folio = slab_folio(slab);
+	struct page *page = (struct page *)slab;
 
-	BUG_ON(!folio_test_slab(folio));
 	__slab_clear_pfmemalloc(slab);
-	__folio_clear_slab(folio);
-	page_mapcount_reset(folio_page(folio, 0));
-	folio->mapping = NULL;
+	__ClearPageSlab(page);
+	page_mapcount_reset(page);
+	page->mapping = NULL;
 
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += 1 << order;
 	unaccount_slab(slab, order, cachep);
-	__free_pages(folio_page(folio, 0), order);
+	free_frozen_pages(page, order);
 }
 
 static void kmem_rcu_free(struct rcu_head *head)
-- 
2.35.1




* [PATCH v2 16/16] slub: Allocate frozen pages
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 15/16] slab: Allocate frozen pages Matthew Wilcox (Oracle)
@ 2022-08-09 17:18 ` Matthew Wilcox (Oracle)
  2022-08-11  0:19 ` [PATCH v2 00/16] Allocate and free " Shakeel Butt
  16 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-08-09 17:18 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle), William Kucharski

Since slub does not use the page refcount, it can allocate and
free frozen pages, saving one atomic operation per free.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
---
 mm/slub.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 862dbd9af4f5..65d14d7aa7a9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1816,21 +1816,21 @@ static void *setup_object(struct kmem_cache *s, void *object)
 static inline struct slab *alloc_slab_page(gfp_t flags, int node,
 		struct kmem_cache_order_objects oo)
 {
-	struct folio *folio;
+	struct page *page;
 	struct slab *slab;
 	unsigned int order = oo_order(oo);
 
 	if (node == NUMA_NO_NODE)
-		folio = (struct folio *)alloc_pages(flags, order);
+		page = alloc_frozen_pages(flags, order);
 	else
-		folio = (struct folio *)__alloc_pages_node(node, flags, order);
+		page = __alloc_frozen_pages(flags, order, node, NULL);
 
-	if (!folio)
+	if (!page)
 		return NULL;
 
-	slab = folio_slab(folio);
-	__folio_set_slab(folio);
-	if (page_is_pfmemalloc(folio_page(folio, 0)))
+	slab = (struct slab *)page;
+	__SetPageSlab(page);
+	if (page_is_pfmemalloc(page))
 		slab_set_pfmemalloc(slab);
 
 	return slab;
@@ -2032,8 +2032,8 @@ static struct slab *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 
 static void __free_slab(struct kmem_cache *s, struct slab *slab)
 {
-	struct folio *folio = slab_folio(slab);
-	int order = folio_order(folio);
+	struct page *page = (struct page *)slab;
+	int order = compound_order(page);
 	int pages = 1 << order;
 
 	if (kmem_cache_debug_flags(s, SLAB_CONSISTENCY_CHECKS)) {
@@ -2045,12 +2045,12 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
 	}
 
 	__slab_clear_pfmemalloc(slab);
-	__folio_clear_slab(folio);
-	folio->mapping = NULL;
+	__ClearPageSlab(page);
+	page->mapping = NULL;
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += pages;
 	unaccount_slab(slab, order, s);
-	__free_pages(folio_page(folio, 0), order);
+	free_frozen_pages(page, order);
 }
 
 static void rcu_free_slab(struct rcu_head *h)
@@ -3568,7 +3568,7 @@ static inline void free_large_kmalloc(struct folio *folio, void *object)
 		pr_warn_once("object pointer: 0x%p\n", object);
 
 	kfree_hook(object);
-	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
+	lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
 			      -(PAGE_SIZE << order));
 	__free_pages(folio_page(folio, 0), order);
 }
-- 
2.35.1




* Re: [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page()
  2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
@ 2022-08-10  1:56   ` Miaohe Lin
  2022-08-10  6:31   ` Muchun Song
  2022-08-10 15:00   ` Mel Gorman
  2 siblings, 0 replies; 29+ messages in thread
From: Miaohe Lin @ 2022-08-10  1:56 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 2022/8/10 1:18, Matthew Wilcox (Oracle) wrote:
> Save 17 bytes of text by calculating page_zone() once instead of twice.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.

> ---
>  mm/page_alloc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e5486d47406e..2745865a57c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3483,16 +3483,16 @@ void free_unref_page(struct page *page, unsigned int order)
>  	 * areas back if necessary. Otherwise, we may have to free
>  	 * excessively into the page allocator
>  	 */
> +	zone = page_zone(page);
>  	migratetype = get_pcppage_migratetype(page);
>  	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
>  		if (unlikely(is_migrate_isolate(migratetype))) {
> -			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
> +			free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
>  			return;
>  		}
>  		migratetype = MIGRATE_MOVABLE;
>  	}
>  
> -	zone = page_zone(page);
>  	pcp_trylock_prepare(UP_flags);
>  	pcp = pcp_spin_trylock_irqsave(zone->per_cpu_pageset, flags);
>  	if (pcp) {
> 




* Re: [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  2022-08-09 17:18 ` [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
@ 2022-08-10  3:00   ` Miaohe Lin
  2022-08-10  6:37   ` Muchun Song
  1 sibling, 0 replies; 29+ messages in thread
From: Miaohe Lin @ 2022-08-10  3:00 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm; +Cc: David Hildenbrand, William Kucharski

On 2022/8/10 1:18, Matthew Wilcox (Oracle) wrote:
> This API makes more sense for slab to use and it works perfectly
> well for swap.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Looks good to me. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  mm/internal.h   |  4 ++--
>  mm/page_alloc.c | 18 +++++++++---------
>  mm/swap.c       |  2 +-
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 785409805ed7..08d0881223cf 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -362,8 +362,8 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
>  					gfp_t gfp_flags);
>  extern int user_min_free_kbytes;
>  
> -extern void free_unref_page(struct page *page, unsigned int order);
> -extern void free_unref_page_list(struct list_head *list);
> +void free_frozen_pages(struct page *, unsigned int order);
> +void free_unref_page_list(struct list_head *list);
>  
>  extern void zone_pcp_update(struct zone *zone, int cpu_online);
>  extern void zone_pcp_reset(struct zone *zone);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 04260b5a7699..30e7a5974d39 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -761,14 +761,6 @@ static inline bool pcp_allowed_order(unsigned int order)
>  	return false;
>  }
>  
> -static inline void free_frozen_pages(struct page *page, unsigned int order)
> -{
> -	if (pcp_allowed_order(order))		/* Via pcp? */
> -		free_unref_page(page, order);
> -	else
> -		__free_pages_ok(page, order, FPI_NONE);
> -}
> -
>  /*
>   * Higher-order pages are called "compound pages".  They are structured thusly:
>   *
> @@ -3464,7 +3456,7 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
>  /*
>   * Free a pcp page
>   */
> -void free_unref_page(struct page *page, unsigned int order)
> +static void free_unref_page(struct page *page, unsigned int order)
>  {
>  	unsigned long flags;
>  	unsigned long __maybe_unused UP_flags;
> @@ -3504,6 +3496,14 @@ void free_unref_page(struct page *page, unsigned int order)
>  	pcp_trylock_finish(UP_flags);
>  }
>  
> +void free_frozen_pages(struct page *page, unsigned int order)
> +{
> +	if (pcp_allowed_order(order))		/* Via pcp? */
> +		free_unref_page(page, order);
> +	else
> +		__free_pages_ok(page, order, FPI_NONE);
> +}
> +
>  /*
>   * Free a list of 0-order pages
>   */
> diff --git a/mm/swap.c b/mm/swap.c
> index 6525011b715e..647f6f77193f 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -102,7 +102,7 @@ static void __folio_put_small(struct folio *folio)
>  {
>  	__page_cache_release(folio);
>  	mem_cgroup_uncharge(folio);
> -	free_unref_page(&folio->page, 0);
> +	free_frozen_pages(&folio->page, 0);
>  }
>  
>  static void __folio_put_large(struct folio *folio)
> 




* Re: [PATCH v2 04/16] mm/page_alloc: Move set_page_refcounted() to callers of post_alloc_hook()
  2022-08-09 17:18 ` [PATCH v2 04/16] mm/page_alloc: Move set_page_refcounted() to callers of post_alloc_hook() Matthew Wilcox (Oracle)
@ 2022-08-10  3:30   ` Miaohe Lin
  0 siblings, 0 replies; 29+ messages in thread
From: Miaohe Lin @ 2022-08-10  3:30 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 2022/8/10 1:18, Matthew Wilcox (Oracle) wrote:
> In preparation for allocating frozen pages, stop initialising
> the page refcount in post_alloc_hook().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good to me. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  mm/compaction.c | 1 +
>  mm/page_alloc.c | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 640fa76228dd..63dc6abdb573 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -97,6 +97,7 @@ static void split_map_pages(struct list_head *list)
>  		nr_pages = 1 << order;
>  
>  		post_alloc_hook(page, order, __GFP_MOVABLE);
> +		set_page_refcounted(page);
>  		if (order)
>  			split_page(page, order);
>  
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 30e7a5974d39..d41b8c8f3135 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2465,7 +2465,6 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>  	int i;
>  
>  	set_page_private(page, 0);
> -	set_page_refcounted(page);
>  
>  	arch_alloc_page(page, order);
>  	debug_pagealloc_map_pages(page, 1 << order);
> @@ -2536,6 +2535,7 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
>  		set_page_pfmemalloc(page);
>  	else
>  		clear_page_pfmemalloc(page);
> +	set_page_refcounted(page);
>  }
>  
>  /*
> 



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page()
  2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
  2022-08-10  1:56   ` Miaohe Lin
@ 2022-08-10  6:31   ` Muchun Song
  2022-08-10 15:00   ` Mel Gorman
  2 siblings, 0 replies; 29+ messages in thread
From: Muchun Song @ 2022-08-10  6:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm



> On Aug 10, 2022, at 01:18, Matthew Wilcox (Oracle) <willy@infradead.org> wrote:
> 
> Save 17 bytes of text by calculating page_zone() once instead of twice.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 02/16] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-08-09 17:18 ` [PATCH v2 02/16] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-08-10  6:36   ` Muchun Song
  0 siblings, 0 replies; 29+ messages in thread
From: Muchun Song @ 2022-08-10  6:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-mm, David Hildenbrand, Miaohe Lin, William Kucharski



> On Aug 10, 2022, at 01:18, Matthew Wilcox (Oracle) <willy@infradead.org> wrote:
> 
> In preparation for making this function available outside page_alloc,
> rename it to free_frozen_pages(), which fits better with the other
> memory allocation/free functions.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  2022-08-09 17:18 ` [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
  2022-08-10  3:00   ` Miaohe Lin
@ 2022-08-10  6:37   ` Muchun Song
  1 sibling, 0 replies; 29+ messages in thread
From: Muchun Song @ 2022-08-10  6:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm, David Hildenbrand, William Kucharski



> On Aug 10, 2022, at 01:18, Matthew Wilcox (Oracle) <willy@infradead.org> wrote:
> 
> This API makes more sense for slab to use and it works perfectly
> well for swap.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: William Kucharski <william.kucharski@oracle.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 15/16] slab: Allocate frozen pages
  2022-08-09 17:18 ` [PATCH v2 15/16] slab: Allocate frozen pages Matthew Wilcox (Oracle)
@ 2022-08-10 12:31   ` Vlastimil Babka
  2022-08-10 16:27     ` Mel Gorman
  0 siblings, 1 reply; 29+ messages in thread
From: Vlastimil Babka @ 2022-08-10 12:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm
  Cc: William Kucharski, David Hildenbrand, Mel Gorman

On 8/9/22 19:18, Matthew Wilcox (Oracle) wrote:
> Since slab does not use the page refcount, it can allocate and
> free frozen pages, saving one atomic operation per free.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: William Kucharski <william.kucharski@oracle.com>

AFAICS the problem of has_unmovable_pages() is not addressed:
https://lore.kernel.org/all/40d658da-6220-e05e-ba0b-d95c82f6bfb3@redhat.com/

But I don't think it's a sustainable approach to keep extending the checks
there with PageSlab() and then with whatever other users adopt frozen page
allocation in the future. I guess it would be better to just be able to
detect pages on the pcplist without false positives. A new page type?
Maybe the overhead of managing it would be negligible, as we set
page->index anyway for the migratetype?
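
Something like the below is roughly what I have in mind, reusing the
existing page_type machinery (PG_pcplist, the helper names and the bit
value are made up here purely to illustrate; untested):

	/* include/linux/page-flags.h */
	#define PG_pcplist	0x00000800	/* hypothetical: next unused page_type bit */

	PAGE_TYPE_OPS(PcpList, pcplist)

has_unmovable_pages() could then test a single PagePcpList() instead of
growing a PageSlab()-style special case for every frozen-page user.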

> ---
>   mm/slab.c | 23 +++++++++++------------
>   1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 10e96137b44f..e7603d23c6c9 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1355,23 +1355,23 @@ slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nodeid)
>   static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
>   								int nodeid)
>   {
> -	struct folio *folio;
> +	struct page *page;
>   	struct slab *slab;
>   
>   	flags |= cachep->allocflags;
>   
> -	folio = (struct folio *) __alloc_pages_node(nodeid, flags, cachep->gfporder);
> -	if (!folio) {
> +	page = __alloc_frozen_pages(flags, cachep->gfporder, nodeid, NULL);
> +	if (!page) {
>   		slab_out_of_memory(cachep, flags, nodeid);
>   		return NULL;
>   	}
>   
> -	slab = folio_slab(folio);
> +	__SetPageSlab(page);
> +	slab = (struct slab *)page;
>   
>   	account_slab(slab, cachep->gfporder, cachep, flags);
> -	__folio_set_slab(folio);
>   	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
> -	if (sk_memalloc_socks() && page_is_pfmemalloc(folio_page(folio, 0)))
> +	if (sk_memalloc_socks() && page_is_pfmemalloc(page))
>   		slab_set_pfmemalloc(slab);
>   
>   	return slab;
> @@ -1383,18 +1383,17 @@ static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
>   static void kmem_freepages(struct kmem_cache *cachep, struct slab *slab)
>   {
>   	int order = cachep->gfporder;
> -	struct folio *folio = slab_folio(slab);
> +	struct page *page = (struct page *)slab;
>   
> -	BUG_ON(!folio_test_slab(folio));
>   	__slab_clear_pfmemalloc(slab);
> -	__folio_clear_slab(folio);
> -	page_mapcount_reset(folio_page(folio, 0));
> -	folio->mapping = NULL;
> +	__ClearPageSlab(page);
> +	page_mapcount_reset(page);
> +	page->mapping = NULL;
>   
>   	if (current->reclaim_state)
>   		current->reclaim_state->reclaimed_slab += 1 << order;
>   	unaccount_slab(slab, order, cachep);
> -	__free_pages(folio_page(folio, 0), order);
> +	free_frozen_pages(page, order);
>   }
>   
>   static void kmem_rcu_free(struct rcu_head *head)



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page()
  2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
  2022-08-10  1:56   ` Miaohe Lin
  2022-08-10  6:31   ` Muchun Song
@ 2022-08-10 15:00   ` Mel Gorman
  2 siblings, 0 replies; 29+ messages in thread
From: Mel Gorman @ 2022-08-10 15:00 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm

On Tue, Aug 09, 2022 at 06:18:39PM +0100, Matthew Wilcox (Oracle) wrote:
> Save 17 bytes of text by calculating page_zone() once instead of twice.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 15/16] slab: Allocate frozen pages
  2022-08-10 12:31   ` Vlastimil Babka
@ 2022-08-10 16:27     ` Mel Gorman
  0 siblings, 0 replies; 29+ messages in thread
From: Mel Gorman @ 2022-08-10 16:27 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Matthew Wilcox (Oracle), linux-mm, William Kucharski, David Hildenbrand

On Wed, Aug 10, 2022 at 02:31:11PM +0200, Vlastimil Babka wrote:
> On 8/9/22 19:18, Matthew Wilcox (Oracle) wrote:
> > Since slab does not use the page refcount, it can allocate and
> > free frozen pages, saving one atomic operation per free.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > Reviewed-by: William Kucharski <william.kucharski@oracle.com>
> 
> AFAICS the problem of has_unmovable_pages() is not addressed:
> https://lore.kernel.org/all/40d658da-6220-e05e-ba0b-d95c82f6bfb3@redhat.com/
> 
> But I don't think it's a sustainable approach to keep extending the checks
> there with PageSlab() and then with whatever other users adopt frozen page
> allocation in the future. I guess it would be better to just be able to
> detect pages on the pcplist without false positives. A new page type?
> Maybe the overhead of managing it would be negligible, as we set
> page->index anyway for the migratetype?
> 

I think a page type would be usable to identify a PCP page, the same way
it's used to identify a buddy page. Most likely, this could be done in
check_pcp_refill, check_new_pcp (watch DEBUG_VM) and free_pcppages_bulk.
There would be a race between the last refcount being dropped and the page
becoming a PCP page, but I doubt that matters to page isolation as I
expect it retries.

The __Clear and __Set operations would add some overhead, but it's almost
certainly cheaper than the put_page_testzero in __free_pages().
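
Roughly (hypothetical PagePcpList helpers from such a new page type, only
to show where the set and clear would go; untested):

	/* free_unref_page_commit(): mark the page as it is queued on the pcplist */
	__SetPagePcpList(page);

	/*
	 * free_pcppages_bulk() / rmqueue_pcplist(): clear it as the page is
	 * dequeued, before it goes back to buddy or out to the caller.
	 */
	__ClearPagePcpList(page);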

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 00/16] Allocate and free frozen pages
  2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2022-08-09 17:18 ` [PATCH v2 16/16] slub: " Matthew Wilcox (Oracle)
@ 2022-08-11  0:19 ` Shakeel Butt
  2022-08-12  0:13   ` Matthew Wilcox
  16 siblings, 1 reply; 29+ messages in thread
From: Shakeel Butt @ 2022-08-11  0:19 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm

On Tue, Aug 09, 2022 at 06:18:38PM +0100, Matthew Wilcox (Oracle) wrote:
[...]
> 
> This patchset is also a step towards the Glorious Future in which struct
> page doesn't have a refcount; the users which need a refcount will have
> one in their per-allocation memdesc.
> 

Can you please share a bit more detail about this glorious future? How
do you envision the per-allocation memdesc working?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 00/16] Allocate and free frozen pages
  2022-08-11  0:19 ` [PATCH v2 00/16] Allocate and free " Shakeel Butt
@ 2022-08-12  0:13   ` Matthew Wilcox
  2022-08-12 16:48     ` Shakeel Butt
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew Wilcox @ 2022-08-12  0:13 UTC (permalink / raw)
  To: Shakeel Butt; +Cc: linux-mm

On Thu, Aug 11, 2022 at 12:19:56AM +0000, Shakeel Butt wrote:
> On Tue, Aug 09, 2022 at 06:18:38PM +0100, Matthew Wilcox (Oracle) wrote:
> [...]
> > 
> > This patchset is also a step towards the Glorious Future in which struct
> > page doesn't have a refcount; the users which need a refcount will have
> > one in their per-allocation memdesc.
> > 
> 
> Can you please share a bit more detail about this glorious future? How
> do you envision the per-allocation memdesc working?

Hopefully the 'State of the Page' email answers your questions.  If not,
let me know and I can go into more detail (or admit I haven't thought
about the detail you're asking about).


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 00/16] Allocate and free frozen pages
  2022-08-12  0:13   ` Matthew Wilcox
@ 2022-08-12 16:48     ` Shakeel Butt
  0 siblings, 0 replies; 29+ messages in thread
From: Shakeel Butt @ 2022-08-12 16:48 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Thu, Aug 11, 2022 at 5:13 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Aug 11, 2022 at 12:19:56AM +0000, Shakeel Butt wrote:
> > On Tue, Aug 09, 2022 at 06:18:38PM +0100, Matthew Wilcox (Oracle) wrote:
> > [...]
> > >
> > > This patchset is also a step towards the Glorious Future in which struct
> > > page doesn't have a refcount; the users which need a refcount will have
> > > one in their per-allocation memdesc.
> > >
> >
> > Can you please share a bit more detail about this glorious future? How
> > do you envision the per-allocation memdesc working?
>
> Hopefully the 'State of the Page' email answers your questions.  If not,
> let me know and I can go into more detail (or admit I haven't thought
> about the detail you're asking about).

Thanks, that is helpful. I will ask questions there if I have any.


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-08-12 16:49 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-09 17:18 [PATCH v2 00/16] Allocate and free frozen pages Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 01/16] mm/page_alloc: Cache page_zone() result in free_unref_page() Matthew Wilcox (Oracle)
2022-08-10  1:56   ` Miaohe Lin
2022-08-10  6:31   ` Muchun Song
2022-08-10 15:00   ` Mel Gorman
2022-08-09 17:18 ` [PATCH v2 02/16] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
2022-08-10  6:36   ` Muchun Song
2022-08-09 17:18 ` [PATCH v2 03/16] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
2022-08-10  3:00   ` Miaohe Lin
2022-08-10  6:37   ` Muchun Song
2022-08-09 17:18 ` [PATCH v2 04/16] mm/page_alloc: Move set_page_refcounted() to callers of post_alloc_hook() Matthew Wilcox (Oracle)
2022-08-10  3:30   ` Miaohe Lin
2022-08-09 17:18 ` [PATCH v2 05/16] mm/page_alloc: Move set_page_refcounted() to callers of prep_new_page() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 06/16] mm/page_alloc: Move set_page_refcounted() to callers of get_page_from_freelist() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 07/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_cpuset_fallback() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 08/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_may_oom() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 09/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_compact() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 10/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_direct_reclaim() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 11/16] mm/page_alloc: Move set_page_refcounted() to callers of __alloc_pages_slowpath() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 12/16] mm/page_alloc: Move set_page_refcounted() to end of __alloc_pages() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 13/16] mm/page_alloc: Add __alloc_frozen_pages() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 14/16] mm/mempolicy: Add alloc_frozen_pages() Matthew Wilcox (Oracle)
2022-08-09 17:18 ` [PATCH v2 15/16] slab: Allocate frozen pages Matthew Wilcox (Oracle)
2022-08-10 12:31   ` Vlastimil Babka
2022-08-10 16:27     ` Mel Gorman
2022-08-09 17:18 ` [PATCH v2 16/16] slub: " Matthew Wilcox (Oracle)
2022-08-11  0:19 ` [PATCH v2 00/16] Allocate and free " Shakeel Butt
2022-08-12  0:13   ` Matthew Wilcox
2022-08-12 16:48     ` Shakeel Butt
