* [PATCH 0/6] Allocate and free frozen pages
@ 2022-05-31 15:06 Matthew Wilcox (Oracle)
  2022-05-31 15:06 ` [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page() Matthew Wilcox (Oracle)
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

We already have the ability to freeze a page (safely reduce its
reference count to 0).  Some users (eg slab) would prefer to be able
to allocate frozen pages and avoid touching the refcount.  It also
avoids spurious temporary refcounts being taken on these pages.
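
For illustration, a minimal sketch of how a caller of the new interface
might look (GFP_KERNEL, order 0 and the error handling here are just
example values; the real conversions are the slab and slub patches):

	struct page *page;

	/* the page comes back with a refcount of 0 */
	page = alloc_frozen_pages(GFP_KERNEL, 0);
	if (!page)
		return -ENOMEM;

	/* ... use the memory; nothing else holds a reference ... */

	/* hand it back without touching the refcount */
	free_frozen_pages(page, 0);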

Matthew Wilcox (Oracle) (6):
  mm/page_alloc: Remove zone parameter from free_one_page()
  mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  mm/page_alloc: Add alloc_frozen_pages()
  slab: Allocate frozen pages
  slub: Allocate frozen pages

 mm/internal.h   | 15 ++++++++++--
 mm/mempolicy.c  | 61 ++++++++++++++++++++++++++++++-------------------
 mm/page_alloc.c | 59 +++++++++++++++++++++++++++--------------------
 mm/slab.c       | 23 +++++++++----------
 mm/slub.c       | 26 ++++++++++-----------
 mm/swap.c       |  2 +-
 6 files changed, 110 insertions(+), 76 deletions(-)

-- 
2.34.1



* [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page()
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
@ 2022-05-31 15:06 ` Matthew Wilcox (Oracle)
  2022-05-31 16:59   ` David Hildenbrand
  2022-06-01  6:53   ` Miaohe Lin
  2022-05-31 15:06 ` [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Both callers pass in page_zone(page), so move that into free_one_page().
Shrinks page_alloc.o by 196 bytes with allmodconfig.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 29d775b60cf9..68bb77900f67 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1529,11 +1529,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	spin_unlock(&zone->lock);
 }
 
-static void free_one_page(struct zone *zone,
-				struct page *page, unsigned long pfn,
-				unsigned int order,
-				int migratetype, fpi_t fpi_flags)
+static void free_one_page(struct page *page, unsigned long pfn,
+		unsigned int order, int migratetype, fpi_t fpi_flags)
 {
+	struct zone *zone = page_zone(page);
 	unsigned long flags;
 
 	spin_lock_irqsave(&zone->lock, flags);
@@ -3448,7 +3447,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	migratetype = get_pcppage_migratetype(page);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
+			free_one_page(page, pfn, order, migratetype, FPI_NONE);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
@@ -3484,7 +3483,7 @@ void free_unref_page_list(struct list_head *list)
 		migratetype = get_pcppage_migratetype(page);
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			list_del(&page->lru);
-			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
+			free_one_page(page, pfn, 0, migratetype, FPI_NONE);
 			continue;
 		}
 	}
-- 
2.34.1



* [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
  2022-05-31 15:06 ` [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page() Matthew Wilcox (Oracle)
@ 2022-05-31 15:06 ` Matthew Wilcox (Oracle)
  2022-05-31 17:02   ` David Hildenbrand
  2022-06-01  6:58   ` Miaohe Lin
  2022-05-31 15:06 ` [PATCH 3/6] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

In preparation for making this function available outside page_alloc,
rename it to free_frozen_pages(), which fits better with the other
memory allocation/free functions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 68bb77900f67..6a8676cb69db 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -687,7 +687,7 @@ static inline bool pcp_allowed_order(unsigned int order)
 	return false;
 }
 
-static inline void free_the_page(struct page *page, unsigned int order)
+static inline void free_frozen_pages(struct page *page, unsigned int order)
 {
 	if (pcp_allowed_order(order))		/* Via pcp? */
 		free_unref_page(page, order);
@@ -713,7 +713,7 @@ static inline void free_the_page(struct page *page, unsigned int order)
 void free_compound_page(struct page *page)
 {
 	mem_cgroup_uncharge(page_folio(page));
-	free_the_page(page, compound_order(page));
+	free_frozen_pages(page, compound_order(page));
 }
 
 static void prep_compound_head(struct page *page, unsigned int order)
@@ -5507,10 +5507,10 @@ EXPORT_SYMBOL(get_zeroed_page);
 void __free_pages(struct page *page, unsigned int order)
 {
 	if (put_page_testzero(page))
-		free_the_page(page, order);
+		free_frozen_pages(page, order);
 	else if (!PageHead(page))
 		while (order-- > 0)
-			free_the_page(page + (1 << order), order);
+			free_frozen_pages(page + (1 << order), order);
 }
 EXPORT_SYMBOL(__free_pages);
 
@@ -5561,7 +5561,7 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
 	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
 
 	if (page_ref_sub_and_test(page, count))
-		free_the_page(page, compound_order(page));
+		free_frozen_pages(page, compound_order(page));
 }
 EXPORT_SYMBOL(__page_frag_cache_drain);
 
@@ -5602,7 +5602,7 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 			goto refill;
 
 		if (unlikely(nc->pfmemalloc)) {
-			free_the_page(page, compound_order(page));
+			free_frozen_pages(page, compound_order(page));
 			goto refill;
 		}
 
@@ -5634,7 +5634,7 @@ void page_frag_free(void *addr)
 	struct page *page = virt_to_head_page(addr);
 
 	if (unlikely(put_page_testzero(page)))
-		free_the_page(page, compound_order(page));
+		free_frozen_pages(page, compound_order(page));
 }
 EXPORT_SYMBOL(page_frag_free);
 
-- 
2.34.1



* [PATCH 3/6] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
  2022-05-31 15:06 ` [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page() Matthew Wilcox (Oracle)
  2022-05-31 15:06 ` [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-05-31 15:06 ` Matthew Wilcox (Oracle)
  2022-05-31 17:09   ` David Hildenbrand
  2022-05-31 15:06 ` [PATCH 4/6] mm/page_alloc: Add alloc_frozen_pages() Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

This API makes more sense for slab to use and it works perfectly
well for swap.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/internal.h   |  4 ++--
 mm/page_alloc.c | 18 +++++++++---------
 mm/swap.c       |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index c0f8fbe0445b..f1c0dab2b98e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -362,8 +362,8 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
-extern void free_unref_page(struct page *page, unsigned int order);
-extern void free_unref_page_list(struct list_head *list);
+void free_frozen_pages(struct page *, unsigned int order);
+void free_unref_page_list(struct list_head *list);
 
 extern void zone_pcp_update(struct zone *zone, int cpu_online);
 extern void zone_pcp_reset(struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6a8676cb69db..825922000781 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -687,14 +687,6 @@ static inline bool pcp_allowed_order(unsigned int order)
 	return false;
 }
 
-static inline void free_frozen_pages(struct page *page, unsigned int order)
-{
-	if (pcp_allowed_order(order))		/* Via pcp? */
-		free_unref_page(page, order);
-	else
-		__free_pages_ok(page, order, FPI_NONE);
-}
-
 /*
  * Higher-order pages are called "compound pages".  They are structured thusly:
  *
@@ -3428,7 +3420,7 @@ static void free_unref_page_commit(struct page *page, int migratetype,
 /*
  * Free a pcp page
  */
-void free_unref_page(struct page *page, unsigned int order)
+static void free_unref_page(struct page *page, unsigned int order)
 {
 	unsigned long flags;
 	unsigned long pfn = page_to_pfn(page);
@@ -3458,6 +3450,14 @@ void free_unref_page(struct page *page, unsigned int order)
 	local_unlock_irqrestore(&pagesets.lock, flags);
 }
 
+void free_frozen_pages(struct page *page, unsigned int order)
+{
+	if (pcp_allowed_order(order))		/* Via pcp? */
+		free_unref_page(page, order);
+	else
+		__free_pages_ok(page, order, FPI_NONE);
+}
+
 /*
  * Free a list of 0-order pages
  */
diff --git a/mm/swap.c b/mm/swap.c
index f3922a96b2e9..c474bdf838e3 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -103,7 +103,7 @@ static void __put_single_page(struct page *page)
 {
 	__page_cache_release(page);
 	mem_cgroup_uncharge(page_folio(page));
-	free_unref_page(page, 0);
+	free_frozen_pages(page, 0);
 }
 
 static void __put_compound_page(struct page *page)
-- 
2.34.1



* [PATCH 4/6] mm/page_alloc: Add alloc_frozen_pages()
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2022-05-31 15:06 ` [PATCH 3/6] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
@ 2022-05-31 15:06 ` Matthew Wilcox (Oracle)
  2022-05-31 15:06 ` [PATCH 5/6] slab: Allocate frozen pages Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Provide an interface to allocate pages from the page allocator without
incrementing their refcount.  This saves an atomic operation on free,
which may be beneficial to some users (eg slab).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/internal.h   | 11 +++++++++
 mm/mempolicy.c  | 61 ++++++++++++++++++++++++++++++-------------------
 mm/page_alloc.c | 18 +++++++++++----
 3 files changed, 63 insertions(+), 27 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index f1c0dab2b98e..bf70ee2e38e9 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -362,9 +362,20 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
+struct page *__alloc_frozen_pages(gfp_t, unsigned int order, int nid,
+		nodemask_t *);
 void free_frozen_pages(struct page *, unsigned int order);
 void free_unref_page_list(struct list_head *list);
 
+#ifdef CONFIG_NUMA
+struct page *alloc_frozen_pages(gfp_t, unsigned int order);
+#else
+static inline struct page *alloc_frozen_pages(gfp_t gfp, unsigned int order)
+{
+	return __alloc_frozen_pages(gfp, order, numa_node_id(), NULL);
+}
+#endif
+
 extern void zone_pcp_update(struct zone *zone, int cpu_online);
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d39b01fd52fe..ac7c45d0f7dc 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2102,7 +2102,7 @@ static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
 {
 	struct page *page;
 
-	page = __alloc_pages(gfp, order, nid, NULL);
+	page = __alloc_frozen_pages(gfp, order, nid, NULL);
 	/* skip NUMA_INTERLEAVE_HIT counter update if numa stats is disabled */
 	if (!static_branch_likely(&vm_numa_stat_key))
 		return page;
@@ -2128,9 +2128,9 @@ static struct page *alloc_pages_preferred_many(gfp_t gfp, unsigned int order,
 	 */
 	preferred_gfp = gfp | __GFP_NOWARN;
 	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
-	page = __alloc_pages(preferred_gfp, order, nid, &pol->nodes);
+	page = __alloc_frozen_pages(preferred_gfp, order, nid, &pol->nodes);
 	if (!page)
-		page = __alloc_pages(gfp, order, nid, NULL);
+		page = __alloc_frozen_pages(gfp, order, nid, NULL);
 
 	return page;
 }
@@ -2169,8 +2169,11 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
 		mpol_cond_put(pol);
 		gfp |= __GFP_COMP;
 		page = alloc_page_interleave(gfp, order, nid);
-		if (page && order > 1)
-			prep_transhuge_page(page);
+		if (page) {
+			set_page_refcounted(page);
+			if (order > 1)
+				prep_transhuge_page(page);
+		}
 		folio = (struct folio *)page;
 		goto out;
 	}
@@ -2182,8 +2185,11 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
 		gfp |= __GFP_COMP;
 		page = alloc_pages_preferred_many(gfp, order, node, pol);
 		mpol_cond_put(pol);
-		if (page && order > 1)
-			prep_transhuge_page(page);
+		if (page) {
+			set_page_refcounted(page);
+			if (order > 1)
+				prep_transhuge_page(page);
+		}
 		folio = (struct folio *)page;
 		goto out;
 	}
@@ -2237,21 +2243,7 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL(vma_alloc_folio);
 
-/**
- * alloc_pages - Allocate pages.
- * @gfp: GFP flags.
- * @order: Power of two of number of pages to allocate.
- *
- * Allocate 1 << @order contiguous pages.  The physical address of the
- * first page is naturally aligned (eg an order-3 allocation will be aligned
- * to a multiple of 8 * PAGE_SIZE bytes).  The NUMA policy of the current
- * process is honoured when in process context.
- *
- * Context: Can be called from any context, providing the appropriate GFP
- * flags are used.
- * Return: The page on success or NULL if allocation fails.
- */
-struct page *alloc_pages(gfp_t gfp, unsigned order)
+struct page *alloc_frozen_pages(gfp_t gfp, unsigned order)
 {
 	struct mempolicy *pol = &default_policy;
 	struct page *page;
@@ -2269,12 +2261,35 @@ struct page *alloc_pages(gfp_t gfp, unsigned order)
 		page = alloc_pages_preferred_many(gfp, order,
 				  policy_node(gfp, pol, numa_node_id()), pol);
 	else
-		page = __alloc_pages(gfp, order,
+		page = __alloc_frozen_pages(gfp, order,
 				policy_node(gfp, pol, numa_node_id()),
 				policy_nodemask(gfp, pol));
 
 	return page;
 }
+
+/**
+ * alloc_pages - Allocate pages.
+ * @gfp: GFP flags.
+ * @order: Power of two of number of pages to allocate.
+ *
+ * Allocate 1 << @order contiguous pages.  The physical address of the
+ * first page is naturally aligned (eg an order-3 allocation will be aligned
+ * to a multiple of 8 * PAGE_SIZE bytes).  The NUMA policy of the current
+ * process is honoured when in process context.
+ *
+ * Context: Can be called from any context, providing the appropriate GFP
+ * flags are used.
+ * Return: The page on success or NULL if allocation fails.
+ */
+struct page *alloc_pages(gfp_t gfp, unsigned order)
+{
+	struct page *page = alloc_frozen_pages(gfp, order);
+
+	if (page)
+		set_page_refcounted(page);
+	return page;
+}
 EXPORT_SYMBOL(alloc_pages);
 
 struct folio *folio_alloc(gfp_t gfp, unsigned order)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 825922000781..49d8f04d14ef 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2390,7 +2390,6 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);
 
 	set_page_private(page, 0);
-	set_page_refcounted(page);
 
 	arch_alloc_page(page, order);
 	debug_pagealloc_map_pages(page, 1 << order);
@@ -5386,8 +5385,8 @@ EXPORT_SYMBOL_GPL(__alloc_pages_bulk);
 /*
  * This is the 'heart' of the zoned buddy allocator.
  */
-struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
-							nodemask_t *nodemask)
+struct page *__alloc_frozen_pages(gfp_t gfp, unsigned int order,
+		int preferred_nid, nodemask_t *nodemask)
 {
 	struct page *page;
 	unsigned int alloc_flags = ALLOC_WMARK_LOW;
@@ -5440,7 +5439,7 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 out:
 	if (memcg_kmem_enabled() && (gfp & __GFP_ACCOUNT) && page &&
 	    unlikely(__memcg_kmem_charge_page(page, gfp, order) != 0)) {
-		__free_pages(page, order);
+		free_frozen_pages(page, order);
 		page = NULL;
 	}
 
@@ -5448,6 +5447,17 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 
 	return page;
 }
+
+struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
+							nodemask_t *nodemask)
+{
+	struct page *page;
+
+	page = __alloc_frozen_pages(gfp, order, preferred_nid, nodemask);
+	if (page)
+		set_page_refcounted(page);
+	return page;
+}
 EXPORT_SYMBOL(__alloc_pages);
 
 struct folio *__folio_alloc(gfp_t gfp, unsigned int order, int preferred_nid,
-- 
2.34.1



* [PATCH 5/6] slab: Allocate frozen pages
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2022-05-31 15:06 ` [PATCH 4/6] mm/page_alloc: Add alloc_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-05-31 15:06 ` Matthew Wilcox (Oracle)
  2022-05-31 17:15   ` David Hildenbrand
  2022-05-31 15:06 ` [PATCH 6/6] slub: " Matthew Wilcox (Oracle)
  2022-06-01  3:31 ` [PATCH 0/6] Allocate and free " William Kucharski
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Since slab does not use the page refcount, it can allocate and
free frozen pages, saving one atomic operation per free.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/slab.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index f8cd00f4ba13..c5c53ed304d1 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1355,23 +1355,23 @@ slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nodeid)
 static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 								int nodeid)
 {
-	struct folio *folio;
+	struct page *page;
 	struct slab *slab;
 
 	flags |= cachep->allocflags;
 
-	folio = (struct folio *) __alloc_pages_node(nodeid, flags, cachep->gfporder);
-	if (!folio) {
+	page = __alloc_frozen_pages(flags, cachep->gfporder, nodeid, NULL);
+	if (!page) {
 		slab_out_of_memory(cachep, flags, nodeid);
 		return NULL;
 	}
 
-	slab = folio_slab(folio);
+	__SetPageSlab(page);
+	slab = (struct slab *)page;
 
 	account_slab(slab, cachep->gfporder, cachep, flags);
-	__folio_set_slab(folio);
 	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
-	if (sk_memalloc_socks() && page_is_pfmemalloc(folio_page(folio, 0)))
+	if (sk_memalloc_socks() && page_is_pfmemalloc(page))
 		slab_set_pfmemalloc(slab);
 
 	return slab;
@@ -1383,18 +1383,17 @@ static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 static void kmem_freepages(struct kmem_cache *cachep, struct slab *slab)
 {
 	int order = cachep->gfporder;
-	struct folio *folio = slab_folio(slab);
+	struct page *page = (struct page *)slab;
 
-	BUG_ON(!folio_test_slab(folio));
 	__slab_clear_pfmemalloc(slab);
-	__folio_clear_slab(folio);
-	page_mapcount_reset(folio_page(folio, 0));
-	folio->mapping = NULL;
+	__ClearPageSlab(page);
+	page_mapcount_reset(page);
+	page->mapping = NULL;
 
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += 1 << order;
 	unaccount_slab(slab, order, cachep);
-	__free_pages(folio_page(folio, 0), order);
+	free_frozen_pages(page, order);
 }
 
 static void kmem_rcu_free(struct rcu_head *head)
-- 
2.34.1



* [PATCH 6/6] slub: Allocate frozen pages
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2022-05-31 15:06 ` [PATCH 5/6] slab: Allocate frozen pages Matthew Wilcox (Oracle)
@ 2022-05-31 15:06 ` Matthew Wilcox (Oracle)
  2022-06-01  3:31 ` [PATCH 0/6] Allocate and free " William Kucharski
  6 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-05-31 15:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

Since slub does not use the page refcount, it can allocate and
free frozen pages, saving one atomic operation per free.
---
 mm/slub.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index e5535020e0fd..420a56746a01 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1789,21 +1789,21 @@ static void *setup_object(struct kmem_cache *s, void *object)
 static inline struct slab *alloc_slab_page(gfp_t flags, int node,
 		struct kmem_cache_order_objects oo)
 {
-	struct folio *folio;
+	struct page *page;
 	struct slab *slab;
 	unsigned int order = oo_order(oo);
 
 	if (node == NUMA_NO_NODE)
-		folio = (struct folio *)alloc_pages(flags, order);
+		page = alloc_frozen_pages(flags, order);
 	else
-		folio = (struct folio *)__alloc_pages_node(node, flags, order);
+		page = __alloc_frozen_pages(flags, order, node, NULL);
 
-	if (!folio)
+	if (!page)
 		return NULL;
 
-	slab = folio_slab(folio);
-	__folio_set_slab(folio);
-	if (page_is_pfmemalloc(folio_page(folio, 0)))
+	slab = (struct slab *)page;
+	__SetPageSlab(page);
+	if (page_is_pfmemalloc(page))
 		slab_set_pfmemalloc(slab);
 
 	return slab;
@@ -2005,8 +2005,8 @@ static struct slab *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 
 static void __free_slab(struct kmem_cache *s, struct slab *slab)
 {
-	struct folio *folio = slab_folio(slab);
-	int order = folio_order(folio);
+	struct page *page = (struct page *)slab;
+	int order = compound_order(page);
 	int pages = 1 << order;
 
 	if (kmem_cache_debug_flags(s, SLAB_CONSISTENCY_CHECKS)) {
@@ -2018,12 +2018,12 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
 	}
 
 	__slab_clear_pfmemalloc(slab);
-	__folio_clear_slab(folio);
-	folio->mapping = NULL;
+	__ClearPageSlab(page);
+	page->mapping = NULL;
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += pages;
 	unaccount_slab(slab, order, s);
-	__free_pages(folio_page(folio, 0), order);
+	free_frozen_pages(page, order);
 }
 
 static void rcu_free_slab(struct rcu_head *h)
@@ -3541,7 +3541,7 @@ static inline void free_large_kmalloc(struct folio *folio, void *object)
 		pr_warn_once("object pointer: 0x%p\n", object);
 
 	kfree_hook(object);
-	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
+	lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
 			      -(PAGE_SIZE << order));
 	__free_pages(folio_page(folio, 0), order);
 }
-- 
2.34.1



* Re: [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page()
  2022-05-31 15:06 ` [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page() Matthew Wilcox (Oracle)
@ 2022-05-31 16:59   ` David Hildenbrand
  2022-06-01  6:53   ` Miaohe Lin
  1 sibling, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2022-05-31 16:59 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
> Both callers pass in page_zone(page), so move that into free_one_page().
> Shrinks page_alloc.o by 196 bytes with allmodconfig.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/page_alloc.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 29d775b60cf9..68bb77900f67 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1529,11 +1529,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  	spin_unlock(&zone->lock);
>  }
>  
> -static void free_one_page(struct zone *zone,
> -				struct page *page, unsigned long pfn,
> -				unsigned int order,
> -				int migratetype, fpi_t fpi_flags)
> +static void free_one_page(struct page *page, unsigned long pfn,
> +		unsigned int order, int migratetype, fpi_t fpi_flags)
>  {
> +	struct zone *zone = page_zone(page);
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&zone->lock, flags);
> @@ -3448,7 +3447,7 @@ void free_unref_page(struct page *page, unsigned int order)
>  	migratetype = get_pcppage_migratetype(page);
>  	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
>  		if (unlikely(is_migrate_isolate(migratetype))) {
> -			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
> +			free_one_page(page, pfn, order, migratetype, FPI_NONE);
>  			return;
>  		}
>  		migratetype = MIGRATE_MOVABLE;
> @@ -3484,7 +3483,7 @@ void free_unref_page_list(struct list_head *list)
>  		migratetype = get_pcppage_migratetype(page);
>  		if (unlikely(is_migrate_isolate(migratetype))) {
>  			list_del(&page->lru);
> -			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
> +			free_one_page(page, pfn, 0, migratetype, FPI_NONE);
>  			continue;
>  		}
>  	}

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



* Re: [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-05-31 15:06 ` [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
@ 2022-05-31 17:02   ` David Hildenbrand
  2022-06-01  6:58   ` Miaohe Lin
  1 sibling, 0 replies; 20+ messages in thread
From: David Hildenbrand @ 2022-05-31 17:02 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
> In preparation for making this function available outside page_alloc,
> rename it to free_frozen_pages(), which fits better with the other
> memory allocation/free functions.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb



* Re: [PATCH 3/6] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  2022-05-31 15:06 ` [PATCH 3/6] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page() Matthew Wilcox (Oracle)
@ 2022-05-31 17:09   ` David Hildenbrand
  2022-05-31 17:11     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2022-05-31 17:09 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
> This API makes more sense for slab to use and it works perfectly
> well for swap.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/internal.h   |  4 ++--
>  mm/page_alloc.c | 18 +++++++++---------
>  mm/swap.c       |  2 +-
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index c0f8fbe0445b..f1c0dab2b98e 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -362,8 +362,8 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
>  					gfp_t gfp_flags);
>  extern int user_min_free_kbytes;
>  
> -extern void free_unref_page(struct page *page, unsigned int order);
> -extern void free_unref_page_list(struct list_head *list);
> +void free_frozen_pages(struct page *, unsigned int order);
> +void free_unref_page_list(struct list_head *list);
>  
>  extern void zone_pcp_update(struct zone *zone, int cpu_online);
>  extern void zone_pcp_reset(struct zone *zone);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6a8676cb69db..825922000781 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -687,14 +687,6 @@ static inline bool pcp_allowed_order(unsigned int order)
>  	return false;
>  }
>  
> -static inline void free_frozen_pages(struct page *page, unsigned int order)
> -{
> -	if (pcp_allowed_order(order))		/* Via pcp? */
> -		free_unref_page(page, order);
> -	else
> -		__free_pages_ok(page, order, FPI_NONE);
> -}
> -
>  /*
>   * Higher-order pages are called "compound pages".  They are structured thusly:
>   *
> @@ -3428,7 +3420,7 @@ static void free_unref_page_commit(struct page *page, int migratetype,
>  /*
>   * Free a pcp page
>   */
> -void free_unref_page(struct page *page, unsigned int order)
> +static void free_unref_page(struct page *page, unsigned int order)
>  {
>  	unsigned long flags;
>  	unsigned long pfn = page_to_pfn(page);
> @@ -3458,6 +3450,14 @@ void free_unref_page(struct page *page, unsigned int order)
>  	local_unlock_irqrestore(&pagesets.lock, flags);
>  }
>  
> +void free_frozen_pages(struct page *page, unsigned int order)
> +{
> +	if (pcp_allowed_order(order))		/* Via pcp? */
> +		free_unref_page(page, order);
> +	else
> +		__free_pages_ok(page, order, FPI_NONE);
> +}
> +
>  /*
>   * Free a list of 0-order pages
>   */
> diff --git a/mm/swap.c b/mm/swap.c
> index f3922a96b2e9..c474bdf838e3 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -103,7 +103,7 @@ static void __put_single_page(struct page *page)
>  {
>  	__page_cache_release(page);
>  	mem_cgroup_uncharge(page_folio(page));
> -	free_unref_page(page, 0);
> +	free_frozen_pages(page, 0);
>  }
>  
>  static void __put_compound_page(struct page *page)


IIUC, the net change is 0, because we'll always take the
pcp_allowed_order() path due to order==0, correct?

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



* Re: [PATCH 3/6] mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
  2022-05-31 17:09   ` David Hildenbrand
@ 2022-05-31 17:11     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2022-05-31 17:11 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: linux-mm

On Tue, May 31, 2022 at 07:09:09PM +0200, David Hildenbrand wrote:
> On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
> > +++ b/mm/swap.c
> > @@ -103,7 +103,7 @@ static void __put_single_page(struct page *page)
> >  {
> >  	__page_cache_release(page);
> >  	mem_cgroup_uncharge(page_folio(page));
> > -	free_unref_page(page, 0);
> > +	free_frozen_pages(page, 0);
> >  }
> >  
> >  static void __put_compound_page(struct page *page)
> 
> 
> IIUC, the net change is 0, because we'll always take the
> pcp_allowed_order() path due to order==0, correct?

Correct


* Re: [PATCH 5/6] slab: Allocate frozen pages
  2022-05-31 15:06 ` [PATCH 5/6] slab: Allocate frozen pages Matthew Wilcox (Oracle)
@ 2022-05-31 17:15   ` David Hildenbrand
  2022-05-31 17:33     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2022-05-31 17:15 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
> Since slab does not use the page refcount, it can allocate and
> free frozen pages, saving one atomic operation per free.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/slab.c | 23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index f8cd00f4ba13..c5c53ed304d1 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1355,23 +1355,23 @@ slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nodeid)
>  static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
>  								int nodeid)
>  {
> -	struct folio *folio;
> +	struct page *page;
>  	struct slab *slab;
>  
>  	flags |= cachep->allocflags;
>  
> -	folio = (struct folio *) __alloc_pages_node(nodeid, flags, cachep->gfporder);
> -	if (!folio) {
> +	page = __alloc_frozen_pages(flags, cachep->gfporder, nodeid, NULL);
> +	if (!page) {
>  		slab_out_of_memory(cachep, flags, nodeid);
>  		return NULL;
>  	}
>  
> -	slab = folio_slab(folio);
> +	__SetPageSlab(page);
> +	slab = (struct slab *)page;
>  
>  	account_slab(slab, cachep->gfporder, cachep, flags);
> -	__folio_set_slab(folio);
>  	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
> -	if (sk_memalloc_socks() && page_is_pfmemalloc(folio_page(folio, 0)))
> +	if (sk_memalloc_socks() && page_is_pfmemalloc(page))
>  		slab_set_pfmemalloc(slab);
>  
>  	return slab;
> @@ -1383,18 +1383,17 @@ static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
>  static void kmem_freepages(struct kmem_cache *cachep, struct slab *slab)
>  {
>  	int order = cachep->gfporder;
> -	struct folio *folio = slab_folio(slab);
> +	struct page *page = (struct page *)slab;
>  
> -	BUG_ON(!folio_test_slab(folio));
>  	__slab_clear_pfmemalloc(slab);
> -	__folio_clear_slab(folio);
> -	page_mapcount_reset(folio_page(folio, 0));
> -	folio->mapping = NULL;
> +	__ClearPageSlab(page);
> +	page_mapcount_reset(page);
> +	page->mapping = NULL;
>  
>  	if (current->reclaim_state)
>  		current->reclaim_state->reclaimed_slab += 1 << order;
>  	unaccount_slab(slab, order, cachep);
> -	__free_pages(folio_page(folio, 0), order);
> +	free_frozen_pages(page, order);
>  }
>  
>  static void kmem_rcu_free(struct rcu_head *head)

I assume that implies that pages that are actually allocated *from* the
buddy now have a refcount == 0.

IIRC, page isolation code (e.g., !page_ref_count check in
has_unmovable_pages()) assumes that any page with a refcount of 0 is
essentially either already free (buddy) or is on its way of getting
freed (!buddy).

There might be other PFN walker code (like compaction) that makes
similar assumptions that hold for now.

-- 
Thanks,

David / dhildenb



* Re: [PATCH 5/6] slab: Allocate frozen pages
  2022-05-31 17:15   ` David Hildenbrand
@ 2022-05-31 17:33     ` Matthew Wilcox
  2022-06-01 12:14       ` David Hildenbrand
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2022-05-31 17:33 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: linux-mm

On Tue, May 31, 2022 at 07:15:14PM +0200, David Hildenbrand wrote:
> On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
> > Since slab does not use the page refcount, it can allocate and
> > free frozen pages, saving one atomic operation per free.
> 
> I assume that implies that pages that are actually allocated *from* the
> buddy now have a refcount == 0.

Yes.

> IIRC, page isolation code (e.g., !page_ref_count check in
> has_unmovable_pages()) assumes that any page with a refcount of 0 is
> essentially either already free (buddy) or is on its way of getting
> freed (!buddy).

That would be a bad assumption.  We already freeze pages for things like
splitting, migration, and replacement with a THP.  If the usage is just
an optimisation, then that's OK (and maybe the optimisation needs to be
tweaked to check PageSlab), but if the code depends on that being true,
it was already broken.

For this particular case, I think has_unmovable_pages() is only called
for ZONE_MOVABLE and Slab never allocates from ZONE_MOVABLE, so it's
not an issue?
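
Something like the following, perhaps (just a sketch, not part of this
series), placed before the existing !page_ref_count() check in the pfn
walk, assuming has_unmovable_pages() keeps returning the offending page:

	/*
	 * A frozen slab allocation has a refcount of 0 but is
	 * definitely not movable, so don't let it fall into the
	 * "refcount 0 means free or being freed" assumption.
	 */
	if (PageSlab(page))
		return page;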

> There might be other PFN walker code (like compaction) that makes
> similar assumptions that hold for now.
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
> 


* Re: [PATCH 0/6] Allocate and free frozen pages
  2022-05-31 15:06 [PATCH 0/6] Allocate and free frozen pages Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2022-05-31 15:06 ` [PATCH 6/6] slub: " Matthew Wilcox (Oracle)
@ 2022-06-01  3:31 ` William Kucharski
  6 siblings, 0 replies; 20+ messages in thread
From: William Kucharski @ 2022-06-01  3:31 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-mm

For the series:

Reviewed-by: William Kucharski <william.kucharski@oracle.com>

> On May 31, 2022, at 9:06 AM, Matthew Wilcox (Oracle) <willy@infradead.org> wrote:
> 
> We already have the ability to freeze a page (safely reduce its
> reference count to 0).  Some users (eg slab) would prefer to be able
> to allocate frozen pages and avoid touching the refcount.  It also
> avoids spurious temporary refcounts being taken on these pages.
> 
> Matthew Wilcox (Oracle) (6):
>  mm/page_alloc: Remove zone parameter from free_one_page()
>  mm/page_alloc: Rename free_the_page() to free_frozen_pages()
>  mm/page_alloc: Export free_frozen_pages() instead of free_unref_page()
>  mm/page_alloc: Add alloc_frozen_pages()
>  slab: Allocate frozen pages
>  slub: Allocate frozen pages
> 
> mm/internal.h   | 15 ++++++++++--
> mm/mempolicy.c  | 61 ++++++++++++++++++++++++++++++-------------------
> mm/page_alloc.c | 59 +++++++++++++++++++++++++++--------------------
> mm/slab.c       | 23 +++++++++----------
> mm/slub.c       | 26 ++++++++++-----------
> mm/swap.c       |  2 +-
> 6 files changed, 110 insertions(+), 76 deletions(-)
> 
> -- 
> 2.34.1
> 
> 



* Re: [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page()
  2022-05-31 15:06 ` [PATCH 1/6] mm/page_alloc: Remove zone parameter from free_one_page() Matthew Wilcox (Oracle)
  2022-05-31 16:59   ` David Hildenbrand
@ 2022-06-01  6:53   ` Miaohe Lin
  1 sibling, 0 replies; 20+ messages in thread
From: Miaohe Lin @ 2022-06-01  6:53 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 2022/5/31 23:06, Matthew Wilcox (Oracle) wrote:
> Both callers pass in page_zone(page), so move that into free_one_page().
> Shrinks page_alloc.o by 196 bytes with allmodconfig.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks!

> ---
>  mm/page_alloc.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 29d775b60cf9..68bb77900f67 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1529,11 +1529,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  	spin_unlock(&zone->lock);
>  }
>  
> -static void free_one_page(struct zone *zone,
> -				struct page *page, unsigned long pfn,
> -				unsigned int order,
> -				int migratetype, fpi_t fpi_flags)
> +static void free_one_page(struct page *page, unsigned long pfn,
> +		unsigned int order, int migratetype, fpi_t fpi_flags)
>  {
> +	struct zone *zone = page_zone(page);
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&zone->lock, flags);
> @@ -3448,7 +3447,7 @@ void free_unref_page(struct page *page, unsigned int order)
>  	migratetype = get_pcppage_migratetype(page);
>  	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
>  		if (unlikely(is_migrate_isolate(migratetype))) {
> -			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
> +			free_one_page(page, pfn, order, migratetype, FPI_NONE);
>  			return;
>  		}
>  		migratetype = MIGRATE_MOVABLE;
> @@ -3484,7 +3483,7 @@ void free_unref_page_list(struct list_head *list)
>  		migratetype = get_pcppage_migratetype(page);
>  		if (unlikely(is_migrate_isolate(migratetype))) {
>  			list_del(&page->lru);
> -			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
> +			free_one_page(page, pfn, 0, migratetype, FPI_NONE);
>  			continue;
>  		}
>  	}
> 



* Re: [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-05-31 15:06 ` [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages() Matthew Wilcox (Oracle)
  2022-05-31 17:02   ` David Hildenbrand
@ 2022-06-01  6:58   ` Miaohe Lin
  2022-06-01 12:23     ` Matthew Wilcox
  1 sibling, 1 reply; 20+ messages in thread
From: Miaohe Lin @ 2022-06-01  6:58 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm

On 2022/5/31 23:06, Matthew Wilcox (Oracle) wrote:
> In preparation for making this function available outside page_alloc,
> rename it to free_frozen_pages(), which fits better with the other
> memory allocation/free functions.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks!

BTW: when I saw the name "free_frozen_pages", I thought it meant we can now
free a temporarily frozen page (frozen via page_ref_freeze). ;)

> ---
>  mm/page_alloc.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 68bb77900f67..6a8676cb69db 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -687,7 +687,7 @@ static inline bool pcp_allowed_order(unsigned int order)
>  	return false;
>  }
>  
> -static inline void free_the_page(struct page *page, unsigned int order)
> +static inline void free_frozen_pages(struct page *page, unsigned int order)
>  {
>  	if (pcp_allowed_order(order))		/* Via pcp? */
>  		free_unref_page(page, order);
> @@ -713,7 +713,7 @@ static inline void free_the_page(struct page *page, unsigned int order)
>  void free_compound_page(struct page *page)
>  {
>  	mem_cgroup_uncharge(page_folio(page));
> -	free_the_page(page, compound_order(page));
> +	free_frozen_pages(page, compound_order(page));
>  }
>  
>  static void prep_compound_head(struct page *page, unsigned int order)
> @@ -5507,10 +5507,10 @@ EXPORT_SYMBOL(get_zeroed_page);
>  void __free_pages(struct page *page, unsigned int order)
>  {
>  	if (put_page_testzero(page))
> -		free_the_page(page, order);
> +		free_frozen_pages(page, order);
>  	else if (!PageHead(page))
>  		while (order-- > 0)
> -			free_the_page(page + (1 << order), order);
> +			free_frozen_pages(page + (1 << order), order);
>  }
>  EXPORT_SYMBOL(__free_pages);
>  
> @@ -5561,7 +5561,7 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
>  	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
>  
>  	if (page_ref_sub_and_test(page, count))
> -		free_the_page(page, compound_order(page));
> +		free_frozen_pages(page, compound_order(page));
>  }
>  EXPORT_SYMBOL(__page_frag_cache_drain);
>  
> @@ -5602,7 +5602,7 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
>  			goto refill;
>  
>  		if (unlikely(nc->pfmemalloc)) {
> -			free_the_page(page, compound_order(page));
> +			free_frozen_pages(page, compound_order(page));
>  			goto refill;
>  		}
>  
> @@ -5634,7 +5634,7 @@ void page_frag_free(void *addr)
>  	struct page *page = virt_to_head_page(addr);
>  
>  	if (unlikely(put_page_testzero(page)))
> -		free_the_page(page, compound_order(page));
> +		free_frozen_pages(page, compound_order(page));
>  }
>  EXPORT_SYMBOL(page_frag_free);
>  
> 



* Re: [PATCH 5/6] slab: Allocate frozen pages
  2022-05-31 17:33     ` Matthew Wilcox
@ 2022-06-01 12:14       ` David Hildenbrand
  2022-08-09 10:37         ` Vlastimil Babka (SUSE)
  0 siblings, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2022-06-01 12:14 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On 31.05.22 19:33, Matthew Wilcox wrote:
> On Tue, May 31, 2022 at 07:15:14PM +0200, David Hildenbrand wrote:
>> On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
>>> Since slab does not use the page refcount, it can allocate and
>>> free frozen pages, saving one atomic operation per free.
>>
>> I assume that implies that pages that are actually allocated *from* the
>> buddy now have a refcount == 0.
> 
> Yes.
> 
>> IIRC, page isolation code (e.g., !page_ref_count check in
>> has_unmovable_pages()) assumes that any page with a refcount of 0 is
>> essentially either already free (buddy) or is on its way of getting
>> freed (!buddy).
> 
> That would be a bad assumption.  We already freeze pages for things like
> splitting, migration, and replacement with a THP.  If the usage is just
> an optimisation, then that's OK (and maybe the optimisation needs to be
> tweaked to check PageSlab), but if the code depends on that being true,
> it was already broken.

Yes, it's an optimization to not go ahead and mess with the migratetype
of pageblocks (especially, setting it to MIGRATE_ISOLATE to later reset it
to MIGRATE_MOVABLE) and so forth when it's obvious that there is
unmovable data.

However, freezing the refcount -- as used in existing code -- is only a
temporary thing. And it's frequently used on movable pages.

If migration froze the refcount, the page is most certainly movable.
THPs should be movable, so it doesn't matter if we froze the refcount or
not. Pages that we collapse into a THP should be mostly LRU pages and,
therefore, movable.

So the frozen refcount doesn't result in additional "false negatives" in
e.g., has_unmovable_pages(), at least for these users.


> 
> For this particular case, I think has_unmovable_pages() is only called
> for ZONE_MOVABLE and Slab never allocates from ZONE_MOVABLE, so it's
> not an issue?


has_unmovable_pages() is primarily useful for ZONE_NORMAL. For
ZONE_MOVABLE, we really only check if there are any boot time
allocations (PageReserved()).

For example, alloc_contig_range() implicitly calls has_unmovable_pages()
when isolating pageblocks and ends up getting called on ZONE_NORMAL for
example for runtime allocation of gigantic pages or via virtio-mem
(nowadays even in pageblock granularity).


Long story short, we should document accordingly what it actually means
when we allocate without increasing the refcount, and that people should
be careful with that. Regarding has_unmovable_pages() and frozen
allocations, we might just be able to keep the existing detection
unmodified by special-casing PageSlab() pages and detecting them as
unmovable immediately.

-- 
Thanks,

David / dhildenb



* Re: [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-06-01  6:58   ` Miaohe Lin
@ 2022-06-01 12:23     ` Matthew Wilcox
  2022-06-02  7:45       ` Miaohe Lin
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2022-06-01 12:23 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: linux-mm

On Wed, Jun 01, 2022 at 02:58:23PM +0800, Miaohe Lin wrote:
> On 2022/5/31 23:06, Matthew Wilcox (Oracle) wrote:
> > In preparation for making this function available outside page_alloc,
> > rename it to free_frozen_pages(), which fits better with the other
> > memory allocation/free functions.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> 
> Thanks!
> 
> BTW: when I saw the name "free_frozen_pages", I thought it meant we can now
> free a temporarily frozen page (frozen via page_ref_freeze). ;)

Well, you can.  Before you'd have to do it by calling free_unref_page(),
but now you can do it by calling free_frozen_pages().
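
Purely as an illustration (assuming the caller holds the only reference
and the page is already off any lists):

	if (page_ref_freeze(page, 1)) {
		/* refcount is now 0; give the page straight back */
		free_frozen_pages(page, 0);
	}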


* Re: [PATCH 2/6] mm/page_alloc: Rename free_the_page() to free_frozen_pages()
  2022-06-01 12:23     ` Matthew Wilcox
@ 2022-06-02  7:45       ` Miaohe Lin
  0 siblings, 0 replies; 20+ messages in thread
From: Miaohe Lin @ 2022-06-02  7:45 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On 2022/6/1 20:23, Matthew Wilcox wrote:
> On Wed, Jun 01, 2022 at 02:58:23PM +0800, Miaohe Lin wrote:
>> On 2022/5/31 23:06, Matthew Wilcox (Oracle) wrote:
>>> In preparation for making this function available outside page_alloc,
>>> rename it to free_frozen_pages(), which fits better with the other
>>> memory allocation/free functions.
>>>
>>> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>>
>> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
>>
>> Thanks!
>>
>> BTW: I thought it means we can free the temporarily frozen page (via page_ref_freeze) now
>> when I saw the name "free_frozen_pages". ;)
> 
> Well, you can.  Before you'd have to do it by calling free_unref_page(),
> but now you can do it by calling free_frozen_pages().

Sorry, my mistake. Indeed we can already do this, e.g. shrink_page_list does it via __remove_mapping
+ free_unref_page_list.

Thanks! :)

> 
> .
> 



* Re: [PATCH 5/6] slab: Allocate frozen pages
  2022-06-01 12:14       ` David Hildenbrand
@ 2022-08-09 10:37         ` Vlastimil Babka (SUSE)
  0 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka (SUSE) @ 2022-08-09 10:37 UTC (permalink / raw)
  To: David Hildenbrand, Matthew Wilcox; +Cc: linux-mm

On 6/1/22 14:14, David Hildenbrand wrote:
> On 31.05.22 19:33, Matthew Wilcox wrote:
>> On Tue, May 31, 2022 at 07:15:14PM +0200, David Hildenbrand wrote:
>>> On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote:
>>>> Since slab does not use the page refcount, it can allocate and
>>>> free frozen pages, saving one atomic operation per free.
>>>
>>> I assume that implies that pages that are actually allocated *from* the
>>> buddy now have a refcount == 0.
>> 
>> Yes.
>> 
>>> IIRC, page isolation code (e.g., !page_ref_count check in
>>> has_unmovable_pages()) assumes that any page with a refcount of 0 is
>>> essentially either already free (buddy) or is on its way of getting
>>> freed (!buddy).
>> 
>> That would be a bad assumption.  We already freeze pages for things like
>> splitting, migration, and replacement with a THP.  If the usage is just
>> an optimisation, then that's OK (and maybe the optimisation needs to be
>> tweaked to check PageSlab), but if the code depends on that being true,
>> it was already broken.
> 
> Yes, it's an optimization to not go ahead and mess with the migratetype
> of pageblocks (especially, setting it to MIGRATE_ISOLATE to later reset it
> to MIGRATE_MOVABLE) and so forth when it's obvious that there is
> unmovable data.
> 
> However, freezing the refcount -- as used in existing code -- is only a
> temporary thing. And it's frequently used on movable pages.

Agreed.

> If migration froze the refcount, the page is most certainly movable.
> THPs should be movable, so it doesn't matter if we froze the refcount or
> not. Pages that we collapse into a THP should be mostly LRU pages and,
> therefore, movable.
> 
> So the frozen refcount doesn't result in additional "false negatives" in
> e.g., has_unmovable_pages(), at least for these users.
> 
> 
>> 
>> For this particular case, I think has_unmovable_pages() is only called
>> for ZONE_MOVABLE and Slab never allocates from ZONE_MOVABLE, so it's
>> not an issue?
> 
> 
> has_unmovable_pages() is primarily useful for ZONE_NORMAL. For
> ZONE_MOVABLE, we really only check if there are any boot time
> allocations (PageReserved()).
> 
> For example, alloc_contig_range() implicitly calls has_unmovable_pages()
> when isolating pageblocks and ends up getting called on ZONE_NORMAL for
> example for runtime allocation of gigantic pages or via virtio-mem
> (nowadays even in pageblock granularity).
> 
> 
> Long story short, we should document accordingly what it actually means
> when we allocate without increasing the refcount, and that people should
> be careful with that. Regarding has_unmovable_pages() and frozen
> allocations, we might just be able to keep the existing detection
> unmodified by special-casing PageSlab() pages and detecting them as
> unmovable immediately.

I wonder if it makes sense to encourage long-term freezing anyway, if it
complicates other checks that could for now rely on the relatively simple
rules. For slab it might save some atomics, but allocating/freeing a new
slab page is already a very slow path, so that will be negligible.
So I wouldn't mind if only up to 4/6 of this series was merged.
OTOH once the API is available somebody will eventually use it for a
long-term frozen allocation (but they could achieve that now too, so maybe
not), so I agree about documenting the implications.

