linux-mm.kvack.org archive mirror
* [PATCH 00/24] Split page pools from struct page
@ 2022-11-30 22:07 Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 01/24] netmem: Create new type Matthew Wilcox (Oracle)
                   ` (26 more replies)
  0 siblings, 27 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

The MM subsystem is trying to reduce struct page to a single pointer.
The first step towards that is splitting struct page by its individual
users, as has already been done with folio and slab.  This attempt chooses
'netmem' as a name, but I am not even slightly committed to that name,
and will happily use another.

There are some relatively significant reductions in kernel text
size from these changes.  I'm not qualified to judge how they
might affect performance, but every call to put_page() includes
a call to compound_head(), which is now rather more complex
than it once was (at least in a distro config which enables
CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP).

I've only converted one user of the page_pool APIs to the new netmem
APIs; all the others continue to use the page-based ones.
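
To give a flavour of the result, here is a sketch (not taken from any
of these patches; the example_* names are hypothetical) of an rx refill
path using the netmem API added by this series:

        /* Sketch only; example_rxq and example_hw_post_buffer are made up. */
        static int example_rx_refill(struct example_rxq *rxq)
        {
                struct netmem *nmem;

                nmem = page_pool_alloc_netmem(rxq->pool, GFP_ATOMIC);
                if (!nmem)
                        return -ENOMEM;

                /* was: page_pool_get_dma_addr(page) */
                example_hw_post_buffer(rxq, netmem_get_dma_addr(nmem));
                return 0;
        }

On completion or error, the buffer goes back through
page_pool_put_netmem() (or the page_pool_put_page() wrapper for
unconverted callers).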

Uh, I see I left netmem_to_virt() as its own commit instead of squashing
it into "netmem: Add utility functions".  I'll fix that in the next
version, because I'm sure you'll want some changes anyway.

Happy to answer questions.

Matthew Wilcox (Oracle) (24):
  netmem: Create new type
  netmem: Add utility functions
  page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr()
  page_pool: Convert page_pool_release_page() to
    page_pool_release_netmem()
  page_pool: Start using netmem in allocation path.
  page_pool: Convert page_pool_return_page() to
    page_pool_return_netmem()
  page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem()
  page_pool: Convert pp_alloc_cache to contain netmem
  page_pool: Convert page_pool_defrag_page() to
    page_pool_defrag_netmem()
  page_pool: Convert page_pool_put_defragged_page() to netmem
  page_pool: Convert page_pool_empty_ring() to use netmem
  page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem()
  page_pool: Convert page_pool_dma_sync_for_device() to take a netmem
  page_pool: Convert page_pool_recycle_in_cache() to netmem
  page_pool: Remove __page_pool_put_page()
  page_pool: Use netmem in page_pool_drain_frag()
  page_pool: Convert page_pool_return_skb_page() to use netmem
  page_pool: Convert frag_page to frag_nmem
  xdp: Convert to netmem
  mm: Remove page pool members from struct page
  netmem_to_virt
  page_pool: Pass a netmem to init_callback()
  net: Add support for netmem in skb_frag
  mvneta: Convert to netmem

 drivers/net/ethernet/marvell/mvneta.c |  48 ++---
 include/linux/mm_types.h              |  22 ---
 include/linux/skbuff.h                |  11 ++
 include/net/page_pool.h               | 181 ++++++++++++++---
 include/trace/events/page_pool.h      |  28 +--
 net/bpf/test_run.c                    |   4 +-
 net/core/page_pool.c                  | 274 +++++++++++++-------------
 net/core/xdp.c                        |   7 +-
 8 files changed, 344 insertions(+), 231 deletions(-)


base-commit: 13ee7ef407cfcf63f4f047460ac5bb6ba5a3447d
-- 
2.35.1




* [PATCH 01/24] netmem: Create new type
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-12-05 14:42   ` Jesper Dangaard Brouer
  2022-11-30 22:07 ` [PATCH 02/24] netmem: Add utility functions Matthew Wilcox (Oracle)
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

As part of simplifying struct page, create a new netmem type which
mirrors the page_pool members in struct page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 813c93499f20..af6ff8c302a0 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -50,6 +50,47 @@
 				 PP_FLAG_DMA_SYNC_DEV |\
 				 PP_FLAG_PAGE_FRAG)
 
+/* page_pool used by netstack */
+struct netmem {
+	unsigned long flags;		/* Page flags */
+	/**
+	 * @pp_magic: magic value to avoid recycling non
+	 * page_pool allocated pages.
+	 */
+	unsigned long pp_magic;
+	struct page_pool *pp;
+	unsigned long _pp_mapping_pad;
+	unsigned long dma_addr;
+	union {
+		/**
+		 * dma_addr_upper: might require a 64-bit
+		 * value on 32-bit architectures.
+		 */
+		unsigned long dma_addr_upper;
+		/**
+		 * For frag page support, not supported in
+		 * 32-bit architectures with 64-bit DMA.
+		 */
+		atomic_long_t pp_frag_count;
+	};
+	atomic_t _mapcount;
+	atomic_t _refcount;
+};
+
+#define NETMEM_MATCH(pg, nm)						\
+	static_assert(offsetof(struct page, pg) == offsetof(struct netmem, nm))
+NETMEM_MATCH(flags, flags);
+NETMEM_MATCH(lru, pp_magic);
+NETMEM_MATCH(pp, pp);
+NETMEM_MATCH(mapping, _pp_mapping_pad);
+NETMEM_MATCH(dma_addr, dma_addr);
+NETMEM_MATCH(dma_addr_upper, dma_addr_upper);
+NETMEM_MATCH(pp_frag_count, pp_frag_count);
+NETMEM_MATCH(_mapcount, _mapcount);
+NETMEM_MATCH(_refcount, _refcount);
+#undef NETMEM_MATCH
+static_assert(sizeof(struct netmem) <= sizeof(struct page));
+
 /*
  * Fast allocation side cache array/stack
  *
-- 
2.35.1




* [PATCH 02/24] netmem: Add utility functions
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 01/24] netmem: Create new type Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 03/24] page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr() Matthew Wilcox (Oracle)
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

netmem_page() is implemented with _Generic() so that it preserves the
constness of its argument.  page_netmem() doesn't call compound_head()
because netmem users always use the head page; it does include a
debugging assert to check that this is true.
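
As an illustration (not part of the patch; the helper below is
hypothetical), the intended usage looks like this:

        static bool example_same_pfn(struct page *page,
                                     const struct netmem *cnmem)
        {
                /* VM_BUG_ON_PAGE fires here if passed a tail page */
                struct netmem *nmem = page_netmem(page);
                /* the _Generic keeps the const qualifier */
                const struct page *cpage = netmem_page(cnmem);

                return netmem_pfn(nmem) == page_to_pfn(cpage);
        }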

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 42 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index af6ff8c302a0..0ce20b95290b 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -91,6 +91,48 @@ NETMEM_MATCH(_refcount, _refcount);
 #undef NETMEM_MATCH
 static_assert(sizeof(struct netmem) <= sizeof(struct page));
 
+#define netmem_page(nmem) (_Generic((*nmem),				\
+	const struct netmem:	(const struct page *)nmem,		\
+	struct netmem:		(struct page *)nmem))
+
+static inline struct netmem *page_netmem(struct page *page)
+{
+	VM_BUG_ON_PAGE(PageTail(page), page);
+	return (struct netmem *)page;
+}
+
+static inline unsigned long netmem_pfn(const struct netmem *nmem)
+{
+	return page_to_pfn(netmem_page(nmem));
+}
+
+static inline unsigned long netmem_nid(const struct netmem *nmem)
+{
+	return page_to_nid(netmem_page(nmem));
+}
+
+static inline struct netmem *virt_to_netmem(const void *x)
+{
+	return page_netmem(virt_to_head_page(x));
+}
+
+static inline int netmem_ref_count(const struct netmem *nmem)
+{
+	return page_ref_count(netmem_page(nmem));
+}
+
+static inline void netmem_put(struct netmem *nmem)
+{
+	struct folio *folio = (struct folio *)nmem;
+
+	return folio_put(folio);
+}
+
+static inline bool netmem_is_pfmemalloc(const struct netmem *nmem)
+{
+	return nmem->pp_magic & BIT(1);
+}
+
 /*
  * Fast allocation side cache array/stack
  *
-- 
2.35.1




* [PATCH 03/24] page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 01/24] netmem: Create new type Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 02/24] netmem: Add utility functions Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 04/24] page_pool: Convert page_pool_release_page() to page_pool_release_netmem() Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Turn page_pool_set_dma_addr() and page_pool_get_dma_addr() into
wrappers.
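
For illustration only (this mirrors what page_pool_dma_map() already
does; the helper below is hypothetical, not part of the patch), a
netmem has its DMA address stored and read back like this:

        static bool example_map(struct page_pool *pool, struct netmem *nmem)
        {
                dma_addr_t dma;

                dma = dma_map_page_attrs(pool->p.dev, netmem_page(nmem), 0,
                                         PAGE_SIZE << pool->p.order,
                                         pool->p.dma_dir,
                                         DMA_ATTR_SKIP_CPU_SYNC);
                if (dma_mapping_error(pool->p.dev, dma))
                        return false;

                /* on 32-bit with 64-bit DMA, the upper bits land in dma_addr_upper */
                netmem_set_dma_addr(nmem, dma);
                return netmem_get_dma_addr(nmem) == dma;
        }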

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 0ce20b95290b..a68746a5b99c 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -427,21 +427,31 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 #define PAGE_POOL_DMA_USE_PP_FRAG_COUNT	\
 		(sizeof(dma_addr_t) > sizeof(unsigned long))
 
-static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
+static inline dma_addr_t netmem_get_dma_addr(struct netmem *nmem)
 {
-	dma_addr_t ret = page->dma_addr;
+	dma_addr_t ret = nmem->dma_addr;
 
 	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
-		ret |= (dma_addr_t)page->dma_addr_upper << 16 << 16;
+		ret |= (dma_addr_t)nmem->dma_addr_upper << 16 << 16;
 
 	return ret;
 }
 
-static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
+static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
+{
+	return netmem_get_dma_addr(page_netmem(page));
+}
+
+static inline void netmem_set_dma_addr(struct netmem *nmem, dma_addr_t addr)
 {
-	page->dma_addr = addr;
+	nmem->dma_addr = addr;
 	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
-		page->dma_addr_upper = upper_32_bits(addr);
+		nmem->dma_addr_upper = upper_32_bits(addr);
+}
+
+static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
+{
+	netmem_set_dma_addr(page_netmem(page), addr);
 }
 
 static inline bool is_page_pool_compiled_in(void)
-- 
2.35.1




* [PATCH 04/24] page_pool: Convert page_pool_release_page() to page_pool_release_netmem()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 03/24] page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 05/24] page_pool: Start using netmem in allocation path Matthew Wilcox (Oracle)
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Also convert page_pool_clear_pp_info() and trace_page_pool_state_release()
to take a netmem.  Include a wrapper for page_pool_release_page() to
avoid converting all callers.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h          | 14 ++++++++++----
 include/trace/events/page_pool.h | 14 +++++++-------
 net/core/page_pool.c             | 18 +++++++++---------
 3 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index a68746a5b99c..453797f9cb90 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -18,7 +18,7 @@
  *
  * API keeps track of in-flight pages, in-order to let API user know
  * when it is safe to dealloactor page_pool object.  Thus, API users
- * must make sure to call page_pool_release_page() when a page is
+ * must make sure to call page_pool_release_netmem() when a page is
  * "leaving" the page_pool.  Or call page_pool_put_page() where
  * appropiate.  For maintaining correct accounting.
  *
@@ -332,7 +332,7 @@ struct xdp_mem_info;
 void page_pool_destroy(struct page_pool *pool);
 void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
 			   struct xdp_mem_info *mem);
-void page_pool_release_page(struct page_pool *pool, struct page *page);
+void page_pool_release_netmem(struct page_pool *pool, struct netmem *nmem);
 void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 			     int count);
 #else
@@ -345,8 +345,8 @@ static inline void page_pool_use_xdp_mem(struct page_pool *pool,
 					 struct xdp_mem_info *mem)
 {
 }
-static inline void page_pool_release_page(struct page_pool *pool,
-					  struct page *page)
+static inline void page_pool_release_netmem(struct page_pool *pool,
+					  struct netmem *nmem)
 {
 }
 
@@ -356,6 +356,12 @@ static inline void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 }
 #endif
 
+static inline void page_pool_release_page(struct page_pool *pool,
+					struct page *page)
+{
+	page_pool_release_netmem(pool, page_netmem(page));
+}
+
 void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
 				  unsigned int dma_sync_size,
 				  bool allow_direct);
diff --git a/include/trace/events/page_pool.h b/include/trace/events/page_pool.h
index ca534501158b..113aad0c9e5b 100644
--- a/include/trace/events/page_pool.h
+++ b/include/trace/events/page_pool.h
@@ -42,26 +42,26 @@ TRACE_EVENT(page_pool_release,
 TRACE_EVENT(page_pool_state_release,
 
 	TP_PROTO(const struct page_pool *pool,
-		 const struct page *page, u32 release),
+		 const struct netmem *nmem, u32 release),
 
-	TP_ARGS(pool, page, release),
+	TP_ARGS(pool, nmem, release),
 
 	TP_STRUCT__entry(
 		__field(const struct page_pool *,	pool)
-		__field(const struct page *,		page)
+		__field(const struct netmem *,		nmem)
 		__field(u32,				release)
 		__field(unsigned long,			pfn)
 	),
 
 	TP_fast_assign(
 		__entry->pool		= pool;
-		__entry->page		= page;
+		__entry->nmem		= nmem;
 		__entry->release	= release;
-		__entry->pfn		= page_to_pfn(page);
+		__entry->pfn		= netmem_pfn(nmem);
 	),
 
-	TP_printk("page_pool=%p page=%p pfn=0x%lx release=%u",
-		  __entry->pool, __entry->page, __entry->pfn, __entry->release)
+	TP_printk("page_pool=%p nmem=%p pfn=0x%lx release=%u",
+		  __entry->pool, __entry->nmem, __entry->pfn, __entry->release)
 );
 
 TRACE_EVENT(page_pool_state_hold,
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 9b203d8660e4..437241aba5a7 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -336,10 +336,10 @@ static void page_pool_set_pp_info(struct page_pool *pool,
 		pool->p.init_callback(page, pool->p.init_arg);
 }
 
-static void page_pool_clear_pp_info(struct page *page)
+static void page_pool_clear_pp_info(struct netmem *nmem)
 {
-	page->pp_magic = 0;
-	page->pp = NULL;
+	nmem->pp_magic = 0;
+	nmem->pp = NULL;
 }
 
 static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
@@ -467,7 +467,7 @@ static s32 page_pool_inflight(struct page_pool *pool)
  * a regular page (that will eventually be returned to the normal
  * page-allocator via put_page).
  */
-void page_pool_release_page(struct page_pool *pool, struct page *page)
+void page_pool_release_netmem(struct page_pool *pool, struct netmem *nmem)
 {
 	dma_addr_t dma;
 	int count;
@@ -478,23 +478,23 @@ void page_pool_release_page(struct page_pool *pool, struct page *page)
 		 */
 		goto skip_dma_unmap;
 
-	dma = page_pool_get_dma_addr(page);
+	dma = netmem_get_dma_addr(nmem);
 
 	/* When page is unmapped, it cannot be returned to our pool */
 	dma_unmap_page_attrs(pool->p.dev, dma,
 			     PAGE_SIZE << pool->p.order, pool->p.dma_dir,
 			     DMA_ATTR_SKIP_CPU_SYNC);
-	page_pool_set_dma_addr(page, 0);
+	netmem_set_dma_addr(nmem, 0);
 skip_dma_unmap:
-	page_pool_clear_pp_info(page);
+	page_pool_clear_pp_info(nmem);
 
 	/* This may be the last page returned, releasing the pool, so
 	 * it is not safe to reference pool afterwards.
 	 */
 	count = atomic_inc_return_relaxed(&pool->pages_state_release_cnt);
-	trace_page_pool_state_release(pool, page, count);
+	trace_page_pool_state_release(pool, nmem, count);
 }
-EXPORT_SYMBOL(page_pool_release_page);
+EXPORT_SYMBOL(page_pool_release_netmem);
 
 /* Return a page to the page allocator, cleaning up our state */
 static void page_pool_return_page(struct page_pool *pool, struct page *page)
-- 
2.35.1




* [PATCH 05/24] page_pool: Start using netmem in allocation path.
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 04/24] page_pool: Convert page_pool_release_page() to page_pool_release_netmem() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 06/24] page_pool: Convert page_pool_return_page() to page_pool_return_netmem() Matthew Wilcox (Oracle)
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Convert __page_pool_alloc_page_order() and __page_pool_alloc_pages_slow()
to use netmem internally.  This removes a couple of calls
to compound_head() that are hidden inside put_page().
Convert trace_page_pool_state_hold(), page_pool_dma_map() and
page_pool_set_pp_info() to take a netmem argument.

Saves 83 bytes of text in __page_pool_alloc_page_order() and 98 in
__page_pool_alloc_pages_slow() for a total of 181 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/trace/events/page_pool.h | 14 +++++------
 net/core/page_pool.c             | 42 +++++++++++++++++---------------
 2 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/include/trace/events/page_pool.h b/include/trace/events/page_pool.h
index 113aad0c9e5b..d1237a7ce481 100644
--- a/include/trace/events/page_pool.h
+++ b/include/trace/events/page_pool.h
@@ -67,26 +67,26 @@ TRACE_EVENT(page_pool_state_release,
 TRACE_EVENT(page_pool_state_hold,
 
 	TP_PROTO(const struct page_pool *pool,
-		 const struct page *page, u32 hold),
+		 const struct netmem *nmem, u32 hold),
 
-	TP_ARGS(pool, page, hold),
+	TP_ARGS(pool, nmem, hold),
 
 	TP_STRUCT__entry(
 		__field(const struct page_pool *,	pool)
-		__field(const struct page *,		page)
+		__field(const struct netmem *,		nmem)
 		__field(u32,				hold)
 		__field(unsigned long,			pfn)
 	),
 
 	TP_fast_assign(
 		__entry->pool	= pool;
-		__entry->page	= page;
+		__entry->nmem	= nmem;
 		__entry->hold	= hold;
-		__entry->pfn	= page_to_pfn(page);
+		__entry->pfn	= netmem_pfn(nmem);
 	),
 
-	TP_printk("page_pool=%p page=%p pfn=0x%lx hold=%u",
-		  __entry->pool, __entry->page, __entry->pfn, __entry->hold)
+	TP_printk("page_pool=%p netmem=%p pfn=0x%lx hold=%u",
+		  __entry->pool, __entry->nmem, __entry->pfn, __entry->hold)
 );
 
 TRACE_EVENT(page_pool_update_nid,
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 437241aba5a7..4e985502c569 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -304,8 +304,9 @@ static void page_pool_dma_sync_for_device(struct page_pool *pool,
 					 pool->p.dma_dir);
 }
 
-static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
+static bool page_pool_dma_map(struct page_pool *pool, struct netmem *nmem)
 {
+	struct page *page = netmem_page(nmem);
 	dma_addr_t dma;
 
 	/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
@@ -328,12 +329,12 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
 }
 
 static void page_pool_set_pp_info(struct page_pool *pool,
-				  struct page *page)
+				  struct netmem *nmem)
 {
-	page->pp = pool;
-	page->pp_magic |= PP_SIGNATURE;
+	nmem->pp = pool;
+	nmem->pp_magic |= PP_SIGNATURE;
 	if (pool->p.init_callback)
-		pool->p.init_callback(page, pool->p.init_arg);
+		pool->p.init_callback(netmem_page(nmem), pool->p.init_arg);
 }
 
 static void page_pool_clear_pp_info(struct netmem *nmem)
@@ -345,26 +346,26 @@ static void page_pool_clear_pp_info(struct netmem *nmem)
 static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 						 gfp_t gfp)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	gfp |= __GFP_COMP;
-	page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
-	if (unlikely(!page))
+	nmem = page_netmem(alloc_pages_node(pool->p.nid, gfp, pool->p.order));
+	if (unlikely(!nmem))
 		return NULL;
 
 	if ((pool->p.flags & PP_FLAG_DMA_MAP) &&
-	    unlikely(!page_pool_dma_map(pool, page))) {
-		put_page(page);
+	    unlikely(!page_pool_dma_map(pool, nmem))) {
+		netmem_put(nmem);
 		return NULL;
 	}
 
 	alloc_stat_inc(pool, slow_high_order);
-	page_pool_set_pp_info(pool, page);
+	page_pool_set_pp_info(pool, nmem);
 
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
-	trace_page_pool_state_hold(pool, page, pool->pages_state_hold_cnt);
-	return page;
+	trace_page_pool_state_hold(pool, nmem, pool->pages_state_hold_cnt);
+	return netmem_page(nmem);
 }
 
 /* slow path */
@@ -398,18 +399,18 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	 * page element have not been (possibly) DMA mapped.
 	 */
 	for (i = 0; i < nr_pages; i++) {
-		page = pool->alloc.cache[i];
+		struct netmem *nmem = page_netmem(pool->alloc.cache[i]);
 		if ((pp_flags & PP_FLAG_DMA_MAP) &&
-		    unlikely(!page_pool_dma_map(pool, page))) {
-			put_page(page);
+		    unlikely(!page_pool_dma_map(pool, nmem))) {
+			netmem_put(nmem);
 			continue;
 		}
 
-		page_pool_set_pp_info(pool, page);
-		pool->alloc.cache[pool->alloc.count++] = page;
+		page_pool_set_pp_info(pool, nmem);
+		pool->alloc.cache[pool->alloc.count++] = netmem_page(nmem);
 		/* Track how many pages are held 'in-flight' */
 		pool->pages_state_hold_cnt++;
-		trace_page_pool_state_hold(pool, page,
+		trace_page_pool_state_hold(pool, nmem,
 					   pool->pages_state_hold_cnt);
 	}
 
@@ -421,7 +422,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 		page = NULL;
 	}
 
-	/* When page just alloc'ed is should/must have refcnt 1. */
+	/* When page just allocated it should have refcnt 1 (but may have
+	 * speculative references) */
 	return page;
 }
 
-- 
2.35.1




* [PATCH 06/24] page_pool: Convert page_pool_return_page() to page_pool_return_netmem()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 05/24] page_pool: Start using netmem in allocation path Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 07/24] page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem() Matthew Wilcox (Oracle)
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Removes a call to compound_head(), saving 464 bytes of kernel text
as page_pool_return_page() is inlined seven times.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 4e985502c569..b606952773a6 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -220,7 +220,13 @@ struct page_pool *page_pool_create(const struct page_pool_params *params)
 }
 EXPORT_SYMBOL(page_pool_create);
 
-static void page_pool_return_page(struct page_pool *pool, struct page *page);
+static void page_pool_return_netmem(struct page_pool *pool, struct netmem *nm);
+
+static inline
+void page_pool_return_page(struct page_pool *pool, struct page *page)
+{
+	page_pool_return_netmem(pool, page_netmem(page));
+}
 
 noinline
 static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
@@ -499,11 +505,11 @@ void page_pool_release_netmem(struct page_pool *pool, struct netmem *nmem)
 EXPORT_SYMBOL(page_pool_release_netmem);
 
 /* Return a page to the page allocator, cleaning up our state */
-static void page_pool_return_page(struct page_pool *pool, struct page *page)
+static void page_pool_return_netmem(struct page_pool *pool, struct netmem *nmem)
 {
-	page_pool_release_page(pool, page);
+	page_pool_release_netmem(pool, nmem);
 
-	put_page(page);
+	netmem_put(nmem);
 	/* An optimization would be to call __free_pages(page, pool->p.order)
 	 * knowing page is not part of page-cache (thus avoiding a
 	 * __page_cache_release() call).
-- 
2.35.1




* [PATCH 07/24] page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 06/24] page_pool: Convert page_pool_return_page() to page_pool_return_netmem() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 08/24] page_pool: Convert pp_alloc_cache to contain netmem Matthew Wilcox (Oracle)
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Removes the call to compound_head() hidden in put_page(), which
saves 169 bytes of kernel text as __page_pool_put_page() is
inlined twice.
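
A sketch of where the saving comes from (illustration only, not part of
the patch): a netmem is always a head page, so netmem_put() can cast to
a folio and call folio_put() directly, whereas put_page() first has to
find the head page.

        /* Hypothetical helper purely to show the difference. */
        static inline void example_drop_refs(struct page *page,
                                             struct netmem *nmem)
        {
                put_page(page);         /* hidden compound_head() call */
                netmem_put(nmem);       /* goes straight to folio_put() */
        }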

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index b606952773a6..8f3f7cc5a2d5 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -558,8 +558,8 @@ static bool page_pool_recycle_in_cache(struct page *page,
  * If the page refcnt != 1, then the page will be returned to memory
  * subsystem.
  */
-static __always_inline struct page *
-__page_pool_put_page(struct page_pool *pool, struct page *page,
+static __always_inline struct netmem *
+__page_pool_put_netmem(struct page_pool *pool, struct netmem *nmem,
 		     unsigned int dma_sync_size, bool allow_direct)
 {
 	/* This allocator is optimized for the XDP mode that uses
@@ -571,19 +571,20 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
 	 * page is NOT reusable when allocated when system is under
 	 * some pressure. (page_is_pfmemalloc)
 	 */
-	if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {
-		/* Read barrier done in page_ref_count / READ_ONCE */
+	if (likely(netmem_ref_count(nmem) == 1 &&
+		   !netmem_is_pfmemalloc(nmem))) {
+		/* Read barrier done in netmem_ref_count / READ_ONCE */
 
 		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
-			page_pool_dma_sync_for_device(pool, page,
+			page_pool_dma_sync_for_device(pool, netmem_page(nmem),
 						      dma_sync_size);
 
 		if (allow_direct && in_serving_softirq() &&
-		    page_pool_recycle_in_cache(page, pool))
+		    page_pool_recycle_in_cache(netmem_page(nmem), pool))
 			return NULL;
 
 		/* Page found as candidate for recycling */
-		return page;
+		return nmem;
 	}
 	/* Fallback/non-XDP mode: API user have elevated refcnt.
 	 *
@@ -599,13 +600,21 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
 	 * will be invoking put_page.
 	 */
 	recycle_stat_inc(pool, released_refcnt);
-	/* Do not replace this with page_pool_return_page() */
-	page_pool_release_page(pool, page);
-	put_page(page);
+	/* Do not replace this with page_pool_return_netmem() */
+	page_pool_release_netmem(pool, nmem);
+	netmem_put(nmem);
 
 	return NULL;
 }
 
+static __always_inline struct page *
+__page_pool_put_page(struct page_pool *pool, struct page *page,
+		     unsigned int dma_sync_size, bool allow_direct)
+{
+	return netmem_page(__page_pool_put_netmem(pool, page_netmem(page),
+						dma_sync_size, allow_direct));
+}
+
 void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
 				  unsigned int dma_sync_size, bool allow_direct)
 {
-- 
2.35.1




* [PATCH 08/24] page_pool: Convert pp_alloc_cache to contain netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 07/24] page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 09/24] page_pool: Convert page_pool_defrag_page() to page_pool_defrag_netmem() Matthew Wilcox (Oracle)
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Change the type of the pp_alloc_cache array from page to netmem.
It works out well to convert page_pool_refill_alloc_cache() to
return a netmem instead of a page as part of this commit.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h |  2 +-
 net/core/page_pool.c    | 52 ++++++++++++++++++++---------------------
 2 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 453797f9cb90..88eb5be77b2c 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -151,7 +151,7 @@ static inline bool netmem_is_pfmemalloc(const struct netmem *nmem)
 #define PP_ALLOC_CACHE_REFILL	64
 struct pp_alloc_cache {
 	u32 count;
-	struct page *cache[PP_ALLOC_CACHE_SIZE];
+	struct netmem *cache[PP_ALLOC_CACHE_SIZE];
 };
 
 struct page_pool_params {
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 8f3f7cc5a2d5..c54217ce6b77 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -229,10 +229,10 @@ void page_pool_return_page(struct page_pool *pool, struct page *page)
 }
 
 noinline
-static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
+static struct netmem *page_pool_refill_alloc_cache(struct page_pool *pool)
 {
 	struct ptr_ring *r = &pool->ring;
-	struct page *page;
+	struct netmem *nmem;
 	int pref_nid; /* preferred NUMA node */
 
 	/* Quicker fallback, avoid locks when ring is empty */
@@ -253,49 +253,49 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
 
 	/* Refill alloc array, but only if NUMA match */
 	do {
-		page = __ptr_ring_consume(r);
-		if (unlikely(!page))
+		nmem = __ptr_ring_consume(r);
+		if (unlikely(!nmem))
 			break;
 
-		if (likely(page_to_nid(page) == pref_nid)) {
-			pool->alloc.cache[pool->alloc.count++] = page;
+		if (likely(netmem_nid(nmem) == pref_nid)) {
+			pool->alloc.cache[pool->alloc.count++] = nmem;
 		} else {
 			/* NUMA mismatch;
 			 * (1) release 1 page to page-allocator and
 			 * (2) break out to fallthrough to alloc_pages_node.
 			 * This limit stress on page buddy alloactor.
 			 */
-			page_pool_return_page(pool, page);
+			page_pool_return_netmem(pool, nmem);
 			alloc_stat_inc(pool, waive);
-			page = NULL;
+			nmem = NULL;
 			break;
 		}
 	} while (pool->alloc.count < PP_ALLOC_CACHE_REFILL);
 
 	/* Return last page */
 	if (likely(pool->alloc.count > 0)) {
-		page = pool->alloc.cache[--pool->alloc.count];
+		nmem = pool->alloc.cache[--pool->alloc.count];
 		alloc_stat_inc(pool, refill);
 	}
 
-	return page;
+	return nmem;
 }
 
 /* fast path */
 static struct page *__page_pool_get_cached(struct page_pool *pool)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	/* Caller MUST guarantee safe non-concurrent access, e.g. softirq */
 	if (likely(pool->alloc.count)) {
 		/* Fast-path */
-		page = pool->alloc.cache[--pool->alloc.count];
+		nmem = pool->alloc.cache[--pool->alloc.count];
 		alloc_stat_inc(pool, fast);
 	} else {
-		page = page_pool_refill_alloc_cache(pool);
+		nmem = page_pool_refill_alloc_cache(pool);
 	}
 
-	return page;
+	return netmem_page(nmem);
 }
 
 static void page_pool_dma_sync_for_device(struct page_pool *pool,
@@ -391,13 +391,13 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 
 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
 	if (unlikely(pool->alloc.count > 0))
-		return pool->alloc.cache[--pool->alloc.count];
+		return netmem_page(pool->alloc.cache[--pool->alloc.count]);
 
 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
 
 	nr_pages = alloc_pages_bulk_array_node(gfp, pool->p.nid, bulk,
-					       pool->alloc.cache);
+					(struct page **)pool->alloc.cache);
 	if (unlikely(!nr_pages))
 		return NULL;
 
@@ -405,7 +405,7 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	 * page element have not been (possibly) DMA mapped.
 	 */
 	for (i = 0; i < nr_pages; i++) {
-		struct netmem *nmem = page_netmem(pool->alloc.cache[i]);
+		struct netmem *nmem = pool->alloc.cache[i];
 		if ((pp_flags & PP_FLAG_DMA_MAP) &&
 		    unlikely(!page_pool_dma_map(pool, nmem))) {
 			netmem_put(nmem);
@@ -413,7 +413,7 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 		}
 
 		page_pool_set_pp_info(pool, nmem);
-		pool->alloc.cache[pool->alloc.count++] = netmem_page(nmem);
+		pool->alloc.cache[pool->alloc.count++] = nmem;
 		/* Track how many pages are held 'in-flight' */
 		pool->pages_state_hold_cnt++;
 		trace_page_pool_state_hold(pool, nmem,
@@ -422,7 +422,7 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 
 	/* Return last page */
 	if (likely(pool->alloc.count > 0)) {
-		page = pool->alloc.cache[--pool->alloc.count];
+		page = netmem_page(pool->alloc.cache[--pool->alloc.count]);
 		alloc_stat_inc(pool, slow);
 	} else {
 		page = NULL;
@@ -547,7 +547,7 @@ static bool page_pool_recycle_in_cache(struct page *page,
 	}
 
 	/* Caller MUST have verified/know (page_ref_count(page) == 1) */
-	pool->alloc.cache[pool->alloc.count++] = page;
+	pool->alloc.cache[pool->alloc.count++] = page_netmem(page);
 	recycle_stat_inc(pool, cached);
 	return true;
 }
@@ -785,7 +785,7 @@ static void page_pool_free(struct page_pool *pool)
 
 static void page_pool_empty_alloc_cache_once(struct page_pool *pool)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	if (pool->destroy_cnt)
 		return;
@@ -795,8 +795,8 @@ static void page_pool_empty_alloc_cache_once(struct page_pool *pool)
 	 * call concurrently.
 	 */
 	while (pool->alloc.count) {
-		page = pool->alloc.cache[--pool->alloc.count];
-		page_pool_return_page(pool, page);
+		nmem = pool->alloc.cache[--pool->alloc.count];
+		page_pool_return_netmem(pool, nmem);
 	}
 }
 
@@ -878,15 +878,15 @@ EXPORT_SYMBOL(page_pool_destroy);
 /* Caller must provide appropriate safe context, e.g. NAPI. */
 void page_pool_update_nid(struct page_pool *pool, int new_nid)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	trace_page_pool_update_nid(pool, new_nid);
 	pool->p.nid = new_nid;
 
 	/* Flush pool alloc cache, as refill will check NUMA node */
 	while (pool->alloc.count) {
-		page = pool->alloc.cache[--pool->alloc.count];
-		page_pool_return_page(pool, page);
+		nmem = pool->alloc.cache[--pool->alloc.count];
+		page_pool_return_netmem(pool, nmem);
 	}
 }
 EXPORT_SYMBOL(page_pool_update_nid);
-- 
2.35.1




* [PATCH 09/24] page_pool: Convert page_pool_defrag_page() to page_pool_defrag_netmem()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 08/24] page_pool: Convert pp_alloc_cache to contain netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 10/24] page_pool: Convert page_pool_put_defragged_page() to netmem Matthew Wilcox (Oracle)
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Add a page_pool_defrag_page() wrapper.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 88eb5be77b2c..bfb77b75f333 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -371,7 +371,7 @@ static inline void page_pool_fragment_page(struct page *page, long nr)
 	atomic_long_set(&page->pp_frag_count, nr);
 }
 
-static inline long page_pool_defrag_page(struct page *page, long nr)
+static inline long page_pool_defrag_netmem(struct netmem *nmem, long nr)
 {
 	long ret;
 
@@ -384,14 +384,19 @@ static inline long page_pool_defrag_page(struct page *page, long nr)
 	 * especially when dealing with a page that may be partitioned
 	 * into only 2 or 3 pieces.
 	 */
-	if (atomic_long_read(&page->pp_frag_count) == nr)
+	if (atomic_long_read(&nmem->pp_frag_count) == nr)
 		return 0;
 
-	ret = atomic_long_sub_return(nr, &page->pp_frag_count);
+	ret = atomic_long_sub_return(nr, &nmem->pp_frag_count);
 	WARN_ON(ret < 0);
 	return ret;
 }
 
+static inline long page_pool_defrag_page(struct page *page, long nr)
+{
+	return page_pool_defrag_netmem(page_netmem(page), nr);
+}
+
 static inline bool page_pool_is_last_frag(struct page_pool *pool,
 					  struct page *page)
 {
-- 
2.35.1




* [PATCH 10/24] page_pool: Convert page_pool_put_defragged_page() to netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 09/24] page_pool: Convert page_pool_defrag_page() to page_pool_defrag_netmem() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 11/24] page_pool: Convert page_pool_empty_ring() to use netmem Matthew Wilcox (Oracle)
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Also convert page_pool_is_last_frag(), page_pool_put_page(),
page_pool_recycle_in_ring() and use netmem in page_pool_put_page_bulk().
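
A hypothetical caller (not part of the patch) returning a received
buffer to the pool would now look like:

        static void example_recycle(struct page_pool *pool, void *va)
        {
                struct netmem *nmem = virt_to_netmem(va);

                /* frag-aware: only the last user really returns the netmem */
                page_pool_put_netmem(pool, nmem, -1, false);
        }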

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 23 ++++++++++++++++-------
 net/core/page_pool.c    | 29 +++++++++++++++--------------
 2 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index bfb77b75f333..db617073025e 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -362,7 +362,7 @@ static inline void page_pool_release_page(struct page_pool *pool,
 	page_pool_release_netmem(pool, page_netmem(page));
 }
 
-void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
+void page_pool_put_defragged_netmem(struct page_pool *pool, struct netmem *nmem,
 				  unsigned int dma_sync_size,
 				  bool allow_direct);
 
@@ -398,15 +398,15 @@ static inline long page_pool_defrag_page(struct page *page, long nr)
 }
 
 static inline bool page_pool_is_last_frag(struct page_pool *pool,
-					  struct page *page)
+					  struct netmem *nmem)
 {
 	/* If fragments aren't enabled or count is 0 we were the last user */
 	return !(pool->p.flags & PP_FLAG_PAGE_FRAG) ||
-	       (page_pool_defrag_page(page, 1) == 0);
+	       (page_pool_defrag_netmem(nmem, 1) == 0);
 }
 
-static inline void page_pool_put_page(struct page_pool *pool,
-				      struct page *page,
+static inline void page_pool_put_netmem(struct page_pool *pool,
+				      struct netmem *nmem,
 				      unsigned int dma_sync_size,
 				      bool allow_direct)
 {
@@ -414,13 +414,22 @@ static inline void page_pool_put_page(struct page_pool *pool,
 	 * allow registering MEM_TYPE_PAGE_POOL, but shield linker.
 	 */
 #ifdef CONFIG_PAGE_POOL
-	if (!page_pool_is_last_frag(pool, page))
+	if (!page_pool_is_last_frag(pool, nmem))
 		return;
 
-	page_pool_put_defragged_page(pool, page, dma_sync_size, allow_direct);
+	page_pool_put_defragged_netmem(pool, nmem, dma_sync_size, allow_direct);
 #endif
 }
 
+static inline void page_pool_put_page(struct page_pool *pool,
+				      struct page *page,
+				      unsigned int dma_sync_size,
+				      bool allow_direct)
+{
+	page_pool_put_netmem(pool, page_netmem(page), dma_sync_size,
+				allow_direct);
+}
+
 /* Same as above but will try to sync the entire area pool->max_len */
 static inline void page_pool_put_full_page(struct page_pool *pool,
 					   struct page *page, bool allow_direct)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c54217ce6b77..e727a74504c2 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -516,14 +516,15 @@ static void page_pool_return_netmem(struct page_pool *pool, struct netmem *nmem)
 	 */
 }
 
-static bool page_pool_recycle_in_ring(struct page_pool *pool, struct page *page)
+static bool page_pool_recycle_in_ring(struct page_pool *pool,
+					struct netmem *nmem)
 {
 	int ret;
 	/* BH protection not needed if current is serving softirq */
 	if (in_serving_softirq())
-		ret = ptr_ring_produce(&pool->ring, page);
+		ret = ptr_ring_produce(&pool->ring, nmem);
 	else
-		ret = ptr_ring_produce_bh(&pool->ring, page);
+		ret = ptr_ring_produce_bh(&pool->ring, nmem);
 
 	if (!ret) {
 		recycle_stat_inc(pool, ring);
@@ -615,17 +616,17 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
 						dma_sync_size, allow_direct));
 }
 
-void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
+void page_pool_put_defragged_netmem(struct page_pool *pool, struct netmem *nmem,
 				  unsigned int dma_sync_size, bool allow_direct)
 {
-	page = __page_pool_put_page(pool, page, dma_sync_size, allow_direct);
-	if (page && !page_pool_recycle_in_ring(pool, page)) {
+	nmem = __page_pool_put_netmem(pool, nmem, dma_sync_size, allow_direct);
+	if (nmem && !page_pool_recycle_in_ring(pool, nmem)) {
 		/* Cache full, fallback to free pages */
 		recycle_stat_inc(pool, ring_full);
-		page_pool_return_page(pool, page);
+		page_pool_return_netmem(pool, nmem);
 	}
 }
-EXPORT_SYMBOL(page_pool_put_defragged_page);
+EXPORT_SYMBOL(page_pool_put_defragged_netmem);
 
 /* Caller must not use data area after call, as this function overwrites it */
 void page_pool_put_page_bulk(struct page_pool *pool, void **data,
@@ -634,16 +635,16 @@ void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 	int i, bulk_len = 0;
 
 	for (i = 0; i < count; i++) {
-		struct page *page = virt_to_head_page(data[i]);
+		struct netmem *nmem = virt_to_netmem(data[i]);
 
 		/* It is not the last user for the page frag case */
-		if (!page_pool_is_last_frag(pool, page))
+		if (!page_pool_is_last_frag(pool, nmem))
 			continue;
 
-		page = __page_pool_put_page(pool, page, -1, false);
+		nmem = __page_pool_put_netmem(pool, nmem, -1, false);
 		/* Approved for bulk recycling in ptr_ring cache */
-		if (page)
-			data[bulk_len++] = page;
+		if (nmem)
+			data[bulk_len++] = nmem;
 	}
 
 	if (unlikely(!bulk_len))
@@ -669,7 +670,7 @@ void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 	 * since put_page() with refcnt == 1 can be an expensive operation
 	 */
 	for (; i < bulk_len; i++)
-		page_pool_return_page(pool, data[i]);
+		page_pool_return_netmem(pool, data[i]);
 }
 EXPORT_SYMBOL(page_pool_put_page_bulk);
 
-- 
2.35.1




* [PATCH 11/24] page_pool: Convert page_pool_empty_ring() to use netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 10/24] page_pool: Convert page_pool_put_defragged_page() to netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-12-02 21:25   ` Alexander H Duyck
  2022-11-30 22:07 ` [PATCH 12/24] page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem() Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Retrieve a netmem from the ptr_ring instead of a page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index e727a74504c2..7a77e3220205 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -755,16 +755,16 @@ EXPORT_SYMBOL(page_pool_alloc_frag);
 
 static void page_pool_empty_ring(struct page_pool *pool)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	/* Empty recycle ring */
-	while ((page = ptr_ring_consume_bh(&pool->ring))) {
+	while ((nmem = ptr_ring_consume_bh(&pool->ring)) != NULL) {
 		/* Verify the refcnt invariant of cached pages */
-		if (!(page_ref_count(page) == 1))
+		if (!(netmem_ref_count(nmem) == 1))
 			pr_crit("%s() page_pool refcnt %d violation\n",
-				__func__, page_ref_count(page));
+				__func__, netmem_ref_count(nmem));
 
-		page_pool_return_page(pool, page);
+		page_pool_return_netmem(pool, nmem);
 	}
 }
 
-- 
2.35.1




* [PATCH 12/24] page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 11/24] page_pool: Convert page_pool_empty_ring() to use netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 13/24] page_pool: Convert page_pool_dma_sync_for_device() to take a netmem Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Add a page_pool_alloc_pages() compatibility wrapper.  Also convert
__page_pool_alloc_pages_slow() to __page_pool_alloc_netmem_slow()
and __page_pool_alloc_page_order() to __page_pool_alloc_netmem().
__page_pool_get_cached() is converted to return a netmem.
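
As an illustration (hypothetical caller, not part of the patch),
existing users keep working through the wrapper while converted users
ask for a netmem directly:

        static struct netmem *example_alloc(struct page_pool *pool)
        {
                /* unconverted callers are unchanged */
                struct page *page = page_pool_dev_alloc_pages(pool);

                if (page)
                        page_pool_put_full_page(pool, page, false);

                /* converted callers get the netmem itself */
                return page_pool_alloc_netmem(pool, GFP_ATOMIC);
        }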

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h |  8 +++++++-
 net/core/page_pool.c    | 39 +++++++++++++++++++--------------------
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index db617073025e..4c730591de46 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -292,7 +292,13 @@ struct page_pool {
 	u64 destroy_cnt;
 };
 
-struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp);
+struct netmem *page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp);
+
+static inline
+struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
+{
+	return netmem_page(page_pool_alloc_netmem(pool, gfp));
+}
 
 static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool)
 {
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 7a77e3220205..efe9f1471caa 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -282,7 +282,7 @@ static struct netmem *page_pool_refill_alloc_cache(struct page_pool *pool)
 }
 
 /* fast path */
-static struct page *__page_pool_get_cached(struct page_pool *pool)
+static struct netmem *__page_pool_get_cached(struct page_pool *pool)
 {
 	struct netmem *nmem;
 
@@ -295,7 +295,7 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
 		nmem = page_pool_refill_alloc_cache(pool);
 	}
 
-	return netmem_page(nmem);
+	return nmem;
 }
 
 static void page_pool_dma_sync_for_device(struct page_pool *pool,
@@ -349,8 +349,8 @@ static void page_pool_clear_pp_info(struct netmem *nmem)
 	nmem->pp = NULL;
 }
 
-static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
-						 gfp_t gfp)
+static
+struct netmem *__page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp)
 {
 	struct netmem *nmem;
 
@@ -371,27 +371,27 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
 	trace_page_pool_state_hold(pool, nmem, pool->pages_state_hold_cnt);
-	return netmem_page(nmem);
+	return nmem;
 }
 
 /* slow path */
 noinline
-static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
+static struct netmem *__page_pool_alloc_netmem_slow(struct page_pool *pool,
 						 gfp_t gfp)
 {
 	const int bulk = PP_ALLOC_CACHE_REFILL;
 	unsigned int pp_flags = pool->p.flags;
 	unsigned int pp_order = pool->p.order;
-	struct page *page;
+	struct netmem *nmem;
 	int i, nr_pages;
 
 	/* Don't support bulk alloc for high-order pages */
 	if (unlikely(pp_order))
-		return __page_pool_alloc_page_order(pool, gfp);
+		return __page_pool_alloc_netmem(pool, gfp);
 
 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
 	if (unlikely(pool->alloc.count > 0))
-		return netmem_page(pool->alloc.cache[--pool->alloc.count]);
+		return pool->alloc.cache[--pool->alloc.count];
 
 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
@@ -422,34 +422,33 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 
 	/* Return last page */
 	if (likely(pool->alloc.count > 0)) {
-		page = netmem_page(pool->alloc.cache[--pool->alloc.count]);
+		nmem = pool->alloc.cache[--pool->alloc.count];
 		alloc_stat_inc(pool, slow);
 	} else {
-		page = NULL;
+		nmem = NULL;
 	}
 
 	/* When page just allocated it should have refcnt 1 (but may have
 	 * speculative references) */
-	return page;
+	return nmem;
 }
 
 /* For using page_pool replace: alloc_pages() API calls, but provide
  * synchronization guarantee for allocation side.
  */
-struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
+struct netmem *page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp)
 {
-	struct page *page;
+	struct netmem *nmem;
 
 	/* Fast-path: Get a page from cache */
-	page = __page_pool_get_cached(pool);
-	if (page)
-		return page;
+	nmem = __page_pool_get_cached(pool);
+	if (nmem)
+		return nmem;
 
 	/* Slow-path: cache empty, do real allocation */
-	page = __page_pool_alloc_pages_slow(pool, gfp);
-	return page;
+	return __page_pool_alloc_netmem_slow(pool, gfp);
 }
-EXPORT_SYMBOL(page_pool_alloc_pages);
+EXPORT_SYMBOL(page_pool_alloc_netmem);
 
 /* Calculate distance between two u32 values, valid if distance is below 2^(31)
  *  https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution
-- 
2.35.1




* [PATCH 13/24] page_pool: Convert page_pool_dma_sync_for_device() to take a netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 12/24] page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 14/24] page_pool: Convert page_pool_recycle_in_cache() to netmem Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Change all callers.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index efe9f1471caa..9ef65b383b40 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -299,10 +299,10 @@ static struct netmem *__page_pool_get_cached(struct page_pool *pool)
 }
 
 static void page_pool_dma_sync_for_device(struct page_pool *pool,
-					  struct page *page,
+					  struct netmem *nmem,
 					  unsigned int dma_sync_size)
 {
-	dma_addr_t dma_addr = page_pool_get_dma_addr(page);
+	dma_addr_t dma_addr = netmem_get_dma_addr(nmem);
 
 	dma_sync_size = min(dma_sync_size, pool->p.max_len);
 	dma_sync_single_range_for_device(pool->p.dev, dma_addr,
@@ -329,7 +329,7 @@ static bool page_pool_dma_map(struct page_pool *pool, struct netmem *nmem)
 	page_pool_set_dma_addr(page, dma);
 
 	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
-		page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
+		page_pool_dma_sync_for_device(pool, nmem, pool->p.max_len);
 
 	return true;
 }
@@ -576,7 +576,7 @@ __page_pool_put_netmem(struct page_pool *pool, struct netmem *nmem,
 		/* Read barrier done in netmem_ref_count / READ_ONCE */
 
 		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
-			page_pool_dma_sync_for_device(pool, netmem_page(nmem),
+			page_pool_dma_sync_for_device(pool, nmem,
 						      dma_sync_size);
 
 		if (allow_direct && in_serving_softirq() &&
@@ -676,6 +676,7 @@ EXPORT_SYMBOL(page_pool_put_page_bulk);
 static struct page *page_pool_drain_frag(struct page_pool *pool,
 					 struct page *page)
 {
+	struct netmem *nmem = page_netmem(page);
 	long drain_count = BIAS_MAX - pool->frag_users;
 
 	/* Some user is still using the page frag */
@@ -684,7 +685,7 @@ static struct page *page_pool_drain_frag(struct page_pool *pool,
 
 	if (page_ref_count(page) == 1 && !page_is_pfmemalloc(page)) {
 		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
-			page_pool_dma_sync_for_device(pool, page, -1);
+			page_pool_dma_sync_for_device(pool, nmem, -1);
 
 		return page;
 	}
-- 
2.35.1




* [PATCH 14/24] page_pool: Convert page_pool_recycle_in_cache() to netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 13/24] page_pool: Convert page_pool_dma_sync_for_device() to take a netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 15/24] page_pool: Remove __page_pool_put_page() Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Removes a few casts.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 9ef65b383b40..b34d1695698a 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -538,7 +538,7 @@ static bool page_pool_recycle_in_ring(struct page_pool *pool,
  *
  * Caller must provide appropriate safe context.
  */
-static bool page_pool_recycle_in_cache(struct page *page,
+static bool page_pool_recycle_in_cache(struct netmem *nmem,
 				       struct page_pool *pool)
 {
 	if (unlikely(pool->alloc.count == PP_ALLOC_CACHE_SIZE)) {
@@ -547,7 +547,7 @@ static bool page_pool_recycle_in_cache(struct page *page,
 	}
 
 	/* Caller MUST have verified/know (page_ref_count(page) == 1) */
-	pool->alloc.cache[pool->alloc.count++] = page_netmem(page);
+	pool->alloc.cache[pool->alloc.count++] = nmem;
 	recycle_stat_inc(pool, cached);
 	return true;
 }
@@ -580,7 +580,7 @@ __page_pool_put_netmem(struct page_pool *pool, struct netmem *nmem,
 						      dma_sync_size);
 
 		if (allow_direct && in_serving_softirq() &&
-		    page_pool_recycle_in_cache(netmem_page(nmem), pool))
+		    page_pool_recycle_in_cache(nmem, pool))
 			return NULL;
 
 		/* Page found as candidate for recycling */
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 15/24] page_pool: Remove page_pool_defrag_page()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 14/24] page_pool: Convert page_pool_recycle_in_cache() to netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 16/24] page_pool: Use netmem in page_pool_drain_frag() Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

This wrapper is no longer used.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index b34d1695698a..c89a13393a23 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -607,14 +607,6 @@ __page_pool_put_netmem(struct page_pool *pool, struct netmem *nmem,
 	return NULL;
 }
 
-static __always_inline struct page *
-__page_pool_put_page(struct page_pool *pool, struct page *page,
-		     unsigned int dma_sync_size, bool allow_direct)
-{
-	return netmem_page(__page_pool_put_netmem(pool, page_netmem(page),
-						dma_sync_size, allow_direct));
-}
-
 void page_pool_put_defragged_netmem(struct page_pool *pool, struct netmem *nmem,
 				  unsigned int dma_sync_size, bool allow_direct)
 {
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 16/24] page_pool: Use netmem in page_pool_drain_frag()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 15/24] page_pool: Remove page_pool_defrag_page() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 17/24] page_pool: Convert page_pool_return_skb_page() to use netmem Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

We're not quite ready to change the API of page_pool_drain_frag(),
but we can remove the use of several wrappers by using netmem
throughout the function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/page_pool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c89a13393a23..39f09d011a46 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -672,17 +672,17 @@ static struct page *page_pool_drain_frag(struct page_pool *pool,
 	long drain_count = BIAS_MAX - pool->frag_users;
 
 	/* Some user is still using the page frag */
-	if (likely(page_pool_defrag_page(page, drain_count)))
+	if (likely(page_pool_defrag_netmem(nmem, drain_count)))
 		return NULL;
 
-	if (page_ref_count(page) == 1 && !page_is_pfmemalloc(page)) {
+	if (netmem_ref_count(nmem) == 1 && !netmem_is_pfmemalloc(nmem)) {
 		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
 			page_pool_dma_sync_for_device(pool, nmem, -1);
 
 		return page;
 	}
 
-	page_pool_return_page(pool, page);
+	page_pool_return_netmem(pool, nmem);
 	return NULL;
 }
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 17/24] page_pool: Convert page_pool_return_skb_page() to use netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 16/24] page_pool: Use netmem in page_pool_drain_frag() Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 18/24] page_pool: Convert frag_page to frag_nmem Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

This function accesses the page_pool members of struct page directly,
so it needs to be converted to netmem.  Add page_pool_put_full_netmem().
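
For illustration only (not part of this patch), here is a sketch of how
a driver teardown path might call the new helper once its ring stores
netmems; the structure and names below are made up:

	#include <net/page_pool.h>

	#define MY_RXQ_SIZE 256

	/* Hypothetical receive ring that keeps netmem pointers. */
	struct my_rxq {
		struct page_pool *pool;
		struct netmem *bufs[MY_RXQ_SIZE];
	};

	static void my_rxq_drop_all(struct my_rxq *rxq)
	{
		unsigned int i;

		for (i = 0; i < MY_RXQ_SIZE; i++) {
			if (!rxq->bufs[i])
				continue;
			/* Sync the whole pool->p.max_len area (if DMA sync
			 * is enabled for this pool) and hand the buffer back.
			 */
			page_pool_put_full_netmem(rxq->pool, rxq->bufs[i], false);
			rxq->bufs[i] = NULL;
		}
	}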

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h |  8 +++++++-
 net/core/page_pool.c    | 13 ++++++-------
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 4c730591de46..701f94947e8a 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -437,10 +437,16 @@ static inline void page_pool_put_page(struct page_pool *pool,
 }
 
 /* Same as above but will try to sync the entire area pool->max_len */
+static inline void page_pool_put_full_netmem(struct page_pool *pool,
+		struct netmem *nmem, bool allow_direct)
+{
+	page_pool_put_netmem(pool, nmem, -1, allow_direct);
+}
+
 static inline void page_pool_put_full_page(struct page_pool *pool,
 					   struct page *page, bool allow_direct)
 {
-	page_pool_put_page(pool, page, -1, allow_direct);
+	page_pool_put_full_netmem(pool, page_netmem(page), allow_direct);
 }
 
 /* Same as above but the caller must guarantee safe context. e.g NAPI */
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 39f09d011a46..b4540d242081 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -886,28 +886,27 @@ EXPORT_SYMBOL(page_pool_update_nid);
 
 bool page_pool_return_skb_page(struct page *page)
 {
+	struct netmem *nmem = page_netmem(compound_head(page));
 	struct page_pool *pp;
 
-	page = compound_head(page);
-
-	/* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation
+	/* nmem->pp_magic is OR'ed with PP_SIGNATURE after the allocation
 	 * in order to preserve any existing bits, such as bit 0 for the
 	 * head page of compound page and bit 1 for pfmemalloc page, so
 	 * mask those bits for freeing side when doing below checking,
-	 * and page_is_pfmemalloc() is checked in __page_pool_put_page()
+	 * and netmem_is_pfmemalloc() is checked in __page_pool_put_netmem()
 	 * to avoid recycling the pfmemalloc page.
 	 */
-	if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE))
+	if (unlikely((nmem->pp_magic & ~0x3UL) != PP_SIGNATURE))
 		return false;
 
-	pp = page->pp;
+	pp = nmem->pp;
 
 	/* Driver set this to memory recycling info. Reset it on recycle.
 	 * This will *not* work for NIC using a split-page memory model.
 	 * The page will be returned to the pool here regardless of the
 	 * 'flipped' fragment being in use or not.
 	 */
-	page_pool_put_full_page(pp, page, false);
+	page_pool_put_full_netmem(pp, nmem, false);
 
 	return true;
 }
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 18/24] page_pool: Convert frag_page to frag_nmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 17/24] page_pool: Convert page_pool_return_skb_page() to use netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 19/24] xdp: Convert to netmem Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Remove page_pool_defrag_page() and page_pool_return_page() as they have
no more callers.
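
As a hedged sketch of what the converted fragment API looks like from a
caller's side (made-up function name and size; assumes the pool was
created with PP_FLAG_PAGE_FRAG):

	#include <net/page_pool.h>

	/* Carve a 2KB fragment out of the pool's current frag netmem. */
	static struct netmem *my_alloc_rx_frag(struct page_pool *pool,
					       unsigned int *offset)
	{
		struct netmem *nmem;

		nmem = page_pool_alloc_frag(pool, offset, 2048,
					    GFP_ATOMIC | __GFP_NOWARN);
		if (unlikely(!nmem))
			return NULL;

		/* The fragment's DMA address is
		 * netmem_get_dma_addr(nmem) + *offset.
		 */
		return nmem;
	}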

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 17 ++++++---------
 net/core/page_pool.c    | 47 ++++++++++++++++++-----------------------
 2 files changed, 26 insertions(+), 38 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 701f94947e8a..ce1049a03f2d 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -240,7 +240,7 @@ struct page_pool {
 
 	u32 pages_state_hold_cnt;
 	unsigned int frag_offset;
-	struct page *frag_page;
+	struct netmem *frag_nmem;
 	long frag_users;
 
 #ifdef CONFIG_PAGE_POOL_STATS
@@ -307,8 +307,8 @@ static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool)
 	return page_pool_alloc_pages(pool, gfp);
 }
 
-struct page *page_pool_alloc_frag(struct page_pool *pool, unsigned int *offset,
-				  unsigned int size, gfp_t gfp);
+struct netmem *page_pool_alloc_frag(struct page_pool *pool,
+		unsigned int *offset, unsigned int size, gfp_t gfp);
 
 static inline struct page *page_pool_dev_alloc_frag(struct page_pool *pool,
 						    unsigned int *offset,
@@ -316,7 +316,7 @@ static inline struct page *page_pool_dev_alloc_frag(struct page_pool *pool,
 {
 	gfp_t gfp = (GFP_ATOMIC | __GFP_NOWARN);
 
-	return page_pool_alloc_frag(pool, offset, size, gfp);
+	return netmem_page(page_pool_alloc_frag(pool, offset, size, gfp));
 }
 
 /* get the stored dma direction. A driver might decide to treat this locally and
@@ -372,9 +372,9 @@ void page_pool_put_defragged_netmem(struct page_pool *pool, struct netmem *nmem,
 				  unsigned int dma_sync_size,
 				  bool allow_direct);
 
-static inline void page_pool_fragment_page(struct page *page, long nr)
+static inline void page_pool_fragment_netmem(struct netmem *nmem, long nr)
 {
-	atomic_long_set(&page->pp_frag_count, nr);
+	atomic_long_set(&nmem->pp_frag_count, nr);
 }
 
 static inline long page_pool_defrag_netmem(struct netmem *nmem, long nr)
@@ -398,11 +398,6 @@ static inline long page_pool_defrag_netmem(struct netmem *nmem, long nr)
 	return ret;
 }
 
-static inline long page_pool_defrag_page(struct page *page, long nr)
-{
-	return page_pool_defrag_netmem(page_netmem(page), nr);
-}
-
 static inline bool page_pool_is_last_frag(struct page_pool *pool,
 					  struct netmem *nmem)
 {
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index b4540d242081..5be78ec93af8 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -222,12 +222,6 @@ EXPORT_SYMBOL(page_pool_create);
 
 static void page_pool_return_netmem(struct page_pool *pool, struct netmem *nm);
 
-static inline
-void page_pool_return_page(struct page_pool *pool, struct page *page)
-{
-	page_pool_return_netmem(pool, page_netmem(page));
-}
-
 noinline
 static struct netmem *page_pool_refill_alloc_cache(struct page_pool *pool)
 {
@@ -665,10 +659,9 @@ void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 }
 EXPORT_SYMBOL(page_pool_put_page_bulk);
 
-static struct page *page_pool_drain_frag(struct page_pool *pool,
-					 struct page *page)
+static struct netmem *page_pool_drain_frag(struct page_pool *pool,
+					 struct netmem *nmem)
 {
-	struct netmem *nmem = page_netmem(page);
 	long drain_count = BIAS_MAX - pool->frag_users;
 
 	/* Some user is still using the page frag */
@@ -679,7 +672,7 @@ static struct page *page_pool_drain_frag(struct page_pool *pool,
 		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
 			page_pool_dma_sync_for_device(pool, nmem, -1);
 
-		return page;
+		return nmem;
 	}
 
 	page_pool_return_netmem(pool, nmem);
@@ -689,22 +682,22 @@ static struct page *page_pool_drain_frag(struct page_pool *pool,
 static void page_pool_free_frag(struct page_pool *pool)
 {
 	long drain_count = BIAS_MAX - pool->frag_users;
-	struct page *page = pool->frag_page;
+	struct netmem *nmem = pool->frag_nmem;
 
-	pool->frag_page = NULL;
+	pool->frag_nmem = NULL;
 
-	if (!page || page_pool_defrag_page(page, drain_count))
+	if (!nmem || page_pool_defrag_netmem(nmem, drain_count))
 		return;
 
-	page_pool_return_page(pool, page);
+	page_pool_return_netmem(pool, nmem);
 }
 
-struct page *page_pool_alloc_frag(struct page_pool *pool,
+struct netmem *page_pool_alloc_frag(struct page_pool *pool,
 				  unsigned int *offset,
 				  unsigned int size, gfp_t gfp)
 {
 	unsigned int max_size = PAGE_SIZE << pool->p.order;
-	struct page *page = pool->frag_page;
+	struct netmem *nmem = pool->frag_nmem;
 
 	if (WARN_ON(!(pool->p.flags & PP_FLAG_PAGE_FRAG) ||
 		    size > max_size))
@@ -713,35 +706,35 @@ struct page *page_pool_alloc_frag(struct page_pool *pool,
 	size = ALIGN(size, dma_get_cache_alignment());
 	*offset = pool->frag_offset;
 
-	if (page && *offset + size > max_size) {
-		page = page_pool_drain_frag(pool, page);
-		if (page) {
+	if (nmem && *offset + size > max_size) {
+		nmem = page_pool_drain_frag(pool, nmem);
+		if (nmem) {
 			alloc_stat_inc(pool, fast);
 			goto frag_reset;
 		}
 	}
 
-	if (!page) {
-		page = page_pool_alloc_pages(pool, gfp);
-		if (unlikely(!page)) {
-			pool->frag_page = NULL;
+	if (!nmem) {
+		nmem = page_pool_alloc_netmem(pool, gfp);
+		if (unlikely(!nmem)) {
+			pool->frag_nmem = NULL;
 			return NULL;
 		}
 
-		pool->frag_page = page;
+		pool->frag_nmem = nmem;
 
 frag_reset:
 		pool->frag_users = 1;
 		*offset = 0;
 		pool->frag_offset = size;
-		page_pool_fragment_page(page, BIAS_MAX);
-		return page;
+		page_pool_fragment_netmem(nmem, BIAS_MAX);
+		return nmem;
 	}
 
 	pool->frag_users++;
 	pool->frag_offset = *offset + size;
 	alloc_stat_inc(pool, fast);
-	return page;
+	return nmem;
 }
 EXPORT_SYMBOL(page_pool_alloc_frag);
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 19/24] xdp: Convert to netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 18/24] page_pool: Convert frag_page to frag_nmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:07 ` [PATCH 20/24] mm: Remove page pool members from struct page Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

We dereference the 'pp' member of struct page, so we must use a netmem
here.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 net/core/xdp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 844c9d99dc0e..7520c3b27356 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -375,17 +375,18 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model);
 void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
 		  struct xdp_buff *xdp)
 {
+	struct netmem *nmem;
 	struct page *page;
 
 	switch (mem->type) {
 	case MEM_TYPE_PAGE_POOL:
-		page = virt_to_head_page(data);
+		nmem = virt_to_netmem(data);
 		if (napi_direct && xdp_return_frame_no_direct())
 			napi_direct = false;
-		/* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
+		/* No need to check ((nmem->pp_magic & ~0x3UL) == PP_SIGNATURE)
 		 * as mem->type knows this a page_pool page
 		 */
-		page_pool_put_full_page(page->pp, page, napi_direct);
+		page_pool_put_full_netmem(nmem->pp, nmem, napi_direct);
 		break;
 	case MEM_TYPE_PAGE_SHARED:
 		page_frag_free(data);
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 20/24] mm: Remove page pool members from struct page
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 19/24] xdp: Convert to netmem Matthew Wilcox (Oracle)
@ 2022-11-30 22:07 ` Matthew Wilcox (Oracle)
  2022-11-30 22:08 ` [PATCH 21/24] netmem_to_virt Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

These are now split out into their own netmem struct.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm_types.h | 22 ----------------------
 include/net/page_pool.h  |  4 ----
 2 files changed, 26 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1ad1ef3a1288..6999af135f1d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -113,28 +113,6 @@ struct page {
 			 */
 			unsigned long private;
 		};
-		struct {	/* page_pool used by netstack */
-			/**
-			 * @pp_magic: magic value to avoid recycling non
-			 * page_pool allocated pages.
-			 */
-			unsigned long pp_magic;
-			struct page_pool *pp;
-			unsigned long _pp_mapping_pad;
-			unsigned long dma_addr;
-			union {
-				/**
-				 * dma_addr_upper: might require a 64-bit
-				 * value on 32-bit architectures.
-				 */
-				unsigned long dma_addr_upper;
-				/**
-				 * For frag page support, not supported in
-				 * 32-bit architectures with 64-bit DMA.
-				 */
-				atomic_long_t pp_frag_count;
-			};
-		};
 		struct {	/* Tail pages of compound page */
 			unsigned long compound_head;	/* Bit zero is set */
 
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index ce1049a03f2d..222eedc39140 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -81,11 +81,7 @@ struct netmem {
 	static_assert(offsetof(struct page, pg) == offsetof(struct netmem, nm))
 NETMEM_MATCH(flags, flags);
 NETMEM_MATCH(lru, pp_magic);
-NETMEM_MATCH(pp, pp);
 NETMEM_MATCH(mapping, _pp_mapping_pad);
-NETMEM_MATCH(dma_addr, dma_addr);
-NETMEM_MATCH(dma_addr_upper, dma_addr_upper);
-NETMEM_MATCH(pp_frag_count, pp_frag_count);
 NETMEM_MATCH(_mapcount, _mapcount);
 NETMEM_MATCH(_refcount, _refcount);
 #undef NETMEM_MATCH
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 21/24] netmem_to_virt
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2022-11-30 22:07 ` [PATCH 20/24] mm: Remove page pool members from struct page Matthew Wilcox (Oracle)
@ 2022-11-30 22:08 ` Matthew Wilcox (Oracle)
  2022-11-30 22:08 ` [PATCH 22/24] page_pool: Pass a netmem to init_callback() Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

---
 include/net/page_pool.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 222eedc39140..e13e3a8e83d3 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -112,6 +112,11 @@ static inline struct netmem *virt_to_netmem(const void *x)
 	return page_netmem(virt_to_head_page(x));
 }
 
+static inline void *netmem_to_virt(const struct netmem *nmem)
+{
+	return page_to_virt(netmem_page(nmem));
+}
+
 static inline int netmem_ref_count(const struct netmem *nmem)
 {
 	return page_ref_count(netmem_page(nmem));
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 22/24] page_pool: Pass a netmem to init_callback()
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2022-11-30 22:08 ` [PATCH 21/24] netmem_to_virt Matthew Wilcox (Oracle)
@ 2022-11-30 22:08 ` Matthew Wilcox (Oracle)
  2022-11-30 22:08 ` [PATCH 23/24] net: Add support for netmem in skb_frag Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Convert the only user of init_callback.
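
For reference, a sketch (invented names, not from the tree) of what a
pool user's callback looks like against the new signature:

	#include <linux/string.h>
	#include <net/page_pool.h>

	struct my_stats {
		unsigned long pages_initialised;
	};

	/* Called by the pool for each netmem it pulls from the page
	 * allocator; 'arg' is whatever the driver put in pp_params.init_arg.
	 * Assumes an order-0 pool for the PAGE_SIZE memset.
	 */
	static void my_netmem_init(struct netmem *nmem, void *arg)
	{
		struct my_stats *stats = arg;

		memset(netmem_to_virt(nmem), 0, PAGE_SIZE);
		stats->pages_initialised++;
	}

	/* Hooked up before page_pool_create():
	 *	pp_params.init_callback = my_netmem_init;
	 *	pp_params.init_arg	= &stats;
	 */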

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/net/page_pool.h | 2 +-
 net/bpf/test_run.c      | 4 ++--
 net/core/page_pool.c    | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index e13e3a8e83d3..4878fe30f52c 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -164,7 +164,7 @@ struct page_pool_params {
 	enum dma_data_direction dma_dir; /* DMA mapping direction */
 	unsigned int	max_len; /* max DMA sync memory size */
 	unsigned int	offset;  /* DMA addr offset */
-	void (*init_callback)(struct page *page, void *arg);
+	void (*init_callback)(struct netmem *nmem, void *arg);
 	void *init_arg;
 };
 
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 6094ef7cffcd..921b085802af 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -116,9 +116,9 @@ struct xdp_test_data {
 #define TEST_XDP_FRAME_SIZE (PAGE_SIZE - sizeof(struct xdp_page_head))
 #define TEST_XDP_MAX_BATCH 256
 
-static void xdp_test_run_init_page(struct page *page, void *arg)
+static void xdp_test_run_init_page(struct netmem *nmem, void *arg)
 {
-	struct xdp_page_head *head = phys_to_virt(page_to_phys(page));
+	struct xdp_page_head *head = netmem_to_virt(nmem);
 	struct xdp_buff *new_ctx, *orig_ctx;
 	u32 headroom = XDP_PACKET_HEADROOM;
 	struct xdp_test_data *xdp = arg;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 5be78ec93af8..bed40515e74c 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -334,7 +334,7 @@ static void page_pool_set_pp_info(struct page_pool *pool,
 	nmem->pp = pool;
 	nmem->pp_magic |= PP_SIGNATURE;
 	if (pool->p.init_callback)
-		pool->p.init_callback(netmem_page(nmem), pool->p.init_arg);
+		pool->p.init_callback(nmem, pool->p.init_arg);
 }
 
 static void page_pool_clear_pp_info(struct netmem *nmem)
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 23/24] net: Add support for netmem in skb_frag
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2022-11-30 22:08 ` [PATCH 22/24] page_pool: Pass a netmem to init_callback() Matthew Wilcox (Oracle)
@ 2022-11-30 22:08 ` Matthew Wilcox (Oracle)
  2022-11-30 22:08 ` [PATCH 24/24] mvneta: Convert to netmem Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Allow drivers to add netmem to skbs & retrieve them again.  If the
VM_BUG_ON triggers, we can add a call to compound_head() either in
this function or in page_netmem().
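
A minimal sketch of the intended round trip (driver-invented helpers;
assumes the frag was filled from a page_pool netmem):

	#include <linux/skbuff.h>
	#include <net/page_pool.h>

	/* Attach a netmem-backed buffer as frag 0 of an skb ... */
	static void my_set_frag0(struct sk_buff *skb, struct netmem *nmem,
				 unsigned int offset, unsigned int len)
	{
		skb_frag_t *frag = &skb_shinfo(skb)->frags[0];

		__skb_frag_set_netmem(frag, nmem);
		skb_frag_off_set(frag, offset);
		skb_frag_size_set(frag, len);
	}

	/* ... and get the netmem back, e.g. on the recycling side. */
	static struct netmem *my_get_frag0(const struct sk_buff *skb)
	{
		return skb_frag_netmem(&skb_shinfo(skb)->frags[0]);
	}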

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/skbuff.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4e464a27adaf..dabfb392ca01 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3345,6 +3345,12 @@ static inline struct page *skb_frag_page(const skb_frag_t *frag)
 	return frag->bv_page;
 }
 
+static inline struct netmem *skb_frag_netmem(const skb_frag_t *frag)
+{
+	VM_BUG_ON_PAGE(PageTail(frag->bv_page), frag->bv_page);
+	return page_netmem(frag->bv_page);
+}
+
 /**
  * __skb_frag_ref - take an addition reference on a paged fragment.
  * @frag: the paged fragment
@@ -3453,6 +3459,11 @@ static inline void __skb_frag_set_page(skb_frag_t *frag, struct page *page)
 	frag->bv_page = page;
 }
 
+static inline void __skb_frag_set_netmem(skb_frag_t *frag, struct netmem *nmem)
+{
+	__skb_frag_set_page(frag, netmem_page(nmem));
+}
+
 /**
  * skb_frag_set_page - sets the page contained in a paged fragment of an skb
  * @skb: the buffer
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 24/24] mvneta: Convert to netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2022-11-30 22:08 ` [PATCH 23/24] net: Add support for netmem in skb_frag Matthew Wilcox (Oracle)
@ 2022-11-30 22:08 ` Matthew Wilcox (Oracle)
  2022-12-05 15:34 ` [PATCH 00/24] Split page pools from struct page Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-11-30 22:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Use the netmem APIs instead of the page APIs.  Improves type-safety.
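
The core of the change is the refill path; condensed into a hedged,
generic sketch (names invented, error handling trimmed) before the full
diff:

	#include <net/page_pool.h>

	/* 'rx_offset' stands in for the driver's rx_offset_correction. */
	static struct netmem *my_rx_refill(struct page_pool *pool,
					   dma_addr_t *phys_addr,
					   unsigned int rx_offset,
					   gfp_t gfp_mask)
	{
		struct netmem *nmem;

		nmem = page_pool_alloc_netmem(pool, gfp_mask | __GFP_NOWARN);
		if (!nmem)
			return NULL;

		*phys_addr = netmem_get_dma_addr(nmem) + rx_offset;
		return nmem;
	}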

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 48 +++++++++++++--------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index c2cb98d24f5c..5b8cfd50b7e1 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1931,15 +1931,15 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 			    gfp_t gfp_mask)
 {
 	dma_addr_t phys_addr;
-	struct page *page;
+	struct netmem *nmem;
 
-	page = page_pool_alloc_pages(rxq->page_pool,
+	nmem = page_pool_alloc_netmem(rxq->page_pool,
 				     gfp_mask | __GFP_NOWARN);
-	if (!page)
+	if (!nmem)
 		return -ENOMEM;
 
-	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
-	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
+	phys_addr = netmem_get_dma_addr(nmem) + pp->rx_offset_correction;
+	mvneta_rx_desc_fill(rx_desc, phys_addr, nmem, rxq);
 
 	return 0;
 }
@@ -2006,7 +2006,7 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
 		if (!data || !(rx_desc->buf_phys_addr))
 			continue;
 
-		page_pool_put_full_page(rxq->page_pool, data, false);
+		page_pool_put_full_netmem(rxq->page_pool, data, false);
 	}
 	if (xdp_rxq_info_is_reg(&rxq->xdp_rxq))
 		xdp_rxq_info_unreg(&rxq->xdp_rxq);
@@ -2072,11 +2072,11 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		goto out;
 
 	for (i = 0; i < sinfo->nr_frags; i++)
-		page_pool_put_full_page(rxq->page_pool,
-					skb_frag_page(&sinfo->frags[i]), true);
+		page_pool_put_full_netmem(rxq->page_pool,
+				skb_frag_netmem(&sinfo->frags[i]), true);
 
 out:
-	page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
+	page_pool_put_netmem(rxq->page_pool, virt_to_netmem(xdp->data),
 			   sync_len, true);
 }
 
@@ -2088,7 +2088,6 @@ mvneta_xdp_submit_frame(struct mvneta_port *pp, struct mvneta_tx_queue *txq,
 	struct device *dev = pp->dev->dev.parent;
 	struct mvneta_tx_desc *tx_desc;
 	int i, num_frames = 1;
-	struct page *page;
 
 	if (unlikely(xdp_frame_has_frags(xdpf)))
 		num_frames += sinfo->nr_frags;
@@ -2123,9 +2122,10 @@ mvneta_xdp_submit_frame(struct mvneta_port *pp, struct mvneta_tx_queue *txq,
 
 			buf->type = MVNETA_TYPE_XDP_NDO;
 		} else {
-			page = unlikely(frag) ? skb_frag_page(frag)
-					      : virt_to_page(xdpf->data);
-			dma_addr = page_pool_get_dma_addr(page);
+			struct netmem *nmem = unlikely(frag) ?
+						skb_frag_netmem(frag) :
+						virt_to_netmem(xdpf->data);
+			dma_addr = netmem_get_dma_addr(nmem);
 			if (unlikely(frag))
 				dma_addr += skb_frag_off(frag);
 			else
@@ -2308,9 +2308,9 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 		     struct mvneta_rx_desc *rx_desc,
 		     struct mvneta_rx_queue *rxq,
 		     struct xdp_buff *xdp, int *size,
-		     struct page *page)
+		     struct netmem *nmem)
 {
-	unsigned char *data = page_address(page);
+	unsigned char *data = netmem_to_virt(nmem);
 	int data_len = -MVNETA_MH_SIZE, len;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
@@ -2343,7 +2343,7 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 			    struct mvneta_rx_desc *rx_desc,
 			    struct mvneta_rx_queue *rxq,
 			    struct xdp_buff *xdp, int *size,
-			    struct page *page)
+			    struct netmem *nmem)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 	struct net_device *dev = pp->dev;
@@ -2371,16 +2371,16 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 
 		skb_frag_off_set(frag, pp->rx_offset_correction);
 		skb_frag_size_set(frag, data_len);
-		__skb_frag_set_page(frag, page);
+		__skb_frag_set_netmem(frag, nmem);
 
 		if (!xdp_buff_has_frags(xdp)) {
 			sinfo->xdp_frags_size = *size;
 			xdp_buff_set_frags_flag(xdp);
 		}
-		if (page_is_pfmemalloc(page))
+		if (netmem_is_pfmemalloc(nmem))
 			xdp_buff_set_frag_pfmemalloc(xdp);
 	} else {
-		page_pool_put_full_page(rxq->page_pool, page, true);
+		page_pool_put_full_netmem(rxq->page_pool, nmem, true);
 	}
 	*size -= len;
 }
@@ -2440,10 +2440,10 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 		struct mvneta_rx_desc *rx_desc = mvneta_rxq_next_desc_get(rxq);
 		u32 rx_status, index;
 		struct sk_buff *skb;
-		struct page *page;
+		struct netmem *nmem;
 
 		index = rx_desc - rxq->descs;
-		page = (struct page *)rxq->buf_virt_addr[index];
+		nmem = rxq->buf_virt_addr[index];
 
 		rx_status = rx_desc->status;
 		rx_proc++;
@@ -2461,17 +2461,17 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			desc_status = rx_status;
 
 			mvneta_swbm_rx_frame(pp, rx_desc, rxq, &xdp_buf,
-					     &size, page);
+					     &size, nmem);
 		} else {
 			if (unlikely(!xdp_buf.data_hard_start)) {
 				rx_desc->buf_phys_addr = 0;
-				page_pool_put_full_page(rxq->page_pool, page,
+				page_pool_put_full_netmem(rxq->page_pool, nmem,
 							true);
 				goto next;
 			}
 
 			mvneta_swbm_add_rx_fragment(pp, rx_desc, rxq, &xdp_buf,
-						    &size, page);
+						    &size, nmem);
 		} /* Middle or Last descriptor */
 
 		if (!(rx_status & MVNETA_RXD_LAST_DESC))
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 11/24] page_pool: Convert page_pool_empty_ring() to use netmem
  2022-11-30 22:07 ` [PATCH 11/24] page_pool: Convert page_pool_empty_ring() to use netmem Matthew Wilcox (Oracle)
@ 2022-12-02 21:25   ` Alexander H Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander H Duyck @ 2022-12-02 21:25 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: netdev, linux-mm

On Wed, 2022-11-30 at 22:07 +0000, Matthew Wilcox (Oracle) wrote:
> Retrieve a netmem from the ptr_ring instead of a page.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  net/core/page_pool.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index e727a74504c2..7a77e3220205 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -755,16 +755,16 @@ EXPORT_SYMBOL(page_pool_alloc_frag);
>  
>  static void page_pool_empty_ring(struct page_pool *pool)
>  {
> -	struct page *page;
> +	struct netmem *nmem;
>  
>  	/* Empty recycle ring */
> -	while ((page = ptr_ring_consume_bh(&pool->ring))) {
> +	while ((nmem = ptr_ring_consume_bh(&pool->ring)) != NULL) {
>  		/* Verify the refcnt invariant of cached pages */
> -		if (!(page_ref_count(page) == 1))
> +		if (!(netmem_ref_count(nmem) == 1))

One minor code nit here is that this could just be:
		if (netmem_ref_count(nmem) != 1)

>  			pr_crit("%s() page_pool refcnt %d violation\n",
> -				__func__, page_ref_count(page));
> +				__func__, netmem_ref_count(nmem));
>  
> -		page_pool_return_page(pool, page);
> +		page_pool_return_netmem(pool, nmem);
>  	}
>  }
>  



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/24] netmem: Create new type
  2022-11-30 22:07 ` [PATCH 01/24] netmem: Create new type Matthew Wilcox (Oracle)
@ 2022-12-05 14:42   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2022-12-05 14:42 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: brouer, netdev, linux-mm



On 30/11/2022 23.07, Matthew Wilcox (Oracle) wrote:
> As part of simplifying struct page, create a new netmem type which
> mirrors the page_pool members in struct page.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   include/net/page_pool.h | 41 +++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 41 insertions(+)
> 
> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> index 813c93499f20..af6ff8c302a0 100644
> --- a/include/net/page_pool.h
> +++ b/include/net/page_pool.h
> @@ -50,6 +50,47 @@
>   				 PP_FLAG_DMA_SYNC_DEV |\
>   				 PP_FLAG_PAGE_FRAG)
>   
> +/* page_pool used by netstack */

Can we improve the comment, making it clearer that this netmem struct
is mirroring/sharing/using part of struct page?

My proposal:

/* page_pool used by netstack mirrors/uses members in struct page */

> +struct netmem {
> +	unsigned long flags;		/* Page flags */
> +	/**
> +	 * @pp_magic: magic value to avoid recycling non
> +	 * page_pool allocated pages.
> +	 */
> +	unsigned long pp_magic;
> +	struct page_pool *pp;
> +	unsigned long _pp_mapping_pad;
> +	unsigned long dma_addr;
> +	union {
> +		/**
> +		 * dma_addr_upper: might require a 64-bit
> +		 * value on 32-bit architectures.
> +		 */
> +		unsigned long dma_addr_upper;
> +		/**
> +		 * For frag page support, not supported in
> +		 * 32-bit architectures with 64-bit DMA.
> +		 */
> +		atomic_long_t pp_frag_count;
> +	};
> +	atomic_t _mapcount;
> +	atomic_t _refcount;
> +};
> +
> +#define NETMEM_MATCH(pg, nm)						\
> +	static_assert(offsetof(struct page, pg) == offsetof(struct netmem, nm))
> +NETMEM_MATCH(flags, flags);
> +NETMEM_MATCH(lru, pp_magic);
> +NETMEM_MATCH(pp, pp);
> +NETMEM_MATCH(mapping, _pp_mapping_pad);
> +NETMEM_MATCH(dma_addr, dma_addr);
> +NETMEM_MATCH(dma_addr_upper, dma_addr_upper);
> +NETMEM_MATCH(pp_frag_count, pp_frag_count);
> +NETMEM_MATCH(_mapcount, _mapcount);
> +NETMEM_MATCH(_refcount, _refcount);
> +#undef NETMEM_MATCH
> +static_assert(sizeof(struct netmem) <= sizeof(struct page));
> +
>   /*
>    * Fast allocation side cache array/stack
>    *



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/24] Split page pools from struct page
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2022-11-30 22:08 ` [PATCH 24/24] mvneta: Convert to netmem Matthew Wilcox (Oracle)
@ 2022-12-05 15:34 ` Jesper Dangaard Brouer
  2022-12-05 15:44   ` Ilias Apalodimas
  2022-12-05 16:31   ` Matthew Wilcox
  2022-12-06 16:05 ` [PATCH 25/26] netpool: Additional utility functions Matthew Wilcox (Oracle)
  2022-12-06 16:05 ` [PATCH 26/26] mlx5: Convert to netmem Matthew Wilcox (Oracle)
  26 siblings, 2 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2022-12-05 15:34 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: brouer, netdev, linux-mm


On 30/11/2022 23.07, Matthew Wilcox (Oracle) wrote:
> The MM subsystem is trying to reduce struct page to a single pointer.
> The first step towards that is splitting struct page by its individual
> users, as has already been done with folio and slab.  This attempt chooses
> 'netmem' as a name, but I am not even slightly committed to that name,
> and will happily use another.

I've not been able to come up with a better name, so I'm okay with
'netmem'.  Others are of course free to bikeshed this ;-)

> There are some relatively significant reductions in kernel text
> size from these changes.  I'm not qualified to judge how they
> might affect performance, but every call to put_page() includes
> a call to compound_head(), which is now rather more complex
> than it once was (at least in a distro config which enables
> CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP).
> 

I have a micro-benchmark [1][2], that I want to run on this patchset.
Reducing the asm code 'text' size is less likely to improve a
microbenchmark. The 100Gbit mlx5 driver uses page_pool, so perhaps I can
run a packet benchmark that can show the (expected) performance improvement.

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
[2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c

> I've only converted one user of the page_pool APIs to use the new netmem
> APIs, all the others continue to use the page based ones.
> 

I guess we/netdev-devels need to update the NIC drivers that use page_pool.

> Uh, I see I left netmem_to_virt() as its own commit instead of squashing
> it into "netmem: Add utility functions".  I'll fix that in the next
> version, because I'm sure you'll want some changes anyway.
> 
> Happy to answer questions.
> 
> Matthew Wilcox (Oracle) (24):
>    netmem: Create new type
>    netmem: Add utility functions
>    page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr()
>    page_pool: Convert page_pool_release_page() to
>      page_pool_release_netmem()
>    page_pool: Start using netmem in allocation path.
>    page_pool: Convert page_pool_return_page() to
>      page_pool_return_netmem()
>    page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem()
>    page_pool: Convert pp_alloc_cache to contain netmem
>    page_pool: Convert page_pool_defrag_page() to
>      page_pool_defrag_netmem()
>    page_pool: Convert page_pool_put_defragged_page() to netmem
>    page_pool: Convert page_pool_empty_ring() to use netmem
>    page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem()
>    page_pool: Convert page_pool_dma_sync_for_device() to take a netmem
>    page_pool: Convert page_pool_recycle_in_cache() to netmem
>    page_pool: Remove page_pool_defrag_page()
>    page_pool: Use netmem in page_pool_drain_frag()
>    page_pool: Convert page_pool_return_skb_page() to use netmem
>    page_pool: Convert frag_page to frag_nmem
>    xdp: Convert to netmem
>    mm: Remove page pool members from struct page
>    netmem_to_virt
>    page_pool: Pass a netmem to init_callback()
>    net: Add support for netmem in skb_frag
>    mvneta: Convert to netmem
> 
>   drivers/net/ethernet/marvell/mvneta.c |  48 ++---
>   include/linux/mm_types.h              |  22 ---
>   include/linux/skbuff.h                |  11 ++
>   include/net/page_pool.h               | 181 ++++++++++++++---
>   include/trace/events/page_pool.h      |  28 +--
>   net/bpf/test_run.c                    |   4 +-
>   net/core/page_pool.c                  | 274 +++++++++++++-------------
>   net/core/xdp.c                        |   7 +-
>   8 files changed, 344 insertions(+), 231 deletions(-)
> 
> 
> base-commit: 13ee7ef407cfcf63f4f047460ac5bb6ba5a3447d



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/24] Split page pools from struct page
  2022-12-05 15:34 ` [PATCH 00/24] Split page pools from struct page Jesper Dangaard Brouer
@ 2022-12-05 15:44   ` Ilias Apalodimas
  2022-12-05 16:31   ` Matthew Wilcox
  1 sibling, 0 replies; 36+ messages in thread
From: Ilias Apalodimas @ 2022-12-05 15:44 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Matthew Wilcox (Oracle),
	Jesper Dangaard Brouer, brouer, netdev, linux-mm

Hi Jesper,

On Mon, Dec 05, 2022 at 04:34:10PM +0100, Jesper Dangaard Brouer wrote:
> 
> On 30/11/2022 23.07, Matthew Wilcox (Oracle) wrote:
> > The MM subsystem is trying to reduce struct page to a single pointer.
> > The first step towards that is splitting struct page by its individual
> > users, as has already been done with folio and slab.  This attempt chooses
> > 'netmem' as a name, but I am not even slightly committed to that name,
> > and will happily use another.
> 
> I've not been able to come up with a better name, so I'm okay with
> 'netmem'.  Others are of course free to bikeshed this ;-)

Same here. But if anyone has a better name please shout.

> 
> > There are some relatively significant reductions in kernel text
> > size from these changes.  I'm not qualified to judge how they
> > might affect performance, but every call to put_page() includes
> > a call to compound_head(), which is now rather more complex
> > than it once was (at least in a distro config which enables
> > CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP).
> > 
> 
> I have a micro-benchmark [1][2], that I want to run on this patchset.
> Reducing the asm code 'text' size is less likely to improve a
> microbenchmark. The 100Gbit mlx5 driver uses page_pool, so perhaps I can
> run a packet benchmark that can show the (expected) performance improvement.
> 
> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
> [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c
> 

If you could give it a spin it would be great.  I did apply the patchset
and it was running fine on my Arm box. I was about to run these tests, but then
I remembered that this only works for x86.  I don't have any cards supported
by page pool around.

> > I've only converted one user of the page_pool APIs to use the new netmem
> > APIs, all the others continue to use the page based ones.
> > 
> 
> I guess we/netdev-devels need to update the NIC drivers that use page_pool.
> 
 
[...]

Regards
/Ilias


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/24] Split page pools from struct page
  2022-12-05 15:34 ` [PATCH 00/24] Split page pools from struct page Jesper Dangaard Brouer
  2022-12-05 15:44   ` Ilias Apalodimas
@ 2022-12-05 16:31   ` Matthew Wilcox
  2022-12-06  9:43     ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 36+ messages in thread
From: Matthew Wilcox @ 2022-12-05 16:31 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jesper Dangaard Brouer, Ilias Apalodimas, brouer, netdev, linux-mm

On Mon, Dec 05, 2022 at 04:34:10PM +0100, Jesper Dangaard Brouer wrote:
> I have a micro-benchmark [1][2], that I want to run on this patchset.
> Reducing the asm code 'text' size is less likely to improve a
> microbenchmark. The 100Gbit mlx5 driver uses page_pool, so perhaps I can
> run a packet benchmark that can show the (expected) performance improvement.
> 
> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
> [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c

Appreciate it!  I'm not expecting any performance change outside noise,
but things do surprise me.  I'd appreciate it if you'd test with a
"distro" config, ie enabling CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP so
we show the most expensive case.

> > I've only converted one user of the page_pool APIs to use the new netmem
> > APIs, all the others continue to use the page based ones.
> > 
> 
> I guess we/netdev-devels need to update the NIC drivers that use page_pool.

Oh, it's not a huge amount of work, and I don't mind doing it.  I only
did one in order to show the kinds of changes that are needed.  I can
do the mlx5 conversion now ...


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/24] Split page pools from struct page
  2022-12-05 16:31   ` Matthew Wilcox
@ 2022-12-06  9:43     ` Jesper Dangaard Brouer
  2022-12-06 16:08       ` Matthew Wilcox
  0 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2022-12-06  9:43 UTC (permalink / raw)
  To: Matthew Wilcox, Jesper Dangaard Brouer
  Cc: brouer, Jesper Dangaard Brouer, Ilias Apalodimas, netdev, linux-mm



On 05/12/2022 17.31, Matthew Wilcox wrote:
> On Mon, Dec 05, 2022 at 04:34:10PM +0100, Jesper Dangaard Brouer wrote:
>> I have a micro-benchmark [1][2], that I want to run on this patchset.
>> Reducing the asm code 'text' size is less likely to improve a
>> microbenchmark. The 100Gbit mlx5 driver uses page_pool, so perhaps I can
>> run a packet benchmark that can show the (expected) performance improvement.
>>
>> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
>> [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c
> 
> Appreciate it!  I'm not expecting any performance change outside noise,
> but things do surprise me.  I'd appreciate it if you'd test with a
> "distro" config, ie enabling CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP so
> we show the most expensive case.
> 

I have CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y, BUT it isn't enabled at
runtime by default.

Should I also choose CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
or enable it via sysctl?

  $ grep -H . /proc/sys/vm/hugetlb_optimize_vmemmap
  /proc/sys/vm/hugetlb_optimize_vmemmap:0

--Jesper



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 25/26] netpool: Additional utility functions
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2022-12-05 15:34 ` [PATCH 00/24] Split page pools from struct page Jesper Dangaard Brouer
@ 2022-12-06 16:05 ` Matthew Wilcox (Oracle)
  2022-12-06 16:05 ` [PATCH 26/26] mlx5: Convert to netmem Matthew Wilcox (Oracle)
  26 siblings, 0 replies; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-12-06 16:05 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

To be folded into earlier commit
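
A quick sketch of how the helpers below compose (invented names;
assumes the caller is in a safe context such as NAPI for the direct
recycle):

	#include <net/page_pool.h>

	static void my_napi_demo(struct page_pool *pool)
	{
		struct netmem *nmem;

		/* GFP_ATOMIC | __GFP_NOWARN under the hood. */
		nmem = page_pool_dev_alloc_netmem(pool);
		if (!nmem)
			return;

		/* ... fill netmem_address(nmem) and hand it to hardware ... */

		/* Equivalent of page_pool_recycle_direct() for a netmem. */
		page_pool_recycle_netmem(pool, nmem);
	}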
---
 include/net/page_pool.h | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 4878fe30f52c..94bad45ed8d0 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -117,16 +117,28 @@ static inline void *netmem_to_virt(const struct netmem *nmem)
 	return page_to_virt(netmem_page(nmem));
 }
 
+static inline void *netmem_address(const struct netmem *nmem)
+{
+	return page_address(netmem_page(nmem));
+}
+
 static inline int netmem_ref_count(const struct netmem *nmem)
 {
 	return page_ref_count(netmem_page(nmem));
 }
 
+static inline void netmem_get(struct netmem *nmem)
+{
+	struct folio *folio = (struct folio *)nmem;
+
+	folio_get(folio);
+}
+
 static inline void netmem_put(struct netmem *nmem)
 {
 	struct folio *folio = (struct folio *)nmem;
 
-	return folio_put(folio);
+	folio_put(folio);
 }
 
 static inline bool netmem_is_pfmemalloc(const struct netmem *nmem)
@@ -295,6 +307,11 @@ struct page_pool {
 
 struct netmem *page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp);
 
+static inline struct netmem *page_pool_dev_alloc_netmem(struct page_pool *pool)
+{
+	return page_pool_alloc_netmem(pool, GFP_ATOMIC | __GFP_NOWARN);
+}
+
 static inline
 struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
 {
@@ -452,6 +469,12 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 	page_pool_put_full_page(pool, page, true);
 }
 
+static inline void page_pool_recycle_netmem(struct page_pool *pool,
+					    struct netmem *nmem)
+{
+	page_pool_put_full_netmem(pool, nmem, true);
+}
+
 #define PAGE_POOL_DMA_USE_PP_FRAG_COUNT	\
 		(sizeof(dma_addr_t) > sizeof(unsigned long))
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 26/26] mlx5: Convert to netmem
  2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2022-12-06 16:05 ` [PATCH 25/26] netpool: Additional utility functions Matthew Wilcox (Oracle)
@ 2022-12-06 16:05 ` Matthew Wilcox (Oracle)
  2022-12-08 15:10   ` Jesper Dangaard Brouer
  26 siblings, 1 reply; 36+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-12-06 16:05 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: Matthew Wilcox (Oracle), netdev, linux-mm

Use the netmem APIs instead of the page_pool APIs.  Possibly we should
add a netmem equivalent of skb_add_rx_frag(), but that can happen
later.  Saves one call to compound_head() in the call to put_page()
in mlx5e_page_release_dynamic(), which saves 58 bytes of text.
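
To make the text-size claim concrete, a hedged illustration (not from
the patch) of where the saved compound_head() goes:

	#include <net/page_pool.h>

	/* Dropping the last reference on a buffer: put_page(netmem_page(nmem))
	 * would go through compound_head(), but a netmem always refers to a
	 * head page, so netmem_put() can skip that walk.
	 */
	static void my_drop_ref(struct netmem *nmem)
	{
		netmem_put(nmem);
	}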

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  10 +-
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |   4 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  23 ++--
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  12 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 130 +++++++++---------
 6 files changed, 93 insertions(+), 88 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index ff5b302531d5..f334f87273c9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -467,7 +467,7 @@ struct mlx5e_txqsq {
 } ____cacheline_aligned_in_smp;
 
 union mlx5e_alloc_unit {
-	struct page *page;
+	struct netmem *nmem;
 	struct xdp_buff *xsk;
 };
 
@@ -501,7 +501,7 @@ struct mlx5e_xdp_info {
 		} frame;
 		struct {
 			struct mlx5e_rq *rq;
-			struct page *page;
+			struct netmem *nmem;
 		} page;
 	};
 };
@@ -619,7 +619,7 @@ struct mlx5e_mpw_info {
 struct mlx5e_page_cache {
 	u32 head;
 	u32 tail;
-	struct page *page_cache[MLX5E_CACHE_SIZE];
+	struct netmem *page_cache[MLX5E_CACHE_SIZE];
 };
 
 struct mlx5e_rq;
@@ -657,13 +657,13 @@ struct mlx5e_rq_frags_info {
 
 struct mlx5e_dma_info {
 	dma_addr_t addr;
-	struct page *page;
+	struct netmem *nmem;
 };
 
 struct mlx5e_shampo_hd {
 	u32 mkey;
 	struct mlx5e_dma_info *info;
-	struct page *last_page;
+	struct netmem *last_nmem;
 	u16 hd_per_wq;
 	u16 hd_per_wqe;
 	unsigned long *bitmap;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index 853f312cd757..aa231d96c52c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -65,8 +65,8 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget);
 int mlx5e_poll_ico_cq(struct mlx5e_cq *cq);
 
 /* RX */
-void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct page *page);
-void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct page *page, bool recycle);
+void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct netmem *nmem);
+void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct netmem *nmem, bool recycle);
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq));
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq));
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 20507ef2f956..8e9136381592 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -57,7 +57,7 @@ int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk)
 
 static inline bool
 mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
-		    struct page *page, struct xdp_buff *xdp)
+		    struct netmem *nmem, struct xdp_buff *xdp)
 {
 	struct skb_shared_info *sinfo = NULL;
 	struct mlx5e_xmit_data xdptxd;
@@ -116,7 +116,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 	xdpi.mode = MLX5E_XDP_XMIT_MODE_PAGE;
 	xdpi.page.rq = rq;
 
-	dma_addr = page_pool_get_dma_addr(page) + (xdpf->data - (void *)xdpf);
+	dma_addr = netmem_get_dma_addr(nmem) + (xdpf->data - (void *)xdpf);
 	dma_sync_single_for_device(sq->pdev, dma_addr, xdptxd.len, DMA_BIDIRECTIONAL);
 
 	if (unlikely(xdp_frame_has_frags(xdpf))) {
@@ -127,7 +127,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 			dma_addr_t addr;
 			u32 len;
 
-			addr = page_pool_get_dma_addr(skb_frag_page(frag)) +
+			addr = netmem_get_dma_addr(skb_frag_netmem(frag)) +
 				skb_frag_off(frag);
 			len = skb_frag_size(frag);
 			dma_sync_single_for_device(sq->pdev, addr, len,
@@ -141,14 +141,14 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 				      mlx5e_xmit_xdp_frame, sq, &xdptxd, sinfo, 0)))
 		return false;
 
-	xdpi.page.page = page;
+	xdpi.page.nmem = nmem;
 	mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi);
 
 	if (unlikely(xdp_frame_has_frags(xdpf))) {
 		for (i = 0; i < sinfo->nr_frags; i++) {
 			skb_frag_t *frag = &sinfo->frags[i];
 
-			xdpi.page.page = skb_frag_page(frag);
+			xdpi.page.nmem = skb_frag_netmem(frag);
 			mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi);
 		}
 	}
@@ -157,7 +157,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 }
 
 /* returns true if packet was consumed by xdp */
-bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
+bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct netmem *nmem,
 		      struct bpf_prog *prog, struct xdp_buff *xdp)
 {
 	u32 act;
@@ -168,19 +168,19 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
 	case XDP_PASS:
 		return false;
 	case XDP_TX:
-		if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, page, xdp)))
+		if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, nmem, xdp)))
 			goto xdp_abort;
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); /* non-atomic */
 		return true;
 	case XDP_REDIRECT:
-		/* When XDP enabled then page-refcnt==1 here */
+		/* When XDP enabled then nmem->refcnt==1 here */
 		err = xdp_do_redirect(rq->netdev, xdp, prog);
 		if (unlikely(err))
 			goto xdp_abort;
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
 		__set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
 		if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL)
-			mlx5e_page_dma_unmap(rq, page);
+			mlx5e_page_dma_unmap(rq, nmem);
 		rq->stats->xdp_redirect++;
 		return true;
 	default:
@@ -445,7 +445,7 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd,
 			skb_frag_t *frag = &sinfo->frags[i];
 			dma_addr_t addr;
 
-			addr = page_pool_get_dma_addr(skb_frag_page(frag)) +
+			addr = netmem_get_dma_addr(skb_frag_netmem(frag)) +
 				skb_frag_off(frag);
 
 			dseg++;
@@ -495,7 +495,8 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
 			break;
 		case MLX5E_XDP_XMIT_MODE_PAGE:
 			/* XDP_TX from the regular RQ */
-			mlx5e_page_release_dynamic(xdpi.page.rq, xdpi.page.page, recycle);
+			mlx5e_page_release_dynamic(xdpi.page.rq,
+						xdpi.page.nmem, recycle);
 			break;
 		case MLX5E_XDP_XMIT_MODE_XSK:
 			/* AF_XDP send */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index bc2d9034af5b..5bc875f131a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -46,7 +46,7 @@
 
 struct mlx5e_xsk_param;
 int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
-bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
+bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct netmem *nmem,
 		      struct bpf_prog *prog, struct xdp_buff *xdp);
 void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq);
 bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 217c8a478977..e6b7a6b263ab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -555,16 +555,18 @@ static void mlx5e_rq_err_cqe_work(struct work_struct *recover_work)
 
 static int mlx5e_alloc_mpwqe_rq_drop_page(struct mlx5e_rq *rq)
 {
-	rq->wqe_overflow.page = alloc_page(GFP_KERNEL);
-	if (!rq->wqe_overflow.page)
+	struct page *page = alloc_page(GFP_KERNEL);
+	if (!page)
 		return -ENOMEM;
 
-	rq->wqe_overflow.addr = dma_map_page(rq->pdev, rq->wqe_overflow.page, 0,
+	rq->wqe_overflow.addr = dma_map_page(rq->pdev, page, 0,
 					     PAGE_SIZE, rq->buff.map_dir);
 	if (dma_mapping_error(rq->pdev, rq->wqe_overflow.addr)) {
-		__free_page(rq->wqe_overflow.page);
+		__free_page(page);
 		return -ENOMEM;
 	}
+
+	rq->wqe_overflow.nmem = page_netmem(page);
 	return 0;
 }
 
@@ -572,7 +574,7 @@ static void mlx5e_free_mpwqe_rq_drop_page(struct mlx5e_rq *rq)
 {
 	 dma_unmap_page(rq->pdev, rq->wqe_overflow.addr, PAGE_SIZE,
 			rq->buff.map_dir);
-	 __free_page(rq->wqe_overflow.page);
+	 __free_page(netmem_page(rq->wqe_overflow.nmem));
 }
 
 static int mlx5e_init_rxq_rq(struct mlx5e_channel *c, struct mlx5e_params *params,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index b1ea0b995d9c..77dee6138dc8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -274,7 +274,7 @@ static inline u32 mlx5e_decompress_cqes_start(struct mlx5e_rq *rq,
 	return mlx5e_decompress_cqes_cont(rq, wq, 1, budget_rem);
 }
 
-static inline bool mlx5e_rx_cache_put(struct mlx5e_rq *rq, struct page *page)
+static inline bool mlx5e_rx_cache_put(struct mlx5e_rq *rq, struct netmem *nmem)
 {
 	struct mlx5e_page_cache *cache = &rq->page_cache;
 	u32 tail_next = (cache->tail + 1) & (MLX5E_CACHE_SIZE - 1);
@@ -285,12 +285,12 @@ static inline bool mlx5e_rx_cache_put(struct mlx5e_rq *rq, struct page *page)
 		return false;
 	}
 
-	if (!dev_page_is_reusable(page)) {
+	if (!dev_page_is_reusable(netmem_page(nmem))) {
 		stats->cache_waive++;
 		return false;
 	}
 
-	cache->page_cache[cache->tail] = page;
+	cache->page_cache[cache->tail] = nmem;
 	cache->tail = tail_next;
 	return true;
 }
@@ -306,16 +306,16 @@ static inline bool mlx5e_rx_cache_get(struct mlx5e_rq *rq, union mlx5e_alloc_uni
 		return false;
 	}
 
-	if (page_ref_count(cache->page_cache[cache->head]) != 1) {
+	if (netmem_ref_count(cache->page_cache[cache->head]) != 1) {
 		stats->cache_busy++;
 		return false;
 	}
 
-	au->page = cache->page_cache[cache->head];
+	au->nmem = cache->page_cache[cache->head];
 	cache->head = (cache->head + 1) & (MLX5E_CACHE_SIZE - 1);
 	stats->cache_reuse++;
 
-	addr = page_pool_get_dma_addr(au->page);
+	addr = netmem_get_dma_addr(au->nmem);
 	/* Non-XSK always uses PAGE_SIZE. */
 	dma_sync_single_for_device(rq->pdev, addr, PAGE_SIZE, rq->buff.map_dir);
 	return true;
@@ -328,43 +328,45 @@ static inline int mlx5e_page_alloc_pool(struct mlx5e_rq *rq, union mlx5e_alloc_u
 	if (mlx5e_rx_cache_get(rq, au))
 		return 0;
 
-	au->page = page_pool_dev_alloc_pages(rq->page_pool);
-	if (unlikely(!au->page))
+	au->nmem = page_pool_dev_alloc_netmem(rq->page_pool);
+	if (unlikely(!au->nmem))
 		return -ENOMEM;
 
 	/* Non-XSK always uses PAGE_SIZE. */
-	addr = dma_map_page(rq->pdev, au->page, 0, PAGE_SIZE, rq->buff.map_dir);
+	addr = dma_map_page(rq->pdev, netmem_page(au->nmem), 0, PAGE_SIZE,
+				rq->buff.map_dir);
 	if (unlikely(dma_mapping_error(rq->pdev, addr))) {
-		page_pool_recycle_direct(rq->page_pool, au->page);
-		au->page = NULL;
+		page_pool_recycle_netmem(rq->page_pool, au->nmem);
+		au->nmem = NULL;
 		return -ENOMEM;
 	}
-	page_pool_set_dma_addr(au->page, addr);
+	netmem_set_dma_addr(au->nmem, addr);
 
 	return 0;
 }
 
-void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct page *page)
+void mlx5e_nmem_dma_unmap(struct mlx5e_rq *rq, struct netmem *nmem)
 {
-	dma_addr_t dma_addr = page_pool_get_dma_addr(page);
+	dma_addr_t dma_addr = netmem_get_dma_addr(nmem);
 
 	dma_unmap_page_attrs(rq->pdev, dma_addr, PAGE_SIZE, rq->buff.map_dir,
 			     DMA_ATTR_SKIP_CPU_SYNC);
-	page_pool_set_dma_addr(page, 0);
+	netmem_set_dma_addr(nmem, 0);
 }
 
-void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct page *page, bool recycle)
+void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct netmem *nmem,
+		bool recycle)
 {
 	if (likely(recycle)) {
-		if (mlx5e_rx_cache_put(rq, page))
+		if (mlx5e_rx_cache_put(rq, nmem))
 			return;
 
-		mlx5e_page_dma_unmap(rq, page);
-		page_pool_recycle_direct(rq->page_pool, page);
+		mlx5e_nmem_dma_unmap(rq, nmem);
+		page_pool_recycle_netmem(rq->page_pool, nmem);
 	} else {
-		mlx5e_page_dma_unmap(rq, page);
-		page_pool_release_page(rq->page_pool, page);
-		put_page(page);
+		mlx5e_nmem_dma_unmap(rq, nmem);
+		page_pool_release_netmem(rq->page_pool, nmem);
+		netmem_put(nmem);
 	}
 }
 
@@ -389,7 +391,7 @@ static inline void mlx5e_put_rx_frag(struct mlx5e_rq *rq,
 				     bool recycle)
 {
 	if (frag->last_in_page)
-		mlx5e_page_release_dynamic(rq, frag->au->page, recycle);
+		mlx5e_page_release_dynamic(rq, frag->au->nmem, recycle);
 }
 
 static inline struct mlx5e_wqe_frag_info *get_frag(struct mlx5e_rq *rq, u16 ix)
@@ -413,7 +415,7 @@ static int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe_cyc *wqe,
 			goto free_frags;
 
 		headroom = i == 0 ? rq->buff.headroom : 0;
-		addr = page_pool_get_dma_addr(frag->au->page);
+		addr = netmem_get_dma_addr(frag->au->nmem);
 		wqe->data[i].addr = cpu_to_be64(addr + frag->offset + headroom);
 	}
 
@@ -475,21 +477,21 @@ mlx5e_add_skb_frag(struct mlx5e_rq *rq, struct sk_buff *skb,
 		   union mlx5e_alloc_unit *au, u32 frag_offset, u32 len,
 		   unsigned int truesize)
 {
-	dma_addr_t addr = page_pool_get_dma_addr(au->page);
+	dma_addr_t addr = netmem_get_dma_addr(au->nmem);
 
 	dma_sync_single_for_cpu(rq->pdev, addr + frag_offset, len,
 				rq->buff.map_dir);
-	page_ref_inc(au->page);
+	netmem_get(au->nmem);
 	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
-			au->page, frag_offset, len, truesize);
+			netmem_page(au->nmem), frag_offset, len, truesize);
 }
 
 static inline void
 mlx5e_copy_skb_header(struct mlx5e_rq *rq, struct sk_buff *skb,
-		      struct page *page, dma_addr_t addr,
+		      struct netmem *nmem, dma_addr_t addr,
 		      int offset_from, int dma_offset, u32 headlen)
 {
-	const void *from = page_address(page) + offset_from;
+	const void *from = netmem_address(nmem) + offset_from;
 	/* Aligning len to sizeof(long) optimizes memcpy performance */
 	unsigned int len = ALIGN(headlen, sizeof(long));
 
@@ -522,7 +524,7 @@ mlx5e_free_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, bool recycle
 	} else {
 		for (i = 0; i < rq->mpwqe.pages_per_wqe; i++)
 			if (no_xdp_xmit || !test_bit(i, wi->xdp_xmit_bitmap))
-				mlx5e_page_release_dynamic(rq, alloc_units[i].page, recycle);
+				mlx5e_page_release_dynamic(rq, alloc_units[i].nmem, recycle);
 	}
 }
 
@@ -586,7 +588,7 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
 	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
 	u16 entries, pi, header_offset, err, wqe_bbs, new_entries;
 	u32 lkey = rq->mdev->mlx5e_res.hw_objs.mkey;
-	struct page *page = shampo->last_page;
+	struct netmem *nmem = shampo->last_nmem;
 	u64 addr = shampo->last_addr;
 	struct mlx5e_dma_info *dma_info;
 	struct mlx5e_umr_wqe *umr_wqe;
@@ -613,11 +615,11 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
 			err = mlx5e_page_alloc_pool(rq, &au);
 			if (unlikely(err))
 				goto err_unmap;
-			page = dma_info->page = au.page;
-			addr = dma_info->addr = page_pool_get_dma_addr(au.page);
+			nmem = dma_info->nmem = au.nmem;
+			addr = dma_info->addr = netmem_get_dma_addr(au.nmem);
 		} else {
 			dma_info->addr = addr + header_offset;
-			dma_info->page = page;
+			dma_info->nmem = nmem;
 		}
 
 update_klm:
@@ -635,7 +637,7 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
 	};
 
 	shampo->pi = (shampo->pi + new_entries) & (shampo->hd_per_wq - 1);
-	shampo->last_page = page;
+	shampo->last_nmem = nmem;
 	shampo->last_addr = addr;
 	sq->pc += wqe_bbs;
 	sq->doorbell_cseg = &umr_wqe->ctrl;
@@ -647,7 +649,7 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
 		dma_info = &shampo->info[--index];
 		if (!(i & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1))) {
 			dma_info->addr = ALIGN_DOWN(dma_info->addr, PAGE_SIZE);
-			mlx5e_page_release_dynamic(rq, dma_info->page, true);
+			mlx5e_page_release_dynamic(rq, dma_info->nmem, true);
 		}
 	}
 	rq->stats->buff_alloc_err++;
@@ -721,7 +723,7 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 		err = mlx5e_page_alloc_pool(rq, au);
 		if (unlikely(err))
 			goto err_unmap;
-		addr = page_pool_get_dma_addr(au->page);
+		addr = netmem_get_dma_addr(au->nmem);
 		umr_wqe->inline_mtts[i] = (struct mlx5_mtt) {
 			.ptag = cpu_to_be64(addr | MLX5_EN_WR),
 		};
@@ -752,7 +754,7 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 err_unmap:
 	while (--i >= 0) {
 		au--;
-		mlx5e_page_release_dynamic(rq, au->page, true);
+		mlx5e_page_release_dynamic(rq, au->nmem, true);
 	}
 
 err:
@@ -771,7 +773,7 @@ void mlx5e_shampo_dealloc_hd(struct mlx5e_rq *rq, u16 len, u16 start, bool close
 {
 	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
 	int hd_per_wq = shampo->hd_per_wq;
-	struct page *deleted_page = NULL;
+	struct netmem *deleted_nmem = NULL;
 	struct mlx5e_dma_info *hd_info;
 	int i, index = start;
 
@@ -784,9 +786,9 @@ void mlx5e_shampo_dealloc_hd(struct mlx5e_rq *rq, u16 len, u16 start, bool close
 
 		hd_info = &shampo->info[index];
 		hd_info->addr = ALIGN_DOWN(hd_info->addr, PAGE_SIZE);
-		if (hd_info->page != deleted_page) {
-			deleted_page = hd_info->page;
-			mlx5e_page_release_dynamic(rq, hd_info->page, false);
+		if (hd_info->nmem != deleted_nmem) {
+			deleted_nmem = hd_info->nmem;
+			mlx5e_page_release_dynamic(rq, hd_info->nmem, false);
 		}
 	}
 
@@ -1125,7 +1127,7 @@ static void *mlx5e_shampo_get_packet_hd(struct mlx5e_rq *rq, u16 header_index)
 	struct mlx5e_dma_info *last_head = &rq->mpwqe.shampo->info[header_index];
 	u16 head_offset = (last_head->addr & (PAGE_SIZE - 1)) + rq->buff.headroom;
 
-	return page_address(last_head->page) + head_offset;
+	return netmem_address(last_head->nmem) + head_offset;
 }
 
 static void mlx5e_shampo_update_ipv4_udp_hdr(struct mlx5e_rq *rq, struct iphdr *ipv4)
@@ -1584,11 +1586,11 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
 	dma_addr_t addr;
 	u32 frag_size;
 
-	va             = page_address(au->page) + wi->offset;
+	va             = netmem_address(au->nmem) + wi->offset;
 	data           = va + rx_headroom;
 	frag_size      = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt);
 
-	addr = page_pool_get_dma_addr(au->page);
+	addr = netmem_get_dma_addr(au->nmem);
 	dma_sync_single_range_for_cpu(rq->pdev, addr, wi->offset,
 				      frag_size, rq->buff.map_dir);
 	net_prefetch(data);
@@ -1599,7 +1601,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
 
 		net_prefetchw(va); /* xdp_frame data area */
 		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
-		if (mlx5e_xdp_handle(rq, au->page, prog, &xdp))
+		if (mlx5e_xdp_handle(rq, au->nmem, prog, &xdp))
 			return NULL; /* page/packet was consumed by XDP */
 
 		rx_headroom = xdp.data - xdp.data_hard_start;
@@ -1612,7 +1614,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi,
 		return NULL;
 
 	/* queue up for recycling/reuse */
-	page_ref_inc(au->page);
+	netmem_get(au->nmem);
 
 	return skb;
 }
@@ -1634,10 +1636,10 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
 	u32 truesize;
 	void *va;
 
-	va = page_address(au->page) + wi->offset;
+	va = netmem_address(au->nmem) + wi->offset;
 	frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt);
 
-	addr = page_pool_get_dma_addr(au->page);
+	addr = netmem_get_dma_addr(au->nmem);
 	dma_sync_single_range_for_cpu(rq->pdev, addr, wi->offset,
 				      rq->buff.frame0_sz, rq->buff.map_dir);
 	net_prefetchw(va); /* xdp_frame data area */
@@ -1658,7 +1660,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
 
 		frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt);
 
-		addr = page_pool_get_dma_addr(au->page);
+		addr = netmem_get_dma_addr(au->nmem);
 		dma_sync_single_for_cpu(rq->pdev, addr + wi->offset,
 					frag_consumed_bytes, rq->buff.map_dir);
 
@@ -1672,11 +1674,11 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
 		}
 
 		frag = &sinfo->frags[sinfo->nr_frags++];
-		__skb_frag_set_page(frag, au->page);
+		__skb_frag_set_netmem(frag, au->nmem);
 		skb_frag_off_set(frag, wi->offset);
 		skb_frag_size_set(frag, frag_consumed_bytes);
 
-		if (page_is_pfmemalloc(au->page))
+		if (netmem_is_pfmemalloc(au->nmem))
 			xdp_buff_set_frag_pfmemalloc(&xdp);
 
 		sinfo->xdp_frags_size += frag_consumed_bytes;
@@ -1690,7 +1692,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
 	au = head_wi->au;
 
 	prog = rcu_dereference(rq->xdp_prog);
-	if (prog && mlx5e_xdp_handle(rq, au->page, prog, &xdp)) {
+	if (prog && mlx5e_xdp_handle(rq, au->nmem, prog, &xdp)) {
 		if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
 			int i;
 
@@ -1707,7 +1709,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
 	if (unlikely(!skb))
 		return NULL;
 
-	page_ref_inc(au->page);
+	netmem_get(au->nmem);
 
 	if (unlikely(xdp_buff_has_frags(&xdp))) {
 		int i;
@@ -1956,8 +1958,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 
 	mlx5e_fill_skb_data(skb, rq, au, byte_cnt, frag_offset);
 	/* copy header */
-	addr = page_pool_get_dma_addr(head_au->page);
-	mlx5e_copy_skb_header(rq, skb, head_au->page, addr,
+	addr = netmem_get_dma_addr(head_au->nmem);
+	mlx5e_copy_skb_header(rq, skb, head_au->nmem, addr,
 			      head_offset, head_offset, headlen);
 	/* skb linear part was allocated with headlen and aligned to long */
 	skb->tail += headlen;
@@ -1985,11 +1987,11 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 		return NULL;
 	}
 
-	va             = page_address(au->page) + head_offset;
+	va             = netmem_address(au->nmem) + head_offset;
 	data           = va + rx_headroom;
 	frag_size      = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt);
 
-	addr = page_pool_get_dma_addr(au->page);
+	addr = netmem_get_dma_addr(au->nmem);
 	dma_sync_single_range_for_cpu(rq->pdev, addr, head_offset,
 				      frag_size, rq->buff.map_dir);
 	net_prefetch(data);
@@ -2000,7 +2002,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 
 		net_prefetchw(va); /* xdp_frame data area */
 		mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp);
-		if (mlx5e_xdp_handle(rq, au->page, prog, &xdp)) {
+		if (mlx5e_xdp_handle(rq, au->nmem, prog, &xdp)) {
 			if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
 				__set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */
 			return NULL; /* page/packet was consumed by XDP */
@@ -2016,7 +2018,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 		return NULL;
 
 	/* queue up for recycling/reuse */
-	page_ref_inc(au->page);
+	netmem_get(au->nmem);
 
 	return skb;
 }
@@ -2033,7 +2035,7 @@ mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 	void *hdr, *data;
 	u32 frag_size;
 
-	hdr		= page_address(head->page) + head_offset;
+	hdr		= netmem_address(head->nmem) + head_offset;
 	data		= hdr + rx_headroom;
 	frag_size	= MLX5_SKB_FRAG_SZ(rx_headroom + head_size);
 
@@ -2048,7 +2050,7 @@ mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 			return NULL;
 
 		/* queue up for recycling/reuse */
-		page_ref_inc(head->page);
+		netmem_get(head->nmem);
 
 	} else {
 		/* allocate SKB and copy header for large header */
@@ -2061,7 +2063,7 @@ mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 		}
 
 		prefetchw(skb->data);
-		mlx5e_copy_skb_header(rq, skb, head->page, head->addr,
+		mlx5e_copy_skb_header(rq, skb, head->nmem, head->addr,
 				      head_offset + rx_headroom,
 				      rx_headroom, head_size);
 		/* skb linear part was allocated with headlen and aligned to long */
@@ -2113,7 +2115,7 @@ mlx5e_free_rx_shampo_hd_entry(struct mlx5e_rq *rq, u16 header_index)
 
 	if (((header_index + 1) & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1)) == 0) {
 		shampo->info[header_index].addr = ALIGN_DOWN(addr, PAGE_SIZE);
-		mlx5e_page_release_dynamic(rq, shampo->info[header_index].page, true);
+		mlx5e_page_release_dynamic(rq, shampo->info[header_index].nmem, true);
 	}
 	bitmap_clear(shampo->bitmap, header_index, 1);
 }
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/24] Split page pools from struct page
  2022-12-06  9:43     ` Jesper Dangaard Brouer
@ 2022-12-06 16:08       ` Matthew Wilcox
  2022-12-08 15:33         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Wilcox @ 2022-12-06 16:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: brouer, Jesper Dangaard Brouer, Ilias Apalodimas, netdev, linux-mm

On Tue, Dec 06, 2022 at 10:43:05AM +0100, Jesper Dangaard Brouer wrote:
> 
> 
> On 05/12/2022 17.31, Matthew Wilcox wrote:
> > On Mon, Dec 05, 2022 at 04:34:10PM +0100, Jesper Dangaard Brouer wrote:
> > > I have a micro-benchmark [1][2], that I want to run on this patchset.
> > > Reducing the asm code 'text' size is less likely to improve a
> > > microbenchmark. The 100Gbit mlx5 driver uses page_pool, so perhaps I can
> > > run a packet benchmark that can show the (expected) performance improvement.
> > > 
> > > [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
> > > [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c
> > 
> > Appreciate it!  I'm not expecting any performance change outside noise,
> > but things do surprise me.  I'd appreciate it if you'd test with a
> > "distro" config, ie enabling CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP so
> > we show the most expensive case.
> > 
> 
> I have CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y, BUT it isn't enabled at
> runtime by default.

That's fine.  I think the vast majority of machines won't actually have
it enabled.  It's mostly useful for hosting setups where allocating 1GB
pages for VMs is common.

The mlx5 driver was straightforward, but showed some gaps in the API.
You'd already got the majority of the wins by using page_ref_inc()
instead of get_page(), but I did find one put_page() ;-)
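
For anyone following along, a rough sketch of why that swap matters
(illustrative wrappers only; not the exact upstream definitions of
get_page() and page_ref_inc()):

#include <linux/mm.h>
#include <linux/page_ref.h>

static inline void hold_page_via_get_page(struct page *page)
{
	get_page(page);		/* resolves compound_head() before the inc */
}

static inline void hold_page_via_ref_inc(struct page *page)
{
	page_ref_inc(page);	/* plain atomic_inc() on this exact page */
}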


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 26/26] mlx5: Convert to netmem
  2022-12-06 16:05 ` [PATCH 26/26] mlx5: Convert to netmem Matthew Wilcox (Oracle)
@ 2022-12-08 15:10   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2022-12-08 15:10 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Jesper Dangaard Brouer, Ilias Apalodimas
  Cc: brouer, netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 946 bytes --]


On 06/12/2022 17.05, Matthew Wilcox (Oracle) wrote:
> Use the netmem APIs instead of the page_pool APIs.  Possibly we should
> add a netmem equivalent of skb_add_rx_frag(), but that can happen
> later.  Saves one call to compound_head() in the call to put_page()
> in mlx5e_page_release_dynamic() which saves 58 bytes of text.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/en.h  |  10 +-
>   .../net/ethernet/mellanox/mlx5/core/en/txrx.h |   4 +-
>   .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  23 ++--
>   .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |   2 +-
>   .../net/ethernet/mellanox/mlx5/core/en_main.c |  12 +-
>   .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 130 +++++++++---------
>   6 files changed, 93 insertions(+), 88 deletions(-)

This doesn't compile... a patch that fixes it is attached.
(I've boot-tested it, but not run any mlx5 XDP tests yet.)
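
As an aside, a minimal sketch of the hypothetical skb_add_rx_frag()
equivalent mentioned in the commit message above (the name
skb_add_rx_frag_netmem() is an assumption; nothing like it exists in
this series yet):

#include <linux/skbuff.h>
#include <net/page_pool.h>

static inline void skb_add_rx_frag_netmem(struct sk_buff *skb, int i,
					  struct netmem *nmem, int off,
					  int size, unsigned int truesize)
{
	/* Just hides the netmem -> page conversion the driver now does
	 * by hand in mlx5e_add_skb_frag().
	 */
	skb_add_rx_frag(skb, i, netmem_page(nmem), off, size, truesize);
}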

--Jesper

[-- Attachment #2: 27-mlx5-fix --]
[-- Type: text/plain, Size: 1868 bytes --]

mlx5: fix up patch

From: Jesper Dangaard Brouer <brouer@redhat.com>

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c  |    3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index aa231d96c52c..688d3ea9aa36 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -65,7 +65,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget);
 int mlx5e_poll_ico_cq(struct mlx5e_cq *cq);
 
 /* RX */
-void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct netmem *nmem);
+void mlx5e_nmem_dma_unmap(struct mlx5e_rq *rq, struct netmem *nmem);
 void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct netmem *nmem, bool recycle);
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq));
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 8e9136381592..878e4e9f0f8b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -32,6 +32,7 @@
 
 #include <linux/bpf_trace.h>
 #include <net/xdp_sock_drv.h>
+#include "en/txrx.h"
 #include "en/xdp.h"
 #include "en/params.h"
 
@@ -180,7 +181,7 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct netmem *nmem,
 		__set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags);
 		__set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags);
 		if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL)
-			mlx5e_page_dma_unmap(rq, nmem);
+			mlx5e_nmem_dma_unmap(rq, nmem);
 		rq->stats->xdp_redirect++;
 		return true;
 	default:

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/24] Split page pools from struct page
  2022-12-06 16:08       ` Matthew Wilcox
@ 2022-12-08 15:33         ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2022-12-08 15:33 UTC (permalink / raw)
  To: Matthew Wilcox, Jesper Dangaard Brouer
  Cc: brouer, Jesper Dangaard Brouer, Ilias Apalodimas, netdev, linux-mm


On 06/12/2022 17.08, Matthew Wilcox wrote:
> On Tue, Dec 06, 2022 at 10:43:05AM +0100, Jesper Dangaard Brouer wrote:
>>
>> On 05/12/2022 17.31, Matthew Wilcox wrote:
>>> On Mon, Dec 05, 2022 at 04:34:10PM +0100, Jesper Dangaard Brouer wrote:
>>>> I have a micro-benchmark [1][2], that I want to run on this patchset.
>>>> Reducing the asm code 'text' size is less likely to improve a
>>>> microbenchmark. The 100Gbit mlx5 driver uses page_pool, so perhaps I can
>>>> run a packet benchmark that can show the (expected) performance improvement.
>>>>
>>>> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
>>>> [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c
>>>
>>> Appreciate it!  I'm not expecting any performance change outside noise,
>>> but things do surprise me.  I'd appreciate it if you'd test with a
>>> "distro" config, ie enabling CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP so
>>> we show the most expensive case.

I've tested with [1] and [2], and the performance numbers are the same
with and without the patchset.

Microbenchmark [1] is the easiest to compare; the numbers below were
basically identical in both cases.

  Type:tasklet_page_pool01_fast_path Per elem: 16 cycles(tsc) 4.484 ns
  Type:tasklet_page_pool02_ptr_ring Per elem: 47 cycles(tsc) 13.147 ns
  Type:tasklet_page_pool03_slow Per elem: 173 cycles(tsc) 48.278 ns

The last line (with 173 cycles) is when pages are not recycled, but
instead returned to the system's page allocator.  To relate this to
something: allocating order-0 pages via the normal page allocator API
costs approx 282 cycles(tsc) / 78.385 ns on this system (with this
.config).  I believe page_pool is faster because we leverage the bulk
page allocator.
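
For context, the idea is roughly the following (a simplified sketch,
not the actual page_pool refill code; pool_refill_sketch() and
POOL_BATCH are made-up names):

#include <linux/gfp.h>

#define POOL_BATCH 64	/* illustrative batch size */

/* Fill a cache of pages with one bulk call into the page allocator
 * instead of POOL_BATCH separate alloc_pages() round-trips.
 */
static unsigned long pool_refill_sketch(struct page **cache,
					gfp_t gfp, int nid)
{
	return alloc_pages_bulk_array_node(gfp, nid, POOL_BATCH, cache);
}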

--Jesper



^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2022-12-08 15:33 UTC | newest]

Thread overview: 36+ messages
2022-11-30 22:07 [PATCH 00/24] Split page pools from struct page Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 01/24] netmem: Create new type Matthew Wilcox (Oracle)
2022-12-05 14:42   ` Jesper Dangaard Brouer
2022-11-30 22:07 ` [PATCH 02/24] netmem: Add utility functions Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 03/24] page_pool: Add netmem_set_dma_addr() and netmem_get_dma_addr() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 04/24] page_pool: Convert page_pool_release_page() to page_pool_release_netmem() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 05/24] page_pool: Start using netmem in allocation path Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 06/24] page_pool: Convert page_pool_return_page() to page_pool_return_netmem() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 07/24] page_pool: Convert __page_pool_put_page() to __page_pool_put_netmem() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 08/24] page_pool: Convert pp_alloc_cache to contain netmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 09/24] page_pool: Convert page_pool_defrag_page() to page_pool_defrag_netmem() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 10/24] page_pool: Convert page_pool_put_defragged_page() to netmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 11/24] page_pool: Convert page_pool_empty_ring() to use netmem Matthew Wilcox (Oracle)
2022-12-02 21:25   ` Alexander H Duyck
2022-11-30 22:07 ` [PATCH 12/24] page_pool: Convert page_pool_alloc_pages() to page_pool_alloc_netmem() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 13/24] page_pool: Convert page_pool_dma_sync_for_device() to take a netmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 14/24] page_pool: Convert page_pool_recycle_in_cache() to netmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 15/24] page_pool: Remove page_pool_defrag_page() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 16/24] page_pool: Use netmem in page_pool_drain_frag() Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 17/24] page_pool: Convert page_pool_return_skb_page() to use netmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 18/24] page_pool: Convert frag_page to frag_nmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 19/24] xdp: Convert to netmem Matthew Wilcox (Oracle)
2022-11-30 22:07 ` [PATCH 20/24] mm: Remove page pool members from struct page Matthew Wilcox (Oracle)
2022-11-30 22:08 ` [PATCH 21/24] netmem_to_virt Matthew Wilcox (Oracle)
2022-11-30 22:08 ` [PATCH 22/24] page_pool: Pass a netmem to init_callback() Matthew Wilcox (Oracle)
2022-11-30 22:08 ` [PATCH 23/24] net: Add support for netmem in skb_frag Matthew Wilcox (Oracle)
2022-11-30 22:08 ` [PATCH 24/24] mvneta: Convert to netmem Matthew Wilcox (Oracle)
2022-12-05 15:34 ` [PATCH 00/24] Split page pools from struct page Jesper Dangaard Brouer
2022-12-05 15:44   ` Ilias Apalodimas
2022-12-05 16:31   ` Matthew Wilcox
2022-12-06  9:43     ` Jesper Dangaard Brouer
2022-12-06 16:08       ` Matthew Wilcox
2022-12-08 15:33         ` Jesper Dangaard Brouer
2022-12-06 16:05 ` [PATCH 25/26] netpool: Additional utility functions Matthew Wilcox (Oracle)
2022-12-06 16:05 ` [PATCH 26/26] mlx5: Convert to netmem Matthew Wilcox (Oracle)
2022-12-08 15:10   ` Jesper Dangaard Brouer
