* [PATCH net-next v4 0/5] introduce page_pool_alloc() API
@ 2023-06-12 13:02 Yunsheng Lin
  2023-06-12 13:02 ` [PATCH net-next v4 5/5] page_pool: update document about frag API Yunsheng Lin
  0 siblings, 1 reply; 6+ messages in thread
From: Yunsheng Lin @ 2023-06-12 13:02 UTC (permalink / raw)
  To: davem, kuba, pabeni
  Cc: netdev, linux-kernel, Yunsheng Lin, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek

In [1] & [2], there are use cases for veth and virtio_net to
use frag support in page pool to reduce memory usage, and they
may request different frag sizes depending on the head/tail
room space for xdp_frame/shinfo and the mtu/packet size. When the
requested frag size is large enough that a single page cannot
be split into more than one frag, using frag support only incurs
a performance penalty because of the extra frag count handling.

So this patchset provides a page pool API for the driver to
allocate memory with the least memory usage and performance
penalty when it doesn't know the size of memory it needs
beforehand.

1. https://patchwork.kernel.org/project/netdevbpf/patch/d3ae6bd3537fbce379382ac6a42f67e22f27ece2.1683896626.git.lorenzo@kernel.org/
2. https://patchwork.kernel.org/project/netdevbpf/patch/20230526054621.18371-3-liangchen.linux@gmail.com/
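
As a rough illustration of the point above about large frag sizes, the
arithmetic can be sketched in plain userspace C. The helper names below
are hypothetical and are not part of the page pool API; they only model
how many fixed-size frags fit in one page. Once the frag size exceeds
half of the page, only one frag fits, so splitting saves nothing and
only the frag count handling cost remains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the page pool API: number of
 * fixed-size frags that fit in one page of page_size bytes.
 */
static size_t model_frags_per_page(size_t page_size, size_t frag_size)
{
	if (frag_size == 0 || frag_size > page_size)
		return 0;
	return page_size / frag_size;
}

/* Bytes left unused in the page after packing as many frags as fit. */
static size_t model_wasted_bytes(size_t page_size, size_t frag_size)
{
	size_t used = model_frags_per_page(page_size, frag_size) * frag_size;

	assert(used <= page_size);
	return page_size - used;
}
```

With a 4096-byte page, a 2048-byte frag still allows two frags per page,
while a 2049-byte frag allows only one.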

V4: Fix a typo and add a patch to update the document about the
    frag API. PAGE_POOL_DMA_USE_PP_FRAG_COUNT is not renamed yet
    as we may need a different thread to discuss that.
V3: Incorporate changes from the discussion with Alexander,
    mostly the inline wrapper, PAGE_POOL_DMA_USE_PP_FRAG_COUNT
    change split to a separate patch, and comment changes.
V2: Add patch to remove PP_FLAG_PAGE_FRAG flags and mention
    virtio_net usecase in the cover letter.
V1: Drop RFC tag and page_pool_frag patch.

Yunsheng Lin (5):
  page_pool: frag API support for 32-bit arch with 64-bit DMA
  page_pool: unify frag_count handling in page_pool_is_last_frag()
  page_pool: introduce page_pool_alloc() API
  page_pool: remove PP_FLAG_PAGE_FRAG flag
  page_pool: update document about frag API

 Documentation/networking/page_pool.rst        |  34 +++-
 .../net/ethernet/hisilicon/hns3/hns3_enet.c   |   3 +-
 .../marvell/octeontx2/nic/otx2_common.c       |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  12 +-
 drivers/net/wireless/mediatek/mt76/mac80211.c |   2 +-
 include/net/page_pool.h                       | 153 +++++++++++++-----
 net/core/page_pool.c                          |  26 ++-
 net/core/skbuff.c                             |   2 +-
 8 files changed, 168 insertions(+), 66 deletions(-)

-- 
2.33.0



* [PATCH net-next v4 5/5] page_pool: update document about frag API
  2023-06-12 13:02 [PATCH net-next v4 0/5] introduce page_pool_alloc() API Yunsheng Lin
@ 2023-06-12 13:02 ` Yunsheng Lin
  2023-06-14  4:40   ` Jakub Kicinski
  0 siblings, 1 reply; 6+ messages in thread
From: Yunsheng Lin @ 2023-06-12 13:02 UTC (permalink / raw)
  To: davem, kuba, pabeni
  Cc: netdev, linux-kernel, Yunsheng Lin, Lorenzo Bianconi,
	Alexander Duyck, Jesper Dangaard Brouer, Ilias Apalodimas,
	Eric Dumazet, Jonathan Corbet, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, linux-doc, bpf

As more drivers begin to use the frag API, update the
document to help driver authors decide which API to use.

Also, there seems to be a similar document in page_pool.h,
so remove it to avoid the duplication.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
---
 Documentation/networking/page_pool.rst | 34 +++++++++++++++++++++-----
 include/net/page_pool.h                | 22 -----------------
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
index 873efd97f822..df3e28728008 100644
--- a/Documentation/networking/page_pool.rst
+++ b/Documentation/networking/page_pool.rst
@@ -4,12 +4,28 @@
 Page Pool API
 =============
 
-The page_pool allocator is optimized for the XDP mode that uses one frame
-per-page, but it can fallback on the regular page allocator APIs.
-
-Basic use involves replacing alloc_pages() calls with the
-page_pool_alloc_pages() call.  Drivers should use page_pool_dev_alloc_pages()
-replacing dev_alloc_pages().
+The page_pool allocator is optimized for recycling pages or page frags used by
+skb packets and xdp frames.
+
+Basic use involves replacing alloc_pages() calls with a page pool allocator
+API chosen according to the use case:
+1. page_pool_alloc_pages(): allocate memory without page splitting when the
+   driver knows that the memory it needs is always bigger than half of the
+   page allocated from the page pool. There is no cache line dirtying for
+   'struct page' when a page is recycled back to the page pool.
+
+2. page_pool_alloc_frag(): allocate memory with page splitting when the driver
+   knows that the memory it needs is always smaller than or equal to half of
+   the page allocated from the page pool. Page splitting saves memory and thus
+   avoids TLB/cache misses for data access, but it also has a cost, mainly
+   some cache line dirtying/bouncing for 'struct page' and an atomic operation
+   on page->pp_frag_count.
+
+3. page_pool_alloc(): allocate memory with or without page splitting depending
+   on the requested memory size, for when the driver doesn't know the size of
+   memory it needs beforehand. It is a mix of the above two cases, so it wraps
+   the above APIs to simplify the driver's memory allocation interface with
+   the least memory usage and performance penalty.
 
 API keeps track of in-flight pages, in order to let API user know
 when it is safe to free a page_pool object.  Thus, API users
@@ -93,6 +109,12 @@ a page will cause no race conditions is enough.
 * page_pool_dev_alloc_pages(): Get a page from the page allocator or page_pool
   caches.
 
+* page_pool_dev_alloc_frag(): Get a page frag from the page allocator or
+  page_pool caches.
+
+* page_pool_dev_alloc(): Get a page or page frag from the page allocator or
+  page_pool caches.
+
 * page_pool_get_dma_addr(): Retrieve the stored DMA address.
 
 * page_pool_get_dma_dir(): Retrieve the stored DMA direction.
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index f4fc339ff020..5fea37fd7767 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -5,28 +5,6 @@
  *	Copyright (C) 2016 Red Hat, Inc.
  */
 
-/**
- * DOC: page_pool allocator
- *
- * This page_pool allocator is optimized for the XDP mode that
- * uses one-frame-per-page, but have fallbacks that act like the
- * regular page allocator APIs.
- *
- * Basic use involve replacing alloc_pages() calls with the
- * page_pool_alloc_pages() call.  Drivers should likely use
- * page_pool_dev_alloc_pages() replacing dev_alloc_pages().
- *
- * API keeps track of in-flight pages, in-order to let API user know
- * when it is safe to dealloactor page_pool object.  Thus, API users
- * must make sure to call page_pool_release_page() when a page is
- * "leaving" the page_pool.  Or call page_pool_put_page() where
- * appropiate.  For maintaining correct accounting.
- *
- * API user must only call page_pool_put_page() once on a page, as it
- * will either recycle the page, or in case of elevated refcnt, it
- * will release the DMA mapping and in-flight state accounting.  We
- * hope to lift this requirement in the future.
- */
 #ifndef _NET_PAGE_POOL_H
 #define _NET_PAGE_POOL_H
 
-- 
2.33.0



* Re: [PATCH net-next v4 5/5] page_pool: update document about frag API
  2023-06-12 13:02 ` [PATCH net-next v4 5/5] page_pool: update document about frag API Yunsheng Lin
@ 2023-06-14  4:40   ` Jakub Kicinski
  2023-06-14 12:04     ` Yunsheng Lin
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2023-06-14  4:40 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi,
	Alexander Duyck, Jesper Dangaard Brouer, Ilias Apalodimas,
	Eric Dumazet, Jonathan Corbet, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, linux-doc, bpf

On Mon, 12 Jun 2023 21:02:56 +0800 Yunsheng Lin wrote:
> +2. page_pool_alloc_frag(): allocate memory with page splitting when the driver
> +   knows that the memory it needs is always smaller than or equal to half of
> +   the page allocated from the page pool. Page splitting saves memory and thus
> +   avoids TLB/cache misses for data access, but it also has a cost, mainly
> +   some cache line dirtying/bouncing for 'struct page' and an atomic operation
> +   on page->pp_frag_count.
> +
> +3. page_pool_alloc(): allocate memory with or without page splitting depending
> +   on the requested memory size, for when the driver doesn't know the size of
> +   memory it needs beforehand. It is a mix of the above two cases, so it wraps
> +   the above APIs to simplify the driver's memory allocation interface with
> +   the least memory usage and performance penalty.

Seems like the semantics of page_pool_alloc() are always better than
page_pool_alloc_frag(). Is there a reason to keep these two separate?


* Re: [PATCH net-next v4 5/5] page_pool: update document about frag API
  2023-06-14  4:40   ` Jakub Kicinski
@ 2023-06-14 12:04     ` Yunsheng Lin
  2023-06-14 16:56       ` Jakub Kicinski
  0 siblings, 1 reply; 6+ messages in thread
From: Yunsheng Lin @ 2023-06-14 12:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi,
	Alexander Duyck, Jesper Dangaard Brouer, Ilias Apalodimas,
	Eric Dumazet, Jonathan Corbet, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, linux-doc, bpf

On 2023/6/14 12:40, Jakub Kicinski wrote:
> On Mon, 12 Jun 2023 21:02:56 +0800 Yunsheng Lin wrote:
>> +2. page_pool_alloc_frag(): allocate memory with page splitting when the driver
>> +   knows that the memory it needs is always smaller than or equal to half of
>> +   the page allocated from the page pool. Page splitting saves memory and thus
>> +   avoids TLB/cache misses for data access, but it also has a cost, mainly
>> +   some cache line dirtying/bouncing for 'struct page' and an atomic operation
>> +   on page->pp_frag_count.
>> +
>> +3. page_pool_alloc(): allocate memory with or without page splitting depending
>> +   on the requested memory size, for when the driver doesn't know the size of
>> +   memory it needs beforehand. It is a mix of the above two cases, so it wraps
>> +   the above APIs to simplify the driver's memory allocation interface with
>> +   the least memory usage and performance penalty.
> 
> Seems like the semantics of page_pool_alloc() are always better than
> page_pool_alloc_frag(). Is there a reason to keep these two separate?

I agree that the semantics of page_pool_alloc() are better; I was thinking
about combining those two as well.
The reason I am keeping it is NIC hardware with a fixed buffer size for
each desc, where that buffer size is always smaller than or equal to half
of the page allocated from the page pool, so the driver doesn't need to do
the check of 'size << 1 > max_size' and doesn't care about the actual
truesize.
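
To make the check mentioned above concrete, here is a minimal userspace
sketch. The function name is hypothetical and this is not the code from
the patch; max_size is assumed to stand for the size of the page the
pool hands out. It only models the decision of whether a request is too
large to share a page with a second frag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the 'size << 1 > max_size' check: returns true
 * when two buffers of this size cannot share one page, so the request
 * would be served with a whole page instead of a frag.
 */
static bool model_needs_whole_page(unsigned int size, unsigned int max_size)
{
	/* Assumption: oversized requests are rejected before this point. */
	assert(size <= max_size);
	return (size << 1) > max_size;
}
```

A driver whose buffer size is fixed at or below half a page would always
get false here, which is why it can skip the check and call the frag
allocator directly.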



* Re: [PATCH net-next v4 5/5] page_pool: update document about frag API
  2023-06-14 12:04     ` Yunsheng Lin
@ 2023-06-14 16:56       ` Jakub Kicinski
  2023-06-15  6:49         ` Yunsheng Lin
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2023-06-14 16:56 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi,
	Alexander Duyck, Jesper Dangaard Brouer, Ilias Apalodimas,
	Eric Dumazet, Jonathan Corbet, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, linux-doc, bpf

On Wed, 14 Jun 2023 20:04:39 +0800 Yunsheng Lin wrote:
> > Seems like the semantics of page_pool_alloc() are always better than
> > page_pool_alloc_frag(). Is there a reason to keep these two separate?  
> 
> I agree that the semantics of page_pool_alloc() are better; I was thinking
> about combining those two as well.
> The reason I am keeping it is NIC hardware with a fixed buffer size for
> each desc, where that buffer size is always smaller than or equal to half
> of the page allocated from the page pool, so the driver doesn't need to do
> the check of 'size << 1 > max_size' and doesn't care about the actual
> truesize.

I see. Let's reorg the documentation, then? Something along the lines
of, maybe:


The page_pool allocator is optimized for recycling pages or page frags used by
skb packets and xdp frames.

Basic use involves replacing napi_alloc_frag() and alloc_pages() calls
with page_pool_alloc(). page_pool_alloc() allocates memory with or
without page splitting depending on the requested memory size.

If the driver knows that it always requires full pages or its allocations
are always smaller than half a page, it can use one of the more specific
API calls:

1. page_pool_alloc_pages(): allocate memory without page splitting when
   the driver knows that the memory it needs is always bigger than half of
   the page allocated from the page pool. There is no cache line dirtying
   for 'struct page' when a page is recycled back to the page pool.

2. page_pool_alloc_frag(): allocate memory with page splitting when the driver
   knows that the memory it needs is always smaller than or equal to half of
   the page allocated from the page pool. Page splitting saves memory and thus
   avoids TLB/cache misses for data access, but it also has a cost, mainly
   some cache line dirtying/bouncing for 'struct page' and an atomic operation
   on page->pp_frag_count.


* Re: [PATCH net-next v4 5/5] page_pool: update document about frag API
  2023-06-14 16:56       ` Jakub Kicinski
@ 2023-06-15  6:49         ` Yunsheng Lin
  0 siblings, 0 replies; 6+ messages in thread
From: Yunsheng Lin @ 2023-06-15  6:49 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi,
	Alexander Duyck, Jesper Dangaard Brouer, Ilias Apalodimas,
	Eric Dumazet, Jonathan Corbet, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, linux-doc, bpf

On 2023/6/15 0:56, Jakub Kicinski wrote:
> On Wed, 14 Jun 2023 20:04:39 +0800 Yunsheng Lin wrote:
>>> Seems like the semantics of page_pool_alloc() are always better than
>>> page_pool_alloc_frag(). Is there a reason to keep these two separate?  
>>
>> I agree that the semantics of page_pool_alloc() are better; I was thinking
>> about combining those two as well.
>> The reason I am keeping it is NIC hardware with a fixed buffer size for
>> each desc, where that buffer size is always smaller than or equal to half
>> of the page allocated from the page pool, so the driver doesn't need to do
>> the check of 'size << 1 > max_size' and doesn't care about the actual
>> truesize.
> 
> I see. Let's reorg the documentation, then? Something along the lines
> of, maybe:

There is still one thing I am not sure about with the page_pool_alloc() API:
it uses *size as both input and output. I am not sure whether that is a
general practice, or whether there is a better practice than this.
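
For reference, the in/out pattern in question can be sketched in
userspace C as below. This is a hypothetical model, not the function
from the patch: the name, the return convention and the 64-byte
alignment are all assumptions made for illustration. On input *size is
the requested size; on success it is overwritten with the size actually
reserved, which is what the caller would account as truesize.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ALIGN 64u	/* assumed alignment, for illustration only */

/* Hypothetical model: *size carries the request in and the reserved
 * size out. Returns 0 on success, -1 if the request cannot fit.
 */
static int model_reserve(unsigned int *size, unsigned int max_size)
{
	unsigned int want;

	assert(size != NULL);

	/* Round the request up to the assumed alignment. */
	want = (*size + MODEL_ALIGN - 1) / MODEL_ALIGN * MODEL_ALIGN;
	if (want > max_size)
		return -1;	/* *size is left untouched on failure */

	/* More than half the page: no split, report the whole page. */
	*size = (want << 1) > max_size ? max_size : want;
	return 0;
}
```

The trade-off is that the caller's variable is silently rewritten, so
the originally requested size must be saved first if it is still needed
afterwards.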


