netdev.vger.kernel.org archive mirror
* [PATCH v4 net-next 0/3] add DMA-sync-for-device capability to page_pool API
@ 2019-11-18 13:33 Lorenzo Bianconi
  2019-11-18 13:33 ` [PATCH v4 net-next 1/3] net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp Lorenzo Bianconi
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-18 13:33 UTC (permalink / raw)
  To: netdev
  Cc: davem, ilias.apalodimas, brouer, lorenzo.bianconi, mcroce,
	jonathan.lemon

Introduce the possibility to sync DMA memory for the device in the page_pool API.
This feature allows syncing only the DMA region that is actually needed instead
of always the full buffer (dma_sync_single_for_device can be very costly).
Please note DMA-sync-for-CPU is still the device driver's responsibility.
Relying on page_pool DMA sync, the mvneta driver improves XDP_DROP throughput
by about 170 Kpps:

- XDP_DROP DMA sync managed by mvneta driver:	~420Kpps
- XDP_DROP DMA sync managed by page_pool API:	~585Kpps
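
For reference, a minimal sketch of the resulting driver-side setup, put
together from patches 2/3 and 3/3 (size, dev, rx_offset and rx_buf_size are
placeholders, the real values are driver specific):

	#include <net/page_pool.h>

	struct page_pool_params pp_params = {
		.order		= 0,
		/* let page_pool map the pages and DMA-sync them for device */
		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
		.pool_size	= size,
		.nid		= cpu_to_node(0),
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
		.offset		= rx_offset,	/* where the HW starts writing RX data */
		.max_len	= rx_buf_size,	/* upper bound for the sync length */
	};
	struct page_pool *pool = page_pool_create(&pp_params);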

Changes since v3:
- move dma_sync_for_device before putting the page into the ptr_ring in
  __page_pool_recycle_into_ring, since the ptr_ring can be consumed
  concurrently. Simplify the code by moving dma_sync_for_device
  before running __page_pool_recycle_direct/__page_pool_recycle_into_ring

Changes since v2:
- rely on PP_FLAG_DMA_SYNC_DEV flag instead of dma_sync

Changes since v1:
- rename sync to dma_sync
- set dma_sync_size to 0xFFFFFFFF in page_pool_recycle_direct and
  page_pool_put_page routines
- Improve documentation

Lorenzo Bianconi (3):
  net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp
  net: page_pool: add the possibility to sync DMA memory for device
  net: mvneta: get rid of huge dma sync in mvneta_rx_refill

 drivers/net/ethernet/marvell/mvneta.c | 24 ++++++++++++++---------
 include/net/page_pool.h               | 21 ++++++++++++++------
 net/core/page_pool.c                  | 28 +++++++++++++++++++++++++--
 3 files changed, 56 insertions(+), 17 deletions(-)

-- 
2.21.0


* [PATCH v4 net-next 1/3] net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp
  2019-11-18 13:33 [PATCH v4 net-next 0/3] add DMA-sync-for-device capability to page_pool API Lorenzo Bianconi
@ 2019-11-18 13:33 ` Lorenzo Bianconi
  2019-11-18 13:33 ` [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device Lorenzo Bianconi
  2019-11-18 13:33 ` [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill Lorenzo Bianconi
  2 siblings, 0 replies; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-18 13:33 UTC (permalink / raw)
  To: netdev
  Cc: davem, ilias.apalodimas, brouer, lorenzo.bianconi, mcroce,
	jonathan.lemon

Rely on page_pool_recycle_direct instead of xdp_return_buff in
mvneta_run_xdp. This is a preliminary patch to limit the DMA sync length
to what is strictly necessary.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 12e03b15f0ab..f7713c2c68e1 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2097,7 +2097,8 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		err = xdp_do_redirect(pp->dev, xdp, prog);
 		if (err) {
 			ret = MVNETA_XDP_DROPPED;
-			xdp_return_buff(xdp);
+			page_pool_recycle_direct(rxq->page_pool,
+						 virt_to_head_page(xdp->data));
 		} else {
 			ret = MVNETA_XDP_REDIR;
 		}
@@ -2106,7 +2107,8 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	case XDP_TX:
 		ret = mvneta_xdp_xmit_back(pp, xdp);
 		if (ret != MVNETA_XDP_TX)
-			xdp_return_buff(xdp);
+			page_pool_recycle_direct(rxq->page_pool,
+						 virt_to_head_page(xdp->data));
 		break;
 	default:
 		bpf_warn_invalid_xdp_action(act);
-- 
2.21.0


* [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-18 13:33 [PATCH v4 net-next 0/3] add DMA-sync-for-device capability to page_pool API Lorenzo Bianconi
  2019-11-18 13:33 ` [PATCH v4 net-next 1/3] net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp Lorenzo Bianconi
@ 2019-11-18 13:33 ` Lorenzo Bianconi
  2019-11-19 11:23   ` Jesper Dangaard Brouer
  2019-11-18 13:33 ` [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill Lorenzo Bianconi
  2 siblings, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-18 13:33 UTC (permalink / raw)
  To: netdev
  Cc: davem, ilias.apalodimas, brouer, lorenzo.bianconi, mcroce,
	jonathan.lemon

Introduce the following parameters in order to add the possibility to sync
DMA memory for the device before putting allocated pages in the page_pool
caches:
- PP_FLAG_DMA_SYNC_DEV: if set in page_pool_params flags, all pages that
  the driver gets from page_pool will be DMA-synced-for-device according
  to the length provided by the device driver. Please note DMA-sync-for-CPU
  is still the device driver's responsibility
- offset: DMA address offset where the DMA engine starts copying RX data
- max_len: maximum DMA memory size page_pool is allowed to flush. This
  is currently used in the __page_pool_alloc_pages_slow routine when pages
  are allocated from the page allocator
These parameters are supposed to be set by device drivers.

This optimization reduces the length of the DMA-sync-for-device.
The optimization is valid because pages are initially
DMA-synced-for-device, as defined via max_len. At RX time, the driver
will perform a DMA-sync-for-CPU on the memory for the packet length.
What matters is the memory occupied by the packet payload, because
this is the area the CPU is allowed to read and modify. As we don't track
cache lines written into by the CPU, simply use the packet payload length
as dma_sync_size at page_pool recycle time. This also takes into account
any tail extension.
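
As an example (not part of the code changes below), a driver can then recycle
a page asking page_pool to DMA-sync-for-device only the bytes that were
actually used, along the lines of the sketch below (rxq and xdp are assumed
driver-local variables, see patch 3/3 for the real mvneta usage):

	/* sync only the area the CPU may have touched; the length is
	 * clamped to pool->p.max_len by page_pool_dma_sync_for_device()
	 */
	__page_pool_put_page(rxq->page_pool,
			     virt_to_head_page(xdp->data),
			     xdp->data_end - xdp->data_hard_start,
			     true);

Callers that do not know the length can keep using page_pool_put_page() or
page_pool_recycle_direct(), which pass dma_sync_size = -1 and therefore sync
up to max_len.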

Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/page_pool.h | 21 +++++++++++++++------
 net/core/page_pool.c    | 28 ++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 8 deletions(-)

diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 1121faa99c12..6f684c3a3434 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -34,8 +34,15 @@
 #include <linux/ptr_ring.h>
 #include <linux/dma-direction.h>
 
-#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
-#define PP_FLAG_ALL	PP_FLAG_DMA_MAP
+#define PP_FLAG_DMA_MAP		1 /* Should page_pool do the DMA map/unmap */
+#define PP_FLAG_DMA_SYNC_DEV	2 /* if set all pages that the driver gets
+				   * from page_pool will be
+				   * DMA-synced-for-device according to the
+				   * length provided by the device driver.
+				   * Please note DMA-sync-for-CPU is still
+				   * device driver responsibility
+				   */
+#define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
 
 /*
  * Fast allocation side cache array/stack
@@ -65,6 +72,8 @@ struct page_pool_params {
 	int		nid;  /* Numa node id to allocate from pages from */
 	struct device	*dev; /* device, for DMA pre-mapping purposes */
 	enum dma_data_direction dma_dir; /* DMA mapping direction */
+	unsigned int	max_len; /* max DMA sync memory size */
+	unsigned int	offset;  /* DMA addr offset */
 };
 
 struct page_pool {
@@ -149,8 +158,8 @@ static inline void page_pool_use_xdp_mem(struct page_pool *pool,
 #endif
 
 /* Never call this directly, use helpers below */
-void __page_pool_put_page(struct page_pool *pool,
-			  struct page *page, bool allow_direct);
+void __page_pool_put_page(struct page_pool *pool, struct page *page,
+			  unsigned int dma_sync_size, bool allow_direct);
 
 static inline void page_pool_put_page(struct page_pool *pool,
 				      struct page *page, bool allow_direct)
@@ -159,14 +168,14 @@ static inline void page_pool_put_page(struct page_pool *pool,
 	 * allow registering MEM_TYPE_PAGE_POOL, but shield linker.
 	 */
 #ifdef CONFIG_PAGE_POOL
-	__page_pool_put_page(pool, page, allow_direct);
+	__page_pool_put_page(pool, page, -1, allow_direct);
 #endif
 }
 /* Very limited use-cases allow recycle direct */
 static inline void page_pool_recycle_direct(struct page_pool *pool,
 					    struct page *page)
 {
-	__page_pool_put_page(pool, page, true);
+	__page_pool_put_page(pool, page, -1, true);
 }
 
 /* Disconnects a page (from a page_pool).  API users can have a need
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index dfc2501c35d9..4f9aed7bce5a 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
 	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
 		return -EINVAL;
 
+	/* In order to request DMA-sync-for-device the page needs to
+	 * be mapped
+	 */
+	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
+	    !(pool->p.flags & PP_FLAG_DMA_MAP))
+		return -EINVAL;
+
 	if (ptr_ring_init(&pool->ring, ring_qsize, GFP_KERNEL) < 0)
 		return -ENOMEM;
 
@@ -115,6 +122,16 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
 	return page;
 }
 
+static void page_pool_dma_sync_for_device(struct page_pool *pool,
+					  struct page *page,
+					  unsigned int dma_sync_size)
+{
+	dma_sync_size = min(dma_sync_size, pool->p.max_len);
+	dma_sync_single_range_for_device(pool->p.dev, page->dma_addr,
+					 pool->p.offset, dma_sync_size,
+					 pool->p.dma_dir);
+}
+
 /* slow path */
 noinline
 static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
@@ -159,6 +176,9 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	}
 	page->dma_addr = dma;
 
+	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
+		page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
+
 skip_dma_map:
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
@@ -281,8 +301,8 @@ static bool __page_pool_recycle_direct(struct page *page,
 	return true;
 }
 
-void __page_pool_put_page(struct page_pool *pool,
-			  struct page *page, bool allow_direct)
+void __page_pool_put_page(struct page_pool *pool, struct page *page,
+			  unsigned int dma_sync_size, bool allow_direct)
 {
 	/* This allocator is optimized for the XDP mode that uses
 	 * one-frame-per-page, but have fallbacks that act like the
@@ -293,6 +313,10 @@ void __page_pool_put_page(struct page_pool *pool,
 	if (likely(page_ref_count(page) == 1)) {
 		/* Read barrier done in page_ref_count / READ_ONCE */
 
+		if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
+			page_pool_dma_sync_for_device(pool, page,
+						      dma_sync_size);
+
 		if (allow_direct && in_serving_softirq())
 			if (__page_pool_recycle_direct(page, pool))
 				return;
-- 
2.21.0


* [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-18 13:33 [PATCH v4 net-next 0/3] add DMA-sync-for-device capability to page_pool API Lorenzo Bianconi
  2019-11-18 13:33 ` [PATCH v4 net-next 1/3] net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp Lorenzo Bianconi
  2019-11-18 13:33 ` [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device Lorenzo Bianconi
@ 2019-11-18 13:33 ` Lorenzo Bianconi
  2019-11-19 11:38   ` Jesper Dangaard Brouer
  2 siblings, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-18 13:33 UTC (permalink / raw)
  To: netdev
  Cc: davem, ilias.apalodimas, brouer, lorenzo.bianconi, mcroce,
	jonathan.lemon

Get rid of the costly dma_sync_single_for_device in mvneta_rx_refill,
since the driver can now let the page_pool API manage the needed DMA
sync with a proper size.

- XDP_DROP DMA sync managed by mvneta driver:	~420Kpps
- XDP_DROP DMA sync managed by page_pool API:	~585Kpps

Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index f7713c2c68e1..a06d109c9e80 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1846,7 +1846,6 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 			    struct mvneta_rx_queue *rxq,
 			    gfp_t gfp_mask)
 {
-	enum dma_data_direction dma_dir;
 	dma_addr_t phys_addr;
 	struct page *page;
 
@@ -1856,9 +1855,6 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 		return -ENOMEM;
 
 	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
-	dma_dir = page_pool_get_dma_dir(rxq->page_pool);
-	dma_sync_single_for_device(pp->dev->dev.parent, phys_addr,
-				   MVNETA_MAX_RX_BUF_SIZE, dma_dir);
 	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
 
 	return 0;
@@ -2097,8 +2093,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		err = xdp_do_redirect(pp->dev, xdp, prog);
 		if (err) {
 			ret = MVNETA_XDP_DROPPED;
-			page_pool_recycle_direct(rxq->page_pool,
-						 virt_to_head_page(xdp->data));
+			__page_pool_put_page(rxq->page_pool,
+					virt_to_head_page(xdp->data),
+					xdp->data_end - xdp->data_hard_start,
+					true);
 		} else {
 			ret = MVNETA_XDP_REDIR;
 		}
@@ -2107,8 +2105,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	case XDP_TX:
 		ret = mvneta_xdp_xmit_back(pp, xdp);
 		if (ret != MVNETA_XDP_TX)
-			page_pool_recycle_direct(rxq->page_pool,
-						 virt_to_head_page(xdp->data));
+			__page_pool_put_page(rxq->page_pool,
+					virt_to_head_page(xdp->data),
+					xdp->data_end - xdp->data_hard_start,
+					true);
 		break;
 	default:
 		bpf_warn_invalid_xdp_action(act);
@@ -2117,8 +2117,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		trace_xdp_exception(pp->dev, prog, act);
 		/* fall through */
 	case XDP_DROP:
-		page_pool_recycle_direct(rxq->page_pool,
-					 virt_to_head_page(xdp->data));
+		__page_pool_put_page(rxq->page_pool,
+				     virt_to_head_page(xdp->data),
+				     xdp->data_end - xdp->data_hard_start,
+				     true);
 		ret = MVNETA_XDP_DROPPED;
 		break;
 	}
@@ -3067,11 +3069,13 @@ static int mvneta_create_page_pool(struct mvneta_port *pp,
 	struct bpf_prog *xdp_prog = READ_ONCE(pp->xdp_prog);
 	struct page_pool_params pp_params = {
 		.order = 0,
-		.flags = PP_FLAG_DMA_MAP,
+		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
 		.pool_size = size,
 		.nid = cpu_to_node(0),
 		.dev = pp->dev->dev.parent,
 		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
+		.offset = pp->rx_offset_correction,
+		.max_len = MVNETA_MAX_RX_BUF_SIZE,
 	};
 	int err;
 
-- 
2.21.0


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-18 13:33 ` [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device Lorenzo Bianconi
@ 2019-11-19 11:23   ` Jesper Dangaard Brouer
  2019-11-19 11:33     ` Ilias Apalodimas
  2019-11-19 12:14     ` Lorenzo Bianconi
  0 siblings, 2 replies; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2019-11-19 11:23 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: netdev, davem, ilias.apalodimas, lorenzo.bianconi, mcroce,
	jonathan.lemon, brouer

On Mon, 18 Nov 2019 15:33:45 +0200
Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> index 1121faa99c12..6f684c3a3434 100644
> --- a/include/net/page_pool.h
> +++ b/include/net/page_pool.h
> @@ -34,8 +34,15 @@
>  #include <linux/ptr_ring.h>
>  #include <linux/dma-direction.h>
>  
> -#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> -#define PP_FLAG_ALL	PP_FLAG_DMA_MAP
> +#define PP_FLAG_DMA_MAP		1 /* Should page_pool do the DMA map/unmap */
> +#define PP_FLAG_DMA_SYNC_DEV	2 /* if set all pages that the driver gets
> +				   * from page_pool will be
> +				   * DMA-synced-for-device according to the
> +				   * length provided by the device driver.
> +				   * Please note DMA-sync-for-CPU is still
> +				   * device driver responsibility
> +				   */
> +#define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
>  
[...]

Can you please change this to use the BIT(X) api.

#include <linux/bits.h>

#define PP_FLAG_DMA_MAP		BIT(0)
#define PP_FLAG_DMA_SYNC_DEV	BIT(1)



> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index dfc2501c35d9..4f9aed7bce5a 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
>  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
>  		return -EINVAL;
>  
> +	/* In order to request DMA-sync-for-device the page needs to
> +	 * be mapped
> +	 */
> +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> +		return -EINVAL;
> +

I like that you have moved this check to setup time.

There are two other parameters the DMA_SYNC_DEV depend on:

 	struct page_pool_params pp_params = {
 		.order = 0,
-		.flags = PP_FLAG_DMA_MAP,
+		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
 		.pool_size = size,
 		.nid = cpu_to_node(0),
 		.dev = pp->dev->dev.parent,
 		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
+		.offset = pp->rx_offset_correction,
+		.max_len = MVNETA_MAX_RX_BUF_SIZE,
 	};

Can you add a check that .max_len must not be zero?  The reason is
that I can easily see people misconfiguring this.  And the effect is
that the DMA-sync-for-device is essentially disabled, without the user
realizing this. The not-realizing part is really bad, especially
because the bugs that can occur from this are very rare and hard to catch.

I'm up for discussing whether there should be a similar check for .offset.
IMHO we should also check that .offset is configured, and then be open to
removing this check once a driver user wants to use offset=0.  Does the
mvneta driver already have a use-case for this (in non-XDP mode)?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 11:23   ` Jesper Dangaard Brouer
@ 2019-11-19 11:33     ` Ilias Apalodimas
  2019-11-19 15:11       ` Jesper Dangaard Brouer
  2019-11-19 12:14     ` Lorenzo Bianconi
  1 sibling, 1 reply; 19+ messages in thread
From: Ilias Apalodimas @ 2019-11-19 11:33 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Lorenzo Bianconi, netdev, davem, lorenzo.bianconi, mcroce,
	jonathan.lemon

> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index dfc2501c35d9..4f9aed7bce5a 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> >  		return -EINVAL;
> >  
> > +	/* In order to request DMA-sync-for-device the page needs to
> > +	 * be mapped
> > +	 */
> > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > +		return -EINVAL;
> > +
> 
> I like that you have moved this check to setup time.
> 
> There are two other parameters the DMA_SYNC_DEV depend on:
> 
>  	struct page_pool_params pp_params = {
>  		.order = 0,
> -		.flags = PP_FLAG_DMA_MAP,
> +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
>  		.pool_size = size,
>  		.nid = cpu_to_node(0),
>  		.dev = pp->dev->dev.parent,
>  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> +		.offset = pp->rx_offset_correction,
> +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
>  	};
> 
> Can you add a check, that .max_len must not be zero.  The reason is
> that I can easily see people misconfiguring this.  And the effect is
> that the DMA-sync-for-device is essentially disabled, without user
> realizing this. The not-realizing part is really bad, especially
> because bugs that can occur from this are very rare and hard to catch.

+1, we sync based on the min() value of those.

> 
> I'm up for discussing if there should be a similar check for .offset.
> IMHO we should also check .offset is configured, and then be open to
> remove this check once a driver user want to use offset=0.  Does the
> mvneta driver already have a use-case for this (in non-XDP mode)?

Not sure about this, since a wrong offset does not break anything apart from
causing some performance hit.

Cheers
/Ilias
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 

* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-18 13:33 ` [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill Lorenzo Bianconi
@ 2019-11-19 11:38   ` Jesper Dangaard Brouer
  2019-11-19 12:19     ` Lorenzo Bianconi
  0 siblings, 1 reply; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2019-11-19 11:38 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: netdev, davem, ilias.apalodimas, lorenzo.bianconi, mcroce,
	jonathan.lemon, brouer

On Mon, 18 Nov 2019 15:33:46 +0200
Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index f7713c2c68e1..a06d109c9e80 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
[...]
> @@ -2097,8 +2093,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
>  		err = xdp_do_redirect(pp->dev, xdp, prog);
>  		if (err) {
>  			ret = MVNETA_XDP_DROPPED;
> -			page_pool_recycle_direct(rxq->page_pool,
> -						 virt_to_head_page(xdp->data));
> +			__page_pool_put_page(rxq->page_pool,
> +					virt_to_head_page(xdp->data),
> +					xdp->data_end - xdp->data_hard_start,
> +					true);
>  		} else {
>  			ret = MVNETA_XDP_REDIR;
>  		}
> @@ -2107,8 +2105,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
>  	case XDP_TX:
>  		ret = mvneta_xdp_xmit_back(pp, xdp);
>  		if (ret != MVNETA_XDP_TX)
> -			page_pool_recycle_direct(rxq->page_pool,
> -						 virt_to_head_page(xdp->data));
> +			__page_pool_put_page(rxq->page_pool,
> +					virt_to_head_page(xdp->data),
> +					xdp->data_end - xdp->data_hard_start,
> +					true);
>  		break;
>  	default:
>  		bpf_warn_invalid_xdp_action(act);
> @@ -2117,8 +2117,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
>  		trace_xdp_exception(pp->dev, prog, act);
>  		/* fall through */
>  	case XDP_DROP:
> -		page_pool_recycle_direct(rxq->page_pool,
> -					 virt_to_head_page(xdp->data));
> +		__page_pool_put_page(rxq->page_pool,
> +				     virt_to_head_page(xdp->data),
> +				     xdp->data_end - xdp->data_hard_start,
> +				     true);

This does beg the question: should we create an API wrapper for
this in the header file?

But what to name it?

I know Jonathan doesn't like the "direct" part of the previous function
name page_pool_recycle_direct.  (I did consider calling this 'napi'
instead, as it would be in line with networking use-cases, but it seemed
limiting if other subsystems end up using this.)

Does 'page_pool_put_page_len' sound better?

But I also want to hide the bool 'allow_direct' in the API name.
(It makes it easier to identify users that call this from softirq.)

Going for 'page_pool_put_page_len_napi' starts to become rather long.
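
To make the discussion concrete, a strawman of what such wrappers could look
like (the names are exactly what is being debated here, the signature is the
one from patch 2/3):

static inline void page_pool_put_page_len(struct page_pool *pool,
					  struct page *page,
					  unsigned int dma_sync_size)
{
#ifdef CONFIG_PAGE_POOL
	__page_pool_put_page(pool, page, dma_sync_size, false);
#endif
}

/* variant for softirq/NAPI context, hiding the 'allow_direct' bool */
static inline void page_pool_put_page_len_napi(struct page_pool *pool,
					       struct page *page,
					       unsigned int dma_sync_size)
{
#ifdef CONFIG_PAGE_POOL
	__page_pool_put_page(pool, page, dma_sync_size, true);
#endif
}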

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 11:23   ` Jesper Dangaard Brouer
  2019-11-19 11:33     ` Ilias Apalodimas
@ 2019-11-19 12:14     ` Lorenzo Bianconi
  2019-11-19 15:13       ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-19 12:14 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Lorenzo Bianconi, netdev, davem, ilias.apalodimas, mcroce,
	jonathan.lemon

> On Mon, 18 Nov 2019 15:33:45 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> 
> > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > index 1121faa99c12..6f684c3a3434 100644
> > --- a/include/net/page_pool.h
> > +++ b/include/net/page_pool.h
> > @@ -34,8 +34,15 @@
> >  #include <linux/ptr_ring.h>
> >  #include <linux/dma-direction.h>
> >  
> > -#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> > -#define PP_FLAG_ALL	PP_FLAG_DMA_MAP
> > +#define PP_FLAG_DMA_MAP		1 /* Should page_pool do the DMA map/unmap */
> > +#define PP_FLAG_DMA_SYNC_DEV	2 /* if set all pages that the driver gets
> > +				   * from page_pool will be
> > +				   * DMA-synced-for-device according to the
> > +				   * length provided by the device driver.
> > +				   * Please note DMA-sync-for-CPU is still
> > +				   * device driver responsibility
> > +				   */
> > +#define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
> >  
> [...]
> 
> Can you please change this to use the BIT(X) api.
> 
> #include <linux/bits.h>
> 
> #define PP_FLAG_DMA_MAP		BIT(0)
> #define PP_FLAG_DMA_SYNC_DEV	BIT(1)

Hi Jesper,

sure, will do in v5

> 
> 
> 
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index dfc2501c35d9..4f9aed7bce5a 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> >  		return -EINVAL;
> >  
> > +	/* In order to request DMA-sync-for-device the page needs to
> > +	 * be mapped
> > +	 */
> > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > +		return -EINVAL;
> > +
> 
> I like that you have moved this check to setup time.
> 
> There are two other parameters the DMA_SYNC_DEV depend on:
> 
>  	struct page_pool_params pp_params = {
>  		.order = 0,
> -		.flags = PP_FLAG_DMA_MAP,
> +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
>  		.pool_size = size,
>  		.nid = cpu_to_node(0),
>  		.dev = pp->dev->dev.parent,
>  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> +		.offset = pp->rx_offset_correction,
> +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
>  	};
> 
> Can you add a check, that .max_len must not be zero.  The reason is
> that I can easily see people misconfiguring this.  And the effect is
> that the DMA-sync-for-device is essentially disabled, without user
> realizing this. The not-realizing part is really bad, especially
> because bugs that can occur from this are very rare and hard to catch.

I guess we only need to check it if we provide PP_FLAG_DMA_SYNC_DEV.
Something like:

	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
		if (!(pool->p.flags & PP_FLAG_DMA_MAP))
			return -EINVAL;

		if (!pool->p.max_len)
			return -EINVAL;
	}

> 
> I'm up for discussing if there should be a similar check for .offset.
> IMHO we should also check .offset is configured, and then be open to
> remove this check once a driver user want to use offset=0.  Does the
> mvneta driver already have a use-case for this (in non-XDP mode)?

With 'non-XDP mode' do you mean not loading a BPF program? If so, yes, it is
used in __page_pool_alloc_pages_slow when getting pages from the page allocator.
What would be the right minimum value for it? Just 0, or
XDP_PACKET_HEADROOM/NET_SKB_PAD? I guess it matters here whether a BPF program
is loaded or not.

Regards,
Lorenzo

> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 


* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-19 11:38   ` Jesper Dangaard Brouer
@ 2019-11-19 12:19     ` Lorenzo Bianconi
  2019-11-19 14:51       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-19 12:19 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Lorenzo Bianconi, netdev, davem, ilias.apalodimas, mcroce,
	jonathan.lemon

> On Mon, 18 Nov 2019 15:33:46 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> 
> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > index f7713c2c68e1..a06d109c9e80 100644
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> [...]
> > @@ -2097,8 +2093,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> >  		err = xdp_do_redirect(pp->dev, xdp, prog);
> >  		if (err) {
> >  			ret = MVNETA_XDP_DROPPED;
> > -			page_pool_recycle_direct(rxq->page_pool,
> > -						 virt_to_head_page(xdp->data));
> > +			__page_pool_put_page(rxq->page_pool,
> > +					virt_to_head_page(xdp->data),
> > +					xdp->data_end - xdp->data_hard_start,
> > +					true);
> >  		} else {
> >  			ret = MVNETA_XDP_REDIR;
> >  		}
> > @@ -2107,8 +2105,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> >  	case XDP_TX:
> >  		ret = mvneta_xdp_xmit_back(pp, xdp);
> >  		if (ret != MVNETA_XDP_TX)
> > -			page_pool_recycle_direct(rxq->page_pool,
> > -						 virt_to_head_page(xdp->data));
> > +			__page_pool_put_page(rxq->page_pool,
> > +					virt_to_head_page(xdp->data),
> > +					xdp->data_end - xdp->data_hard_start,
> > +					true);
> >  		break;
> >  	default:
> >  		bpf_warn_invalid_xdp_action(act);
> > @@ -2117,8 +2117,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> >  		trace_xdp_exception(pp->dev, prog, act);
> >  		/* fall through */
> >  	case XDP_DROP:
> > -		page_pool_recycle_direct(rxq->page_pool,
> > -					 virt_to_head_page(xdp->data));
> > +		__page_pool_put_page(rxq->page_pool,
> > +				     virt_to_head_page(xdp->data),
> > +				     xdp->data_end - xdp->data_hard_start,
> > +				     true);
> 
> This does beg for the question: Should we create an API wrapper for
> this in the header file?
> 
> But what to name it?
> 
> I know Jonathan doesn't like the "direct" part of the  previous function
> name page_pool_recycle_direct.  (I do considered calling this 'napi'
> instead, as it would be inline with networking use-cases, but it seemed
> limited if other subsystem end-up using this).
> 
> Does is 'page_pool_put_page_len' sound better?
> 
> But I want also want hide the bool 'allow_direct' in the API name.
> (As it makes it easier to identify users that uses this from softirq)
> 
> Going for 'page_pool_put_page_len_napi' starts to be come rather long.

What about removing the second 'page'? Something like:
- page_pool_put_len_napi()

Regards,
Lorenzo

> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 


* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-19 12:19     ` Lorenzo Bianconi
@ 2019-11-19 14:51       ` Jesper Dangaard Brouer
  2019-11-19 15:38         ` Lorenzo Bianconi
  0 siblings, 1 reply; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2019-11-19 14:51 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Lorenzo Bianconi, netdev, davem, ilias.apalodimas, mcroce,
	jonathan.lemon, brouer

On Tue, 19 Nov 2019 14:19:11 +0200
Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:

> > On Mon, 18 Nov 2019 15:33:46 +0200
> > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >   
> > > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > > index f7713c2c68e1..a06d109c9e80 100644
> > > --- a/drivers/net/ethernet/marvell/mvneta.c
> > > +++ b/drivers/net/ethernet/marvell/mvneta.c  
> > [...]  
> > > @@ -2097,8 +2093,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> > >  		err = xdp_do_redirect(pp->dev, xdp, prog);
> > >  		if (err) {
> > >  			ret = MVNETA_XDP_DROPPED;
> > > -			page_pool_recycle_direct(rxq->page_pool,
> > > -						 virt_to_head_page(xdp->data));
> > > +			__page_pool_put_page(rxq->page_pool,
> > > +					virt_to_head_page(xdp->data),
> > > +					xdp->data_end - xdp->data_hard_start,
> > > +					true);
> > >  		} else {
> > >  			ret = MVNETA_XDP_REDIR;
> > >  		}
> > > @@ -2107,8 +2105,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> > >  	case XDP_TX:
> > >  		ret = mvneta_xdp_xmit_back(pp, xdp);
> > >  		if (ret != MVNETA_XDP_TX)
> > > -			page_pool_recycle_direct(rxq->page_pool,
> > > -						 virt_to_head_page(xdp->data));
> > > +			__page_pool_put_page(rxq->page_pool,
> > > +					virt_to_head_page(xdp->data),
> > > +					xdp->data_end - xdp->data_hard_start,
> > > +					true);
> > >  		break;
> > >  	default:
> > >  		bpf_warn_invalid_xdp_action(act);
> > > @@ -2117,8 +2117,10 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> > >  		trace_xdp_exception(pp->dev, prog, act);
> > >  		/* fall through */
> > >  	case XDP_DROP:
> > > -		page_pool_recycle_direct(rxq->page_pool,
> > > -					 virt_to_head_page(xdp->data));
> > > +		__page_pool_put_page(rxq->page_pool,
> > > +				     virt_to_head_page(xdp->data),
> > > +				     xdp->data_end - xdp->data_hard_start,
> > > +				     true);  
> > 
> > This does beg for the question: Should we create an API wrapper for
> > this in the header file?
> > 
> > But what to name it?
> > 
> > I know Jonathan doesn't like the "direct" part of the  previous function
> > name page_pool_recycle_direct.  (I do considered calling this 'napi'
> > instead, as it would be inline with networking use-cases, but it seemed
> > limited if other subsystem end-up using this).
> > 
> > Does is 'page_pool_put_page_len' sound better?
> > 
> > But I want also want hide the bool 'allow_direct' in the API name.
> > (As it makes it easier to identify users that uses this from softirq)
> > 
> > Going for 'page_pool_put_page_len_napi' starts to be come rather long.  
> 
> What about removing the second 'page'? Something like:
> - page_pool_put_len_napi()

Well, we (unfortunately) already have page_pool_put(), which is used
for refcnt on the page_pool object itself.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 11:33     ` Ilias Apalodimas
@ 2019-11-19 15:11       ` Jesper Dangaard Brouer
  2019-11-19 15:23         ` Ilias Apalodimas
  0 siblings, 1 reply; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2019-11-19 15:11 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Lorenzo Bianconi, netdev, davem, lorenzo.bianconi, mcroce,
	jonathan.lemon, brouer

On Tue, 19 Nov 2019 13:33:36 +0200
Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:

> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index dfc2501c35d9..4f9aed7bce5a 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> > >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> > >  		return -EINVAL;
> > >  
> > > +	/* In order to request DMA-sync-for-device the page needs to
> > > +	 * be mapped
> > > +	 */
> > > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > > +		return -EINVAL;
> > > +  
> > 
> > I like that you have moved this check to setup time.
> > 
> > There are two other parameters the DMA_SYNC_DEV depend on:
> > 
> >  	struct page_pool_params pp_params = {
> >  		.order = 0,
> > -		.flags = PP_FLAG_DMA_MAP,
> > +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> >  		.pool_size = size,
> >  		.nid = cpu_to_node(0),
> >  		.dev = pp->dev->dev.parent,
> >  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> > +		.offset = pp->rx_offset_correction,
> > +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
> >  	};
> > 
> > Can you add a check, that .max_len must not be zero.  The reason is
> > that I can easily see people misconfiguring this.  And the effect is
> > that the DMA-sync-for-device is essentially disabled, without user
> > realizing this. The not-realizing part is really bad, especially
> > because bugs that can occur from this are very rare and hard to catch.  
> 
> +1 we sync based on the min() value of those 
> 
> > 
> > I'm up for discussing if there should be a similar check for .offset.
> > IMHO we should also check .offset is configured, and then be open to
> > remove this check once a driver user want to use offset=0.  Does the
> > mvneta driver already have a use-case for this (in non-XDP mode)?  
> 
> Not sure about this, since it does not break anything apart from some
> performance hit

I don't follow the 'performance hit' comment.  This is checked at setup
time (page_pool_init), thus it doesn't affect runtime.

This is a generic optimization principle that I use a lot: move code
checks out of the fast-path and instead do more at setup/load-time, or
even at shutdown-time (like we do for page_pool, e.g. checking refcnt
invariance).  This principle is also heavily used by BPF, which adjusts
BPF instructions at load-time.  It is core to getting the performance
we need for high-speed networking.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 12:14     ` Lorenzo Bianconi
@ 2019-11-19 15:13       ` Jesper Dangaard Brouer
  2019-11-19 15:25         ` Lorenzo Bianconi
  0 siblings, 1 reply; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2019-11-19 15:13 UTC (permalink / raw)
  To: Lorenzo Bianconi, mcroce
  Cc: Lorenzo Bianconi, netdev, davem, ilias.apalodimas,
	jonathan.lemon, brouer

On Tue, 19 Nov 2019 14:14:30 +0200
Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:

> > On Mon, 18 Nov 2019 15:33:45 +0200
> > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >   
> > > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > > index 1121faa99c12..6f684c3a3434 100644
> > > --- a/include/net/page_pool.h
> > > +++ b/include/net/page_pool.h
> > > @@ -34,8 +34,15 @@
> > >  #include <linux/ptr_ring.h>
> > >  #include <linux/dma-direction.h>
> > >  
> > > -#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> > > -#define PP_FLAG_ALL	PP_FLAG_DMA_MAP
> > > +#define PP_FLAG_DMA_MAP		1 /* Should page_pool do the DMA map/unmap */
> > > +#define PP_FLAG_DMA_SYNC_DEV	2 /* if set all pages that the driver gets
> > > +				   * from page_pool will be
> > > +				   * DMA-synced-for-device according to the
> > > +				   * length provided by the device driver.
> > > +				   * Please note DMA-sync-for-CPU is still
> > > +				   * device driver responsibility
> > > +				   */
> > > +#define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
> > >    
> > [...]
> > 
> > Can you please change this to use the BIT(X) api.
> > 
> > #include <linux/bits.h>
> > 
> > #define PP_FLAG_DMA_MAP		BIT(0)
> > #define PP_FLAG_DMA_SYNC_DEV	BIT(1)  
> 
> Hi Jesper,
> 
> sure, will do in v5
> 
> > 
> > 
> >   
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index dfc2501c35d9..4f9aed7bce5a 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> > >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> > >  		return -EINVAL;
> > >  
> > > +	/* In order to request DMA-sync-for-device the page needs to
> > > +	 * be mapped
> > > +	 */
> > > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > > +		return -EINVAL;
> > > +  
> > 
> > I like that you have moved this check to setup time.
> > 
> > There are two other parameters the DMA_SYNC_DEV depend on:
> > 
> >  	struct page_pool_params pp_params = {
> >  		.order = 0,
> > -		.flags = PP_FLAG_DMA_MAP,
> > +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> >  		.pool_size = size,
> >  		.nid = cpu_to_node(0),
> >  		.dev = pp->dev->dev.parent,
> >  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> > +		.offset = pp->rx_offset_correction,
> > +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
> >  	};
> > 
> > Can you add a check, that .max_len must not be zero.  The reason is
> > that I can easily see people misconfiguring this.  And the effect is
> > that the DMA-sync-for-device is essentially disabled, without user
> > realizing this. The not-realizing part is really bad, especially
> > because bugs that can occur from this are very rare and hard to catch.  
> 
> I guess we need to check it just if we provide PP_FLAG_DMA_SYNC_DEV.
> Something like:
> 
> 	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
> 		if (!(pool->p.flags & PP_FLAG_DMA_MAP))
> 			return -EINVAL;
> 
> 		if (!pool->p.max_len)
> 			return -EINVAL;
> 	}

Yes, exactly.

> > 
> > I'm up for discussing if there should be a similar check for .offset.
> > IMHO we should also check .offset is configured, and then be open to
> > remove this check once a driver user want to use offset=0.  Does the
> > mvneta driver already have a use-case for this (in non-XDP mode)?  
> 
> With 'non-XDP mode' do you mean not loading a BPF program? If so yes, it used
> in __page_pool_alloc_pages_slow getting pages from page allocator.
> What would be a right min value for it? Just 0 or
> XDP_PACKET_HEADROOM/NET_SKB_PAD? I guess here it matters if a BPF program is
> loaded or not.

I think you are saying that we need to allow .offset==0, because it is
used by mvneta.  Did I understand that correctly?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 15:11       ` Jesper Dangaard Brouer
@ 2019-11-19 15:23         ` Ilias Apalodimas
  0 siblings, 0 replies; 19+ messages in thread
From: Ilias Apalodimas @ 2019-11-19 15:23 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Lorenzo Bianconi, netdev, davem, lorenzo.bianconi, mcroce,
	jonathan.lemon

On Tue, Nov 19, 2019 at 04:11:09PM +0100, Jesper Dangaard Brouer wrote:
> On Tue, 19 Nov 2019 13:33:36 +0200
> Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:
> 
> > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > index dfc2501c35d9..4f9aed7bce5a 100644
> > > > --- a/net/core/page_pool.c
> > > > +++ b/net/core/page_pool.c
> > > > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> > > >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> > > >  		return -EINVAL;
> > > >  
> > > > +	/* In order to request DMA-sync-for-device the page needs to
> > > > +	 * be mapped
> > > > +	 */
> > > > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > > > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > > > +		return -EINVAL;
> > > > +  
> > > 
> > > I like that you have moved this check to setup time.
> > > 
> > > There are two other parameters the DMA_SYNC_DEV depend on:
> > > 
> > >  	struct page_pool_params pp_params = {
> > >  		.order = 0,
> > > -		.flags = PP_FLAG_DMA_MAP,
> > > +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > >  		.pool_size = size,
> > >  		.nid = cpu_to_node(0),
> > >  		.dev = pp->dev->dev.parent,
> > >  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> > > +		.offset = pp->rx_offset_correction,
> > > +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
> > >  	};
> > > 
> > > Can you add a check, that .max_len must not be zero.  The reason is
> > > that I can easily see people misconfiguring this.  And the effect is
> > > that the DMA-sync-for-device is essentially disabled, without user
> > > realizing this. The not-realizing part is really bad, especially
> > > because bugs that can occur from this are very rare and hard to catch.  
> > 
> > +1 we sync based on the min() value of those 
> > 
> > > 
> > > I'm up for discussing if there should be a similar check for .offset.
> > > IMHO we should also check .offset is configured, and then be open to
> > > remove this check once a driver user want to use offset=0.  Does the
> > > mvneta driver already have a use-case for this (in non-XDP mode)?  
> > 
> > Not sure about this, since it does not break anything apart from some
> > performance hit
> 
> I don't follow the 'performance hit' comment.  This is checked at setup
> time (page_pool_init), thus it doesn't affect runtime.

If the offset is 0, you'll end up syncing a few unneeded bytes (whatever
headroom the buffer has, which doesn't need syncing).

> 
> This is a generic optimization principle that I use a lot. Moving code
> checks out of fast-path, and instead do more at setup/load-time, or
> even at shutdown-time (like we do for page_pool e.g. check refcnt
> invariance).  This principle is also heavily used by BPF, that adjust
> BPF-instructions at load-time.  It is core to getting the performance
> we need for high-speed networking.

The offset will affect the fast-path code.

What I am worried about is that XDP and SKB pools will have different needs for
offsets. In the netsec driver I am dealing with this by reserving the same
headroom whether the packet is an SKB or an XDP buffer. If we check the offset
we are practically forcing people to do something similar.

Thanks
/Ilias
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 

* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 15:13       ` Jesper Dangaard Brouer
@ 2019-11-19 15:25         ` Lorenzo Bianconi
  2019-11-19 21:17           ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-19 15:25 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: mcroce, Lorenzo Bianconi, netdev, davem, ilias.apalodimas,
	jonathan.lemon

> On Tue, 19 Nov 2019 14:14:30 +0200
> Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:
> 
> > > On Mon, 18 Nov 2019 15:33:45 +0200
> > > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > >   
> > > > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > > > index 1121faa99c12..6f684c3a3434 100644
> > > > --- a/include/net/page_pool.h
> > > > +++ b/include/net/page_pool.h
> > > > @@ -34,8 +34,15 @@
> > > >  #include <linux/ptr_ring.h>
> > > >  #include <linux/dma-direction.h>
> > > >  
> > > > -#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> > > > -#define PP_FLAG_ALL	PP_FLAG_DMA_MAP
> > > > +#define PP_FLAG_DMA_MAP		1 /* Should page_pool do the DMA map/unmap */
> > > > +#define PP_FLAG_DMA_SYNC_DEV	2 /* if set all pages that the driver gets
> > > > +				   * from page_pool will be
> > > > +				   * DMA-synced-for-device according to the
> > > > +				   * length provided by the device driver.
> > > > +				   * Please note DMA-sync-for-CPU is still
> > > > +				   * device driver responsibility
> > > > +				   */
> > > > +#define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
> > > >    
> > > [...]
> > > 
> > > Can you please change this to use the BIT(X) api.
> > > 
> > > #include <linux/bits.h>
> > > 
> > > #define PP_FLAG_DMA_MAP		BIT(0)
> > > #define PP_FLAG_DMA_SYNC_DEV	BIT(1)  
> > 
> > Hi Jesper,
> > 
> > sure, will do in v5
> > 
> > > 
> > > 
> > >   
> > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > index dfc2501c35d9..4f9aed7bce5a 100644
> > > > --- a/net/core/page_pool.c
> > > > +++ b/net/core/page_pool.c
> > > > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> > > >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> > > >  		return -EINVAL;
> > > >  
> > > > +	/* In order to request DMA-sync-for-device the page needs to
> > > > +	 * be mapped
> > > > +	 */
> > > > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > > > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > > > +		return -EINVAL;
> > > > +  
> > > 
> > > I like that you have moved this check to setup time.
> > > 
> > > There are two other parameters the DMA_SYNC_DEV depend on:
> > > 
> > >  	struct page_pool_params pp_params = {
> > >  		.order = 0,
> > > -		.flags = PP_FLAG_DMA_MAP,
> > > +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > >  		.pool_size = size,
> > >  		.nid = cpu_to_node(0),
> > >  		.dev = pp->dev->dev.parent,
> > >  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> > > +		.offset = pp->rx_offset_correction,
> > > +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
> > >  	};
> > > 
> > > Can you add a check, that .max_len must not be zero.  The reason is
> > > that I can easily see people misconfiguring this.  And the effect is
> > > that the DMA-sync-for-device is essentially disabled, without user
> > > realizing this. The not-realizing part is really bad, especially
> > > because bugs that can occur from this are very rare and hard to catch.  
> > 
> > I guess we need to check it just if we provide PP_FLAG_DMA_SYNC_DEV.
> > Something like:
> > 
> > 	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
> > 		if (!(pool->p.flags & PP_FLAG_DMA_MAP))
> > 			return -EINVAL;
> > 
> > 		if (!pool->p.max_len)
> > 			return -EINVAL;
> >	}
> 
> Yes, exactly.
> 

ack, I will add it to v5

> > > 
> > > I'm up for discussing if there should be a similar check for .offset.
> > > IMHO we should also check .offset is configured, and then be open to
> > > remove this check once a driver user want to use offset=0.  Does the
> > > mvneta driver already have a use-case for this (in non-XDP mode)?  
> > 
> > With 'non-XDP mode' do you mean not loading a BPF program? If so yes, it used
> > in __page_pool_alloc_pages_slow getting pages from page allocator.
> > What would be a right min value for it? Just 0 or
> > XDP_PACKET_HEADROOM/NET_SKB_PAD? I guess here it matters if a BPF program is
> > loaded or not.
> 
> I think you are saying, that we need to allow .offset==0, because it is
> used by mvneta.  Did I understand that correctly?

I was just wondering what the right value for the minimum offset is, but
rethinking about it, yes, there is a condition where mvneta uses offset set
to 0 (it is the regression reported by Andrew, when mvneta is running on a
HW BM device but the BM code is not compiled in). Do you think we can skip
this check for the moment, until we fix XDP on that particular board?

Regards,
Lorenzo

> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 


* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-19 14:51       ` Jesper Dangaard Brouer
@ 2019-11-19 15:38         ` Lorenzo Bianconi
  2019-11-19 22:23           ` Jonathan Lemon
  0 siblings, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-19 15:38 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Lorenzo Bianconi, netdev, davem, ilias.apalodimas, mcroce,
	jonathan.lemon

[...]
> > > > -		page_pool_recycle_direct(rxq->page_pool,
> > > > -					 virt_to_head_page(xdp->data));
> > > > +		__page_pool_put_page(rxq->page_pool,
> > > > +				     virt_to_head_page(xdp->data),
> > > > +				     xdp->data_end - xdp->data_hard_start,
> > > > +				     true);  
> > > 
> > > This does beg for the question: Should we create an API wrapper for
> > > this in the header file?
> > > 
> > > But what to name it?
> > > 
> > > I know Jonathan doesn't like the "direct" part of the  previous function
> > > name page_pool_recycle_direct.  (I do considered calling this 'napi'
> > > instead, as it would be inline with networking use-cases, but it seemed
> > > limited if other subsystem end-up using this).
> > > 
> > > Does is 'page_pool_put_page_len' sound better?
> > > 
> > > But I want also want hide the bool 'allow_direct' in the API name.
> > > (As it makes it easier to identify users that uses this from softirq)
> > > 
> > > Going for 'page_pool_put_page_len_napi' starts to be come rather long.  
> > 
> > What about removing the second 'page'? Something like:
> > - page_pool_put_len_napi()
> 
> Well, we (unfortunately) already have page_pool_put(), which is used
> for refcnt on the page_pool object itself.

__page_pool_put_page(pp, data, len, true) is a more generic version of
page_pool_recycle_direct where we can also specify the length. So what about:

- page_pool_recycle_len_direct
- page_pool_recycle_len_napi

Regards,
Lorenzo

> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 


* Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
  2019-11-19 15:25         ` Lorenzo Bianconi
@ 2019-11-19 21:17           ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 19+ messages in thread
From: Jesper Dangaard Brouer @ 2019-11-19 21:17 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: mcroce, Lorenzo Bianconi, netdev, davem, ilias.apalodimas,
	jonathan.lemon, brouer

On Tue, 19 Nov 2019 17:25:43 +0200
Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:

> > On Tue, 19 Nov 2019 14:14:30 +0200
> > Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:
> >   
> > > > On Mon, 18 Nov 2019 15:33:45 +0200
> > > > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > > >     
> > > > > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > > > > index 1121faa99c12..6f684c3a3434 100644
> > > > > --- a/include/net/page_pool.h
> > > > > +++ b/include/net/page_pool.h
> > > > > @@ -34,8 +34,15 @@
> > > > >  #include <linux/ptr_ring.h>
> > > > >  #include <linux/dma-direction.h>
> > > > >  
> > > > > -#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> > > > > -#define PP_FLAG_ALL	PP_FLAG_DMA_MAP
> > > > > +#define PP_FLAG_DMA_MAP		1 /* Should page_pool do the DMA map/unmap */
> > > > > +#define PP_FLAG_DMA_SYNC_DEV	2 /* if set all pages that the driver gets
> > > > > +				   * from page_pool will be
> > > > > +				   * DMA-synced-for-device according to the
> > > > > +				   * length provided by the device driver.
> > > > > +				   * Please note DMA-sync-for-CPU is still
> > > > > +				   * device driver responsibility
> > > > > +				   */
> > > > > +#define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
> > > > >      
> > > > [...]
> > > > 
> > > > Can you please change this to use the BIT(X) api.
> > > > 
> > > > #include <linux/bits.h>
> > > > 
> > > > #define PP_FLAG_DMA_MAP		BIT(0)
> > > > #define PP_FLAG_DMA_SYNC_DEV	BIT(1)    
> > > 
> > > Hi Jesper,
> > > 
> > > sure, will do in v5
> > >   
> > > > 
> > > > 
> > > >     
> > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > index dfc2501c35d9..4f9aed7bce5a 100644
> > > > > --- a/net/core/page_pool.c
> > > > > +++ b/net/core/page_pool.c
> > > > > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> > > > >  	    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> > > > >  		return -EINVAL;
> > > > >  
> > > > > +	/* In order to request DMA-sync-for-device the page needs to
> > > > > +	 * be mapped
> > > > > +	 */
> > > > > +	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > > > > +	    !(pool->p.flags & PP_FLAG_DMA_MAP))
> > > > > +		return -EINVAL;
> > > > > +    
> > > > 
> > > > I like that you have moved this check to setup time.
> > > > 
> > > > There are two other parameters that DMA_SYNC_DEV depends on:
> > > > 
> > > >  	struct page_pool_params pp_params = {
> > > >  		.order = 0,
> > > > -		.flags = PP_FLAG_DMA_MAP,
> > > > +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > > >  		.pool_size = size,
> > > >  		.nid = cpu_to_node(0),
> > > >  		.dev = pp->dev->dev.parent,
> > > >  		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> > > > +		.offset = pp->rx_offset_correction,
> > > > +		.max_len = MVNETA_MAX_RX_BUF_SIZE,
> > > >  	};
> > > > 
> > > > Can you add a check that .max_len must not be zero?  The reason is
> > > > that I can easily see people misconfiguring this.  And the effect is
> > > > that the DMA-sync-for-device is essentially disabled, without the user
> > > > realizing this. The not-realizing part is really bad, especially
> > > > because bugs that can occur from this are very rare and hard to catch.
> > > 
> > > I guess we need to check it only if PP_FLAG_DMA_SYNC_DEV is provided.
> > > Something like:
> > > 
> > > 	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
> > > 		if (!(pool->p.flags & PP_FLAG_DMA_MAP))
> > > 			return -EINVAL;
> > > 
> > > 		if (!pool->p.max_len)
> > > 			return -EINVAL;
> > >	}  
> > 
> > Yes, exactly.
> >   
> 
> ack, I will add it to v5
> 
> > > > 
> > > > I'm up for discussing if there should be a similar check for .offset.
> > > > IMHO we should also check .offset is configured, and then be open to
> > > > remove this check once a driver user wants to use offset=0.  Does the
> > > > mvneta driver already have a use-case for this (in non-XDP mode)?    
> > > 
> > > With 'non-XDP mode' do you mean not loading a BPF program? If so, yes, it is
> > > used in __page_pool_alloc_pages_slow() when getting pages from the page
> > > allocator. What would be the right minimum value for it? Just 0 or
> > > XDP_PACKET_HEADROOM/NET_SKB_PAD? I guess here it matters whether a BPF
> > > program is loaded or not.
> > 
> > I think you are saying that we need to allow .offset == 0, because it is
> > used by mvneta.  Did I understand that correctly?
> 
> I was just wondering what the right value for the minimum offset is, but
> rethinking about it, yes, there is a condition where mvneta uses an
> offset of 0 (it is the regression reported by Andrew, when mvneta is
> running on a HW BM device but the BM code is not compiled in). Do you
> think we can skip this check for the moment, until we fix XDP on that
> particular board?

Yes. I guess we just accept that .offset can be zero.  It is an artificial
limitation.

The check is not important if the API is used correctly. It comes from my
API design philosophy for page_pool, which is "Easy to use, and hard to
misuse".  This is a case of catching "misuse" and signaling that this
was a wrong config.  The check for pool->p.max_len should be enough
for driver developers to notice that they also need to set the offset.
Maybe a comment about "offset" close to the pool->p.max_len check will be
enough.  Given you return the "catch-all" -EINVAL, we/you force driver
developers to read the code for page_pool_init(), which IMHO is
sufficiently clear.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 19+ messages in thread
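
A minimal sketch of the setup-time validation agreed on in the exchange
above, as it could look inside page_pool_init(); the comment wording and
exact placement are assumptions, not necessarily the code posted in v5:

	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
		/* DMA-sync-for-device only works on mapped pages */
		if (!(pool->p.flags & PP_FLAG_DMA_MAP))
			return -EINVAL;

		/* A zero max_len would silently disable the sync for
		 * the whole pool; note that pool->p.offset is
		 * deliberately allowed to be zero.
		 */
		if (!pool->p.max_len)
			return -EINVAL;
	}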

* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-19 15:38         ` Lorenzo Bianconi
@ 2019-11-19 22:23           ` Jonathan Lemon
  2019-11-20  9:21             ` Lorenzo Bianconi
  0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Lemon @ 2019-11-19 22:23 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Jesper Dangaard Brouer, Lorenzo Bianconi, netdev, davem,
	ilias.apalodimas, mcroce

On 19 Nov 2019, at 7:38, Lorenzo Bianconi wrote:

> [...]
>>>>> -		page_pool_recycle_direct(rxq->page_pool,
>>>>> -					 virt_to_head_page(xdp->data));
>>>>> +		__page_pool_put_page(rxq->page_pool,
>>>>> +				     virt_to_head_page(xdp->data),
>>>>> +				     xdp->data_end - xdp->data_hard_start,
>>>>> +				     true);
>>>>
>>>> This does beg the question: Should we create an API wrapper for
>>>> this in the header file?
>>>>
>>>> But what to name it?
>>>>
>>>> I know Jonathan doesn't like the "direct" part of the previous function
>>>> name page_pool_recycle_direct.  (I did consider calling this 'napi'
>>>> instead, as it would be in line with networking use-cases, but it seemed
>>>> limiting if other subsystems end up using this).
>>>>
>>>> Does 'page_pool_put_page_len' sound better?
>>>>
>>>> But I also want to hide the bool 'allow_direct' in the API name.
>>>> (As it makes it easier to identify users that use this from softirq)
>>>>
>>>> Going for 'page_pool_put_page_len_napi' starts to become rather long.
>>>
>>> What about removing the second 'page'? Something like:
>>> - page_pool_put_len_napi()
>>
>> Well, we (unfortunately) already have page_pool_put(), which is used
>> for refcnt on the page_pool object itself.
>
>>> __page_pool_put_page(pp, data, len, true) is a more generic version of
>>> page_pool_recycle_direct where we can also specify the length. So what
>>> about:
>
> - page_pool_recycle_len_direct
> - page_pool_recycle_len_napi

I'd suggest:

/* elevated refcounts, page may be seen by the networking stack */
page_pool_drain(pool, page, count)              /* non-napi, len = -1 */
page_pool_drain_direct(pool, page, count)       /* len = -1 */

page_pool_check_put_page(page)                  /* may not belong to pool */

/* recycle variants drain/expect refcount == 1 */
page_pool_recycle(pool, page, len)
page_pool_recycle_direct(pool, page, len)

page_pool_put_page(pool, page, len, mode)       /* generic, for __xdp_return */


I'd rather add len as a parameter than add more wrapper variants.
-- 
Jonathan


>
> Regards,
> Lorenzo
>
>>
>> -- 
>> Best regards,
>>   Jesper Dangaard Brouer
>>   MSc.CS, Principal Kernel Engineer at Red Hat
>>   LinkedIn: http://www.linkedin.com/in/brouer
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread
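
A minimal sketch of how the len-taking recycle wrappers proposed above could
be layered on top of __page_pool_put_page(); the signatures follow Jonathan's
suggestion and are not part of the posted series (note the second helper
would change the signature of the existing page_pool_recycle_direct()):

static inline void page_pool_recycle(struct page_pool *pool,
				     struct page *page,
				     unsigned int dma_sync_size)
{
	/* Return the page to the pool from any context; at most
	 * dma_sync_size bytes are DMA-synced for the device (when the
	 * pool was created with PP_FLAG_DMA_SYNC_DEV).
	 */
	__page_pool_put_page(pool, page, dma_sync_size, false);
}

static inline void page_pool_recycle_direct(struct page_pool *pool,
					    struct page *page,
					    unsigned int dma_sync_size)
{
	/* Same, but callable only from softirq/NAPI context, so the page
	 * can be returned to the pool's lockless cache.
	 */
	__page_pool_put_page(pool, page, dma_sync_size, true);
}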

* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-19 22:23           ` Jonathan Lemon
@ 2019-11-20  9:21             ` Lorenzo Bianconi
  2019-11-20 16:29               ` Jonathan Lemon
  0 siblings, 1 reply; 19+ messages in thread
From: Lorenzo Bianconi @ 2019-11-20  9:21 UTC (permalink / raw)
  To: Jonathan Lemon
  Cc: Jesper Dangaard Brouer, Lorenzo Bianconi, netdev, davem,
	ilias.apalodimas, mcroce

> On 19 Nov 2019, at 7:38, Lorenzo Bianconi wrote:
> 
> > [...]
> > > > > > -		page_pool_recycle_direct(rxq->page_pool,
> > > > > > -					 virt_to_head_page(xdp->data));
> > > > > > +		__page_pool_put_page(rxq->page_pool,
> > > > > > +				     virt_to_head_page(xdp->data),
> > > > > > +				     xdp->data_end - xdp->data_hard_start,
> > > > > > +				     true);
> > > > > 
> > > > > This does beg the question: Should we create an API wrapper for
> > > > > this in the header file?
> > > > > 
> > > > > But what to name it?
> > > > > 
> > > > > I know Jonathan doesn't like the "direct" part of the previous
> > > > > function name page_pool_recycle_direct.  (I did consider calling
> > > > > this 'napi' instead, as it would be in line with networking
> > > > > use-cases, but it seemed limiting if other subsystems end up
> > > > > using this).
> > > > > 
> > > > > Does 'page_pool_put_page_len' sound better?
> > > > > 
> > > > > But I also want to hide the bool 'allow_direct' in the API name.
> > > > > (As it makes it easier to identify users that use this from
> > > > > softirq)
> > > > > 
> > > > > Going for 'page_pool_put_page_len_napi' starts to become rather
> > > > > long.
> > > > 
> > > > What about removing the second 'page'? Something like:
> > > > - page_pool_put_len_napi()
> > > 
> > > Well, we (unfortunately) already have page_pool_put(), which is used
> > > for refcnt on the page_pool object itself.
> > 
> > __page_pool_put_page(pp, data, len, true) is a more generic version of
> > page_pool_recycle_direct where we can also specify the length. So what
> > about:
> > 
> > - page_pool_recycle_len_direct
> > - page_pool_recycle_len_napi
> 
> I'd suggest:
> 
> > /* elevated refcounts, page may be seen by the networking stack */
> > page_pool_drain(pool, page, count)              /* non-napi, len = -1 */
> page_pool_drain_direct(pool, page, count)       /* len = -1 */
> 
> page_pool_check_put_page(page)                  /* may not belong to pool */
> 
> /* recycle variants drain/expect refcount == 1 */
> page_pool_recycle(pool, page, len)
> page_pool_recycle_direct(pool, page, len)
> 
> > page_pool_put_page(pool, page, len, mode)       /* generic, for __xdp_return */

I am not against the suggestion, but personally I would prefer to make explicit
in the routine name where/how it is actually used. Moreover,
page_pool_recycle_direct and page_pool_put_page are currently used by multiple
drivers, so renaming them seems to me out of the scope of this series. I think
we can address it in a follow-up series and use __page_pool_put_page for the
moment (it is actually only used by mvneta). Agree?

Regards,
Lorenzo

> 
> 
> I'd rather add len as a parameter, than add more wrapper variants.
> -- 
> Jonathan
> 
> 
> > 
> > Regards,
> > Lorenzo
> > 
> > > 
> > > -- 
> > > Best regards,
> > >   Jesper Dangaard Brouer
> > >   MSc.CS, Principal Kernel Engineer at Red Hat
> > >   LinkedIn: http://www.linkedin.com/in/brouer
> > > 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill
  2019-11-20  9:21             ` Lorenzo Bianconi
@ 2019-11-20 16:29               ` Jonathan Lemon
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Lemon @ 2019-11-20 16:29 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Jesper Dangaard Brouer, Lorenzo Bianconi, netdev, davem,
	ilias.apalodimas, mcroce

On 20 Nov 2019, at 1:21, Lorenzo Bianconi wrote:

>> On 19 Nov 2019, at 7:38, Lorenzo Bianconi wrote:
>>
>>> [...]
>>>>>>> -		page_pool_recycle_direct(rxq->page_pool,
>>>>>>> -					 virt_to_head_page(xdp->data));
>>>>>>> +		__page_pool_put_page(rxq->page_pool,
>>>>>>> +				     virt_to_head_page(xdp->data),
>>>>>>> +				     xdp->data_end - xdp->data_hard_start,
>>>>>>> +				     true);
>>>>>>
>>>>>> This does beg the question: Should we create an API wrapper for
>>>>>> this in the header file?
>>>>>>
>>>>>> But what to name it?
>>>>>>
>>>>>> I know Jonathan doesn't like the "direct" part of the previous
>>>>>> function name page_pool_recycle_direct.  (I did consider calling
>>>>>> this 'napi' instead, as it would be in line with networking
>>>>>> use-cases, but it seemed limiting if other subsystems end up
>>>>>> using this).
>>>>>>
>>>>>> Does 'page_pool_put_page_len' sound better?
>>>>>>
>>>>>> But I also want to hide the bool 'allow_direct' in the API name.
>>>>>> (As it makes it easier to identify users that use this from
>>>>>> softirq)
>>>>>>
>>>>>> Going for 'page_pool_put_page_len_napi' starts to become rather long.
>>>>>
>>>>> What about removing the second 'page'? Something like:
>>>>> - page_pool_put_len_napi()
>>>>
>>>> Well, we (unfortunately) already have page_pool_put(), which is used
>>>> for refcnt on the page_pool object itself.
>>>
>>> __page_pool_put_page(pp, data, len, true) is a more generic version of
>>> page_pool_recycle_direct where we can also specify the length. So what
>>> about:
>>>
>>> - page_pool_recycle_len_direct
>>> - page_pool_recycle_len_napi
>>
>> I'd suggest:
>>
>> /* elevated refcounts, page may be seen by the networking stack */
>> page_pool_drain(pool, page, count)              /* non-napi, len = -1 */
>> page_pool_drain_direct(pool, page, count)       /* len = -1 */
>>
>> page_pool_check_put_page(page)                  /* may not belong to pool */
>>
>> /* recycle variants drain/expect refcount == 1 */
>> page_pool_recycle(pool, page, len)
>> page_pool_recycle_direct(pool, page, len)
>>
>> page_pool_put_page(pool, page, len, mode)       /* generic, for __xdp_return */
>
> I am not against the suggestion, but personally I would prefer to make
> explicit in the routine name where/how it is actually used. Moreover,
> page_pool_recycle_direct and page_pool_put_page are currently used by
> multiple drivers, so renaming them seems to me out of the scope of this
> series. I think we can address it in a follow-up series and use
> __page_pool_put_page for the moment (it is actually only used by mvneta).
> Agree?

Fine with me - I have a naming cleanup patch pending, I can roll it into
this afterwards.
-- 
Jonathan

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-11-20 16:29 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-18 13:33 [PATCH v4 net-next 0/3] add DMA-sync-for-device capability to page_pool API Lorenzo Bianconi
2019-11-18 13:33 ` [PATCH v4 net-next 1/3] net: mvneta: rely on page_pool_recycle_direct in mvneta_run_xdp Lorenzo Bianconi
2019-11-18 13:33 ` [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device Lorenzo Bianconi
2019-11-19 11:23   ` Jesper Dangaard Brouer
2019-11-19 11:33     ` Ilias Apalodimas
2019-11-19 15:11       ` Jesper Dangaard Brouer
2019-11-19 15:23         ` Ilias Apalodimas
2019-11-19 12:14     ` Lorenzo Bianconi
2019-11-19 15:13       ` Jesper Dangaard Brouer
2019-11-19 15:25         ` Lorenzo Bianconi
2019-11-19 21:17           ` Jesper Dangaard Brouer
2019-11-18 13:33 ` [PATCH v4 net-next 3/3] net: mvneta: get rid of huge dma sync in mvneta_rx_refill Lorenzo Bianconi
2019-11-19 11:38   ` Jesper Dangaard Brouer
2019-11-19 12:19     ` Lorenzo Bianconi
2019-11-19 14:51       ` Jesper Dangaard Brouer
2019-11-19 15:38         ` Lorenzo Bianconi
2019-11-19 22:23           ` Jonathan Lemon
2019-11-20  9:21             ` Lorenzo Bianconi
2019-11-20 16:29               ` Jonathan Lemon
