* [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers
@ 2016-03-04 13:01 Jesper Dangaard Brouer
  2016-03-04 13:01 ` [net-next PATCH 1/7] mlx5: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
                   ` (7 more replies)
  0 siblings, 8 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:01 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

This patchset uses the bulk ALLOC side of the kmem_cache bulk APIs for
SKB allocations.  The bulk free side got enabled in merge commit
3134b9f019f2 ("net: mitigating kmem_cache free slowpath").

The first two patches are a follow-up on the free side, enabling
bulk free in the mlx4 and mlx5 drivers (dev_kfree_skb -> napi_consume_skb).

The rest of the patchset focuses on the bulk alloc side.  We start with a
conservative bulk alloc of 8 SKBs, which all drivers using the
napi_alloc_skb() call will benefit from.  Then the API is extended to
allow drivers to hint at the number of SKBs needed (only some drivers know
this size), and the mlx5 driver is the first user of hinting.

A small hint for people wanting to tune their systems: the default number
of SKB objects per slab page is 32.  This limits the bulking sizes for
the SLUB allocator.  SLUB can be tuned via the kernel cmdline boot
option slub_min_objects=128.  Increasing this gives a significant
performance boost, but at the cost of more memory "waste" inside the
kmem_cache/slab allocator.
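As a sketch of the tuning above (the bootloader config path and the sysfs
slab name are assumptions; adjust for your distribution):

```shell
# Append the SLUB tuning option to the kernel command line, e.g. in
# /etc/default/grub (GRUB-style config assumed):
#   GRUB_CMDLINE_LINUX="... slub_min_objects=128"
# After regenerating the grub config and rebooting, check how many SKB
# head objects now fit per slab page:
cat /sys/kernel/slab/skbuff_head_cache/objs_per_slab
```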

Patchset based on net-next at commit 3ebeac1d0295

---

Jesper Dangaard Brouer (7):
      mlx5: use napi_consume_skb API to get bulk free operations
      mlx4: use napi_consume_skb API to get bulk free operations
      net: bulk alloc and reuse of SKBs in NAPI context
      mlx5: use napi_alloc_skb API to get SKB bulk allocations
      mlx4: use napi_alloc_skb API to get SKB bulk allocations
      net: introduce napi_alloc_skb_hint() for more use-cases
      mlx5: hint the NAPI alloc skb API about the expected bulk size


 drivers/net/ethernet/mellanox/mlx4/en_rx.c        |    7 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |   19 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    5 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |   11 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 +
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    4 +
 include/linux/skbuff.h                            |   19 ++++-
 net/core/skbuff.c                                 |   75 +++++++++++++--------
 8 files changed, 92 insertions(+), 52 deletions(-)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [net-next PATCH 1/7] mlx5: use napi_consume_skb API to get bulk free operations
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
@ 2016-03-04 13:01 ` Jesper Dangaard Brouer
  2016-03-04 13:01 ` [net-next PATCH 2/7] mlx4: " Jesper Dangaard Brouer
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:01 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Bulk freeing of SKBs happens transparently via the napi_consume_skb()
API call.  The NAPI budget parameter is needed by napi_consume_skb()
to detect whether it was called from netpoll.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 9c0e80e64b43..a0708782eb78 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -619,7 +619,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq);
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
 bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq);
 struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c34f4f3e9537..996b13a5042f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -336,7 +336,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
 	return mlx5e_sq_xmit(sq, skb);
 }
 
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 {
 	struct mlx5e_sq *sq;
 	u32 dma_fifo_cc;
@@ -412,7 +412,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 			npkts++;
 			nbytes += wi->num_bytes;
 			sqcc += wi->num_wqebbs;
-			dev_kfree_skb(skb);
+			napi_consume_skb(skb, napi_budget);
 		} while (!last_wqe);
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 4ac8d716dbdd..d244acee63a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -60,7 +60,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 	clear_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
 
 	for (i = 0; i < c->num_tc; i++)
-		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
+		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq, budget);
 
 	work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
 	busy |= work_done == budget;


* [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
  2016-03-04 13:01 ` [net-next PATCH 1/7] mlx5: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
@ 2016-03-04 13:01 ` Jesper Dangaard Brouer
  2016-03-08 19:24   ` David Miller
  2016-03-04 13:01 ` [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context Jesper Dangaard Brouer
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:01 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Bulk freeing of SKBs happens transparently via the napi_consume_skb()
API call.  The NAPI budget parameter is needed by napi_consume_skb()
to detect whether it was called from netpoll.

In the mlx4 driver, the mlx4_en_stop_port() call cleans up the entire
TX ring via mlx4_en_free_tx_buf().  To handle this situation, a
napi budget value of -1 is used to indicate that the call happens
outside NAPI context.  To reflect this, the variable is called napi_mode
in the function calls that need this distinction.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index e0946ab22010..b94ed84646b0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
 
 static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 				struct mlx4_en_tx_ring *ring,
-				int index, u8 owner, u64 timestamp)
+				int index, u8 owner, u64 timestamp,
+				int napi_mode)
 {
 	struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
 	struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
@@ -347,7 +348,11 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 			}
 		}
 	}
-	dev_consume_skb_any(skb);
+	if (unlikely(napi_mode < 0))
+		dev_consume_skb_any(skb); /* none-NAPI via mlx4_en_stop_port */
+	else
+		napi_consume_skb(skb, napi_mode);
+
 	return tx_info->nr_txbb;
 }
 
@@ -371,7 +376,9 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 	while (ring->cons != ring->prod) {
 		ring->last_nr_txbb = mlx4_en_free_tx_desc(priv, ring,
 						ring->cons & ring->size_mask,
-						!!(ring->cons & ring->size), 0);
+						!!(ring->cons & ring->size), 0,
+						-1 /* none-NAPI caller */
+			);
 		ring->cons += ring->last_nr_txbb;
 		cnt++;
 	}
@@ -385,7 +392,7 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 }
 
 static bool mlx4_en_process_tx_cq(struct net_device *dev,
-				 struct mlx4_en_cq *cq)
+				  struct mlx4_en_cq *cq, int napi_budget)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_cq *mcq = &cq->mcq;
@@ -451,7 +458,7 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 			last_nr_txbb = mlx4_en_free_tx_desc(
 					priv, ring, ring_index,
 					!!((ring_cons + txbbs_skipped) &
-					ring->size), timestamp);
+					ring->size), timestamp, napi_budget);
 
 			mlx4_en_stamp_wqe(priv, ring, stamp_index,
 					  !!((ring_cons + txbbs_stamp) &
@@ -511,7 +518,7 @@ int mlx4_en_poll_tx_cq(struct napi_struct *napi, int budget)
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	int clean_complete;
 
-	clean_complete = mlx4_en_process_tx_cq(dev, cq);
+	clean_complete = mlx4_en_process_tx_cq(dev, cq, budget);
 	if (!clean_complete)
 		return budget;
 


* [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
  2016-03-04 13:01 ` [net-next PATCH 1/7] mlx5: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
  2016-03-04 13:01 ` [net-next PATCH 2/7] mlx4: " Jesper Dangaard Brouer
@ 2016-03-04 13:01 ` Jesper Dangaard Brouer
  2016-03-13 14:06   ` Rana Shahout
  2016-03-04 13:01 ` [net-next PATCH 4/7] mlx5: use napi_alloc_skb API to get SKB bulk allocations Jesper Dangaard Brouer
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:01 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

This patch introduces bulk allocation of SKBs and allows reuse of SKBs
freed in the same softirq cycle.  SKBs are normally freed during TX
completion, but most high-speed drivers also clean up the TX ring during
the NAPI RX poll cycle.  Thus, when using napi_consume_skb/__kfree_skb_defer,
SKBs will be available in the napi_alloc_cache->skb_cache.

If no SKBs are available for reuse, then only bulk alloc 8 SKBs, to limit
the potential overshoot of unused SKBs that would need to be freed when
the NAPI cycle ends (flushed in net_rx_action via __kfree_skb_flush()).

Generalize ___build_skb() to allow passing it a preallocated SKB.

I've previously demonstrated a 1% speedup for IPv4 forwarding when this
was used with the ixgbe driver.  If SKB alloc and free happen on
different CPUs (as in the RPS use-case), the performance benefit is
expected to increase.

All drivers using the napi_alloc_skb() call will benefit from
this change automatically.

Joint work with Alexander Duyck.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 net/core/skbuff.c |   71 ++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 45 insertions(+), 26 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7af7ec635d90..96fb7933b614 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -281,32 +281,14 @@ nodata:
 }
 EXPORT_SYMBOL(__alloc_skb);
 
-/**
- * __build_skb - build a network buffer
- * @data: data buffer provided by caller
- * @frag_size: size of data, or 0 if head was kmalloced
- *
- * Allocate a new &sk_buff. Caller provides space holding head and
- * skb_shared_info. @data must have been allocated by kmalloc() only if
- * @frag_size is 0, otherwise data should come from the page allocator
- *  or vmalloc()
- * The return is the new skb buffer.
- * On a failure the return is %NULL, and @data is not freed.
- * Notes :
- *  Before IO, driver allocates only data buffer where NIC put incoming frame
- *  Driver should add room at head (NET_SKB_PAD) and
- *  MUST add room at tail (SKB_DATA_ALIGN(skb_shared_info))
- *  After IO, driver calls build_skb(), to allocate sk_buff and populate it
- *  before giving packet to stack.
- *  RX rings only contains data buffers, not full skbs.
- */
-struct sk_buff *__build_skb(void *data, unsigned int frag_size)
+/* Allows skb being (pre)allocated by caller */
+static inline
+struct sk_buff *___build_skb(void *data, unsigned int frag_size,
+			     struct sk_buff *skb)
 {
 	struct skb_shared_info *shinfo;
-	struct sk_buff *skb;
 	unsigned int size = frag_size ? : ksize(data);
 
-	skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC);
 	if (!skb)
 		return NULL;
 
@@ -331,6 +313,34 @@ struct sk_buff *__build_skb(void *data, unsigned int frag_size)
 	return skb;
 }
 
+/**
+ * __build_skb - build a network buffer
+ * @data: data buffer provided by caller
+ * @frag_size: size of data, or 0 if head was kmalloced
+ *
+ * Allocate a new &sk_buff. Caller provides space holding head and
+ * skb_shared_info. @data must have been allocated by kmalloc() only if
+ * @frag_size is 0, otherwise data should come from the page allocator
+ *  or vmalloc()
+ * The return is the new skb buffer.
+ * On a failure the return is %NULL, and @data is not freed.
+ * Notes :
+ *  Before IO, driver allocates only data buffer where NIC put incoming frame
+ *  Driver should add room at head (NET_SKB_PAD) and
+ *  MUST add room at tail (SKB_DATA_ALIGN(skb_shared_info))
+ *  After IO, driver calls build_skb(), to allocate sk_buff and populate it
+ *  before giving packet to stack.
+ *  RX rings only contains data buffers, not full skbs.
+ */
+struct sk_buff *__build_skb(void *data, unsigned int frag_size)
+{
+	struct sk_buff *skb;
+	unsigned int size = frag_size ? : ksize(data);
+
+	skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC);
+	return ___build_skb(data, size, skb);
+}
+
 /* build_skb() is wrapper over __build_skb(), that specifically
  * takes care of skb->head and skb->pfmemalloc
  * This means that if @frag_size is not zero, then @data must be backed
@@ -490,8 +500,8 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 
 	len += NET_SKB_PAD + NET_IP_ALIGN;
 
-	if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
-	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
+	if (unlikely((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
+		     (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA)))) {
 		skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
 		if (!skb)
 			goto skb_fail;
@@ -508,11 +518,20 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 	if (unlikely(!data))
 		return NULL;
 
-	skb = __build_skb(data, len);
-	if (unlikely(!skb)) {
+#define BULK_ALLOC_SIZE 8
+	if (!nc->skb_count) {
+		nc->skb_count = kmem_cache_alloc_bulk(skbuff_head_cache,
+						      gfp_mask, BULK_ALLOC_SIZE,
+						      nc->skb_cache);
+	}
+	if (likely(nc->skb_count)) {
+		skb = (struct sk_buff *)nc->skb_cache[--nc->skb_count];
+	} else {
+		/* alloc bulk failed */
 		skb_free_frag(data);
 		return NULL;
 	}
+	skb = ___build_skb(data, len, skb);
 
 	/* use OR instead of assignment to avoid clearing of bits in mask */
 	if (nc->page.pfmemalloc)


* [net-next PATCH 4/7] mlx5: use napi_alloc_skb API to get SKB bulk allocations
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
                   ` (2 preceding siblings ...)
  2016-03-04 13:01 ` [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context Jesper Dangaard Brouer
@ 2016-03-04 13:01 ` Jesper Dangaard Brouer
  2016-03-04 13:02 ` [net-next PATCH 5/7] mlx4: " Jesper Dangaard Brouer
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:01 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Activate the bulk alloc API simply by changing the mlx5 driver from
using netdev_alloc_skb() to using napi_alloc_skb().

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |    9 +++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index a0708782eb78..38ed89b7145d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -621,7 +621,7 @@ void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
 bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
-bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq);
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi);
 struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
 
 void mlx5e_update_stats(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 519a07f253f9..5ef9772ff708 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -42,12 +42,13 @@ static inline bool mlx5e_rx_hw_stamp(struct mlx5e_tstamp *tstamp)
 }
 
 static inline int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq,
-				     struct mlx5e_rx_wqe *wqe, u16 ix)
+				     struct mlx5e_rx_wqe *wqe, u16 ix,
+				     struct napi_struct *napi)
 {
 	struct sk_buff *skb;
 	dma_addr_t dma_addr;
 
-	skb = netdev_alloc_skb(rq->netdev, rq->wqe_sz);
+	skb = napi_alloc_skb(napi, rq->wqe_sz);
 	if (unlikely(!skb))
 		return -ENOMEM;
 
@@ -76,7 +77,7 @@ err_free_skb:
 	return -ENOMEM;
 }
 
-bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi)
 {
 	struct mlx5_wq_ll *wq = &rq->wq;
 
@@ -86,7 +87,7 @@ bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
 	while (!mlx5_wq_ll_is_full(wq)) {
 		struct mlx5e_rx_wqe *wqe = mlx5_wq_ll_get_wqe(wq, wq->head);
 
-		if (unlikely(mlx5e_alloc_rx_wqe(rq, wqe, wq->head)))
+		if (unlikely(mlx5e_alloc_rx_wqe(rq, wqe, wq->head, napi)))
 			break;
 
 		mlx5_wq_ll_push(wq, be16_to_cpu(wqe->next.next_wqe_index));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index d244acee63a5..8fd07c8087e3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -64,7 +64,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 
 	work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
 	busy |= work_done == budget;
-	busy |= mlx5e_post_rx_wqes(&c->rq);
+	busy |= mlx5e_post_rx_wqes(&c->rq, napi);
 
 	if (busy)
 		return budget;


* [net-next PATCH 5/7] mlx4: use napi_alloc_skb API to get SKB bulk allocations
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
                   ` (3 preceding siblings ...)
  2016-03-04 13:01 ` [net-next PATCH 4/7] mlx5: use napi_alloc_skb API to get SKB bulk allocations Jesper Dangaard Brouer
@ 2016-03-04 13:02 ` Jesper Dangaard Brouer
  2016-03-04 13:02 ` [net-next PATCH 6/7] net: introduce napi_alloc_skb_hint() for more use-cases Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:02 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Activate the bulk alloc API simply by changing the mlx4 driver from
using netdev_alloc_skb() to using napi_alloc_skb().

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 41440b2b20a3..d867060ea676 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -583,14 +583,15 @@ fail:
 static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
 				      struct mlx4_en_rx_desc *rx_desc,
 				      struct mlx4_en_rx_alloc *frags,
-				      unsigned int length)
+				      unsigned int length,
+				      struct napi_struct *napi)
 {
 	struct sk_buff *skb;
 	void *va;
 	int used_frags;
 	dma_addr_t dma;
 
-	skb = netdev_alloc_skb(priv->dev, SMALL_PACKET_SIZE + NET_IP_ALIGN);
+	skb = napi_alloc_skb(napi, SMALL_PACKET_SIZE + NET_IP_ALIGN);
 	if (!skb) {
 		en_dbg(RX_ERR, priv, "Failed allocating skb\n");
 		return NULL;
@@ -938,7 +939,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		}
 
 		/* GRO not possible, complete processing here */
-		skb = mlx4_en_rx_skb(priv, rx_desc, frags, length);
+		skb = mlx4_en_rx_skb(priv, rx_desc, frags, length, &cq->napi);
 		if (!skb) {
 			priv->stats.rx_dropped++;
 			goto next;


* [net-next PATCH 6/7] net: introduce napi_alloc_skb_hint() for more use-cases
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
                   ` (4 preceding siblings ...)
  2016-03-04 13:02 ` [net-next PATCH 5/7] mlx4: " Jesper Dangaard Brouer
@ 2016-03-04 13:02 ` Jesper Dangaard Brouer
  2016-03-04 13:02 ` [net-next PATCH 7/7] mlx5: hint the NAPI alloc skb API about the expected bulk size Jesper Dangaard Brouer
  2016-03-04 16:36 ` [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Alexei Starovoitov
  7 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:02 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

The default bulk alloc size was arbitrarily chosen (to be 8) and might
not suit all use-cases.  This introduces a function, napi_alloc_skb_hint(),
that allows the caller to specify a bulk size hint for the number of SKBs
they are expecting.  It is a hint because __napi_alloc_skb() limits the
bulk size to the array size.

One user is the mlx5 driver, which bulk re-populates its RX ring
with both SKBs and pages.  Thus, it would like to work with
bigger bulk alloc chunks.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/skbuff.h |   19 +++++++++++++++----
 net/core/skbuff.c      |    8 +++-----
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 797cefb888fb..f49077caedaf 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2393,14 +2393,25 @@ static inline void skb_free_frag(void *addr)
 	__free_page_frag(addr);
 }
 
+#define NAPI_SKB_CACHE_SIZE	64U /* Used in struct napi_alloc_cache */
+#define NAPI_SKB_BULK_ALLOC	 8U /* Default slab bulk alloc in NAPI */
+
 void *napi_alloc_frag(unsigned int fragsz);
-struct sk_buff *__napi_alloc_skb(struct napi_struct *napi,
-				 unsigned int length, gfp_t gfp_mask);
+struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
+				 unsigned int bulk_hint, gfp_t gfp_mask);
 static inline struct sk_buff *napi_alloc_skb(struct napi_struct *napi,
-					     unsigned int length)
+					     unsigned int len)
+{
+	return __napi_alloc_skb(napi, len, NAPI_SKB_BULK_ALLOC, GFP_ATOMIC);
+}
+static inline struct sk_buff *napi_alloc_skb_hint(struct napi_struct *napi,
+						  unsigned int len,
+						  unsigned int bulk_hint)
 {
-	return __napi_alloc_skb(napi, length, GFP_ATOMIC);
+	bulk_hint = bulk_hint ? : 1;
+	return __napi_alloc_skb(napi, len, bulk_hint, GFP_ATOMIC);
 }
+
 void napi_consume_skb(struct sk_buff *skb, int budget);
 
 void __kfree_skb_flush(void);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 96fb7933b614..c770bd4391ab 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -359,8 +359,6 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size)
 }
 EXPORT_SYMBOL(build_skb);
 
-#define NAPI_SKB_CACHE_SIZE	64
-
 struct napi_alloc_cache {
 	struct page_frag_cache page;
 	size_t skb_count;
@@ -492,9 +490,10 @@ EXPORT_SYMBOL(__netdev_alloc_skb);
  *	%NULL is returned if there is no free memory.
  */
 struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
-				 gfp_t gfp_mask)
+				 unsigned int bulk_hint, gfp_t gfp_mask)
 {
 	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+	unsigned int bulk_sz = min(bulk_hint, NAPI_SKB_CACHE_SIZE);
 	struct sk_buff *skb;
 	void *data;
 
@@ -518,10 +517,9 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 	if (unlikely(!data))
 		return NULL;
 
-#define BULK_ALLOC_SIZE 8
 	if (!nc->skb_count) {
 		nc->skb_count = kmem_cache_alloc_bulk(skbuff_head_cache,
-						      gfp_mask, BULK_ALLOC_SIZE,
+						      gfp_mask, bulk_sz,
 						      nc->skb_cache);
 	}
 	if (likely(nc->skb_count)) {


* [net-next PATCH 7/7] mlx5: hint the NAPI alloc skb API about the expected bulk size
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
                   ` (5 preceding siblings ...)
  2016-03-04 13:02 ` [net-next PATCH 6/7] net: introduce napi_alloc_skb_hint() for more use-cases Jesper Dangaard Brouer
@ 2016-03-04 13:02 ` Jesper Dangaard Brouer
  2016-03-04 16:36 ` [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Alexei Starovoitov
  7 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 13:02 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Use the newly introduced napi_alloc_skb_hint() API to align the
underlying slab bulk allocation sizes with what the mlx5 driver needs
for refilling its RX ring queue.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |   10 ++++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 38ed89b7145d..c2607d63b0a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -621,7 +621,8 @@ void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
 bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
-bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi);
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi,
+			unsigned int bulk_hint);
 struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
 
 void mlx5e_update_stats(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 5ef9772ff708..1be5a647c16f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -43,12 +43,13 @@ static inline bool mlx5e_rx_hw_stamp(struct mlx5e_tstamp *tstamp)
 
 static inline int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq,
 				     struct mlx5e_rx_wqe *wqe, u16 ix,
-				     struct napi_struct *napi)
+				     struct napi_struct *napi,
+				     unsigned int bulk_hint)
 {
 	struct sk_buff *skb;
 	dma_addr_t dma_addr;
 
-	skb = napi_alloc_skb(napi, rq->wqe_sz);
+	skb = napi_alloc_skb_hint(napi, rq->wqe_sz, bulk_hint);
 	if (unlikely(!skb))
 		return -ENOMEM;
 
@@ -77,7 +78,8 @@ err_free_skb:
 	return -ENOMEM;
 }
 
-bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi)
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi,
+			unsigned int hint)
 {
 	struct mlx5_wq_ll *wq = &rq->wq;
 
@@ -87,7 +89,7 @@ bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq, struct napi_struct *napi)
 	while (!mlx5_wq_ll_is_full(wq)) {
 		struct mlx5e_rx_wqe *wqe = mlx5_wq_ll_get_wqe(wq, wq->head);
 
-		if (unlikely(mlx5e_alloc_rx_wqe(rq, wqe, wq->head, napi)))
+		if (unlikely(mlx5e_alloc_rx_wqe(rq, wqe, wq->head, napi, hint)))
 			break;
 
 		mlx5_wq_ll_push(wq, be16_to_cpu(wqe->next.next_wqe_index));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 8fd07c8087e3..6488404edff6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -64,7 +64,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 
 	work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
 	busy |= work_done == budget;
-	busy |= mlx5e_post_rx_wqes(&c->rq, napi);
+	busy |= mlx5e_post_rx_wqes(&c->rq, napi, work_done);
 
 	if (busy)
 		return budget;


* Re: [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers
  2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
                   ` (6 preceding siblings ...)
  2016-03-04 13:02 ` [net-next PATCH 7/7] mlx5: hint the NAPI alloc skb API about the expected bulk size Jesper Dangaard Brouer
@ 2016-03-04 16:36 ` Alexei Starovoitov
  2016-03-04 19:15   ` Jesper Dangaard Brouer
  7 siblings, 1 reply; 36+ messages in thread
From: Alexei Starovoitov @ 2016-03-04 16:36 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, David S. Miller, eugenia, Alexander Duyck, saeedm, gerlitz.or

On Fri, Mar 04, 2016 at 02:01:14PM +0100, Jesper Dangaard Brouer wrote:
> This patchset uses the bulk ALLOC side of the kmem_cache bulk APIs for
> SKB allocations.  The bulk free side got enabled in merge commit
> 3134b9f019f2 ("net: mitigating kmem_cache free slowpath").
> 
> The first two patches are a follow-up on the free side, enabling
> bulk free in the mlx4 and mlx5 drivers (dev_kfree_skb -> napi_consume_skb).
> 
> The rest of the patchset focuses on the bulk alloc side.  We start with a
> conservative bulk alloc of 8 SKBs, which all drivers using the
> napi_alloc_skb() call will benefit from.  Then the API is extended to
> allow drivers to hint at the number of SKBs needed (only some drivers
> know this size), and the mlx5 driver is the first user of hinting.

Patches 1-5 look very good to me. They should help all cases afaik.
As far as patches 6-7 and the hints go, I have a question: does this
hint actually make a difference? The fixed bulk alloc of 8 is probably
easier on the main slub, but when mlx5 starts using 'work_done' as
a hint there will be more 'random' bulking going on.
I was wondering whether you have the perf numbers to back up 6/7


* Re: [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers
  2016-03-04 16:36 ` [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Alexei Starovoitov
@ 2016-03-04 19:15   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-04 19:15 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: netdev, David S. Miller, eugenia, Alexander Duyck, saeedm,
	gerlitz.or, brouer

On Fri, 4 Mar 2016 08:36:44 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Fri, Mar 04, 2016 at 02:01:14PM +0100, Jesper Dangaard Brouer wrote:
> > This patchset use the bulk ALLOC side of the kmem_cache bulk APIs, for
> > SKB allocations.  The bulk free side got enabled in merge commit
> > 3134b9f019f2 ("net: mitigating kmem_cache free slowpath").
> > 
> > The first two patches is a followup on the free-side, which enables
> > bulk-free in the drivers mlx4 and mlx5 (dev_kfree_skb -> napi_consume_skb).
> > 
> > Rest of patchset is focused on bulk alloc-side.  We start with a
> > conservative bulk alloc of 8 SKB, which all drivers using the
> > napi_alloc_skb() call will benefit from.  Then the API is extended to,
> > allow driver hinting on needed SKBs (only some drivers know this
> > size), and mlx5 driver is the first user of hinting.  
> 
> patches 1-5 look very good to me. Should help all cases afaik.
> As far as 6-7 about hints I have a question. Does this hint
> actually makes the difference? The fixed bulk alloc of 8 probably
> easier for the main slub, but when mlx5 starts doing 'work_done' as
> a hint there will be more 'random' bulking going on.
> Was wondering whether you have the perf numbers to back up 6/7

Yes, it makes a difference.  I collected performance numbers with
packet dropping in the mlx5 driver, plus the RX loop cache-miss avoidance.
With all my optimizations I reached 12Mpps; with this hint optimization
I could reach 13Mpps.  It sounds nice percentage wise (8.3%), but
in nanosec this optimization "only" corresponds to 6.4 ns.  For real
workloads, we might see a higher "nanosec" improvement, as this invokes
kmem_cache_alloc_bulk() fewer times, resulting in fewer icache misses.
So, yes, it makes a difference.
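
To make the "fewer calls" argument concrete, here is a userspace
back-of-the-envelope model (not kernel code; the function name is
invented for illustration) of how many kmem_cache_alloc_bulk() round
trips a refill loop issues with the fixed bulk of 8 versus a
work_done-sized hint:

```c
#include <assert.h>

/* Illustrative model only: number of kmem_cache_alloc_bulk() round
 * trips needed to refill 'skbs_needed' RX slots when each call asks
 * for at most 'bulk_size' objects (ceiling division). */
static int bulk_alloc_calls(int skbs_needed, int bulk_size)
{
	return (skbs_needed + bulk_size - 1) / bulk_size;
}
```

Refilling a 64-packet NAPI budget costs 8 slab calls with the fixed
bulk of 8, but only 1 when the driver hints the full amount; that is
where the reduced icache pressure comes from.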

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-04 13:01 ` [net-next PATCH 2/7] mlx4: " Jesper Dangaard Brouer
@ 2016-03-08 19:24   ` David Miller
  2016-03-09 11:00     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: David Miller @ 2016-03-08 19:24 UTC (permalink / raw)
  To: brouer
  Cc: netdev, eugenia, alexander.duyck, alexei.starovoitov, saeedm, gerlitz.or

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Fri, 04 Mar 2016 14:01:33 +0100

> @@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
>  
>  static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
>  				struct mlx4_en_tx_ring *ring,
> -				int index, u8 owner, u64 timestamp)
> +				int index, u8 owner, u64 timestamp,
> +				int napi_mode)
>  {
>  	struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
>  	struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
> @@ -347,7 +348,11 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
>  			}
>  		}
>  	}
> -	dev_consume_skb_any(skb);
> +	if (unlikely(napi_mode < 0))
> +		dev_consume_skb_any(skb); /* none-NAPI via mlx4_en_stop_port */
> +	else
> +		napi_consume_skb(skb, napi_mode);
> +
>  	return tx_info->nr_txbb;
>  }

If '0' is the signal that napi_consume_skb() uses to detect the case where we
can't bulk, just pass that instead of having a special test here on yet another
special value "-1".

If it makes any nicer, you can define a NAPI_BUDGET_FROM_NETPOLL macro
or similar.

I also wonder if passing the budget around all the way down to
napi_consume_skb() is the cleanest thing to do, as we just want to
know if bulk freeing is possible or not.


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-08 19:24   ` David Miller
@ 2016-03-09 11:00     ` Jesper Dangaard Brouer
  2016-03-09 16:47       ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-09 11:00 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, eugenia, alexander.duyck, alexei.starovoitov, saeedm,
	gerlitz.or, brouer

On Tue, 08 Mar 2016 14:24:22 -0500 (EST)
David Miller <davem@davemloft.net> wrote:

> From: Jesper Dangaard Brouer <brouer@redhat.com>
> Date: Fri, 04 Mar 2016 14:01:33 +0100
> 
> > @@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
> >  
> >  static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
> >  				struct mlx4_en_tx_ring *ring,
> > -				int index, u8 owner, u64 timestamp)
> > +				int index, u8 owner, u64 timestamp,
> > +				int napi_mode)
> >  {
> >  	struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
> >  	struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
> > @@ -347,7 +348,11 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
> >  			}
> >  		}
> >  	}
> > -	dev_consume_skb_any(skb);
> > +	if (unlikely(napi_mode < 0))
> > +		dev_consume_skb_any(skb); /* none-NAPI via mlx4_en_stop_port */
> > +	else
> > +		napi_consume_skb(skb, napi_mode);
> > +
> >  	return tx_info->nr_txbb;
> >  }  
> 
> If '0' is the signal that napi_consume_skb() uses to detect the case where we
> can't bulk, just pass that instead of having a special test here on yet another
> special value "-1".

We cannot use '0' to signal this, because napi_consume_skb() invokes
dev_consume_skb_irq(), and here we need dev_consume_skb_any(), as
mlx4_en_stop_port() assumes memory is released immediately.

I guess we can (in napi_consume_skb) just replace the
dev_consume_skb_irq() with dev_consume_skb_any(), and then use '0' to
signal both situations?


> If it makes any nicer, you can define a NAPI_BUDGET_FROM_NETPOLL macro
> or similar.
> 
> I also wonder if passing the budget around all the way down to
> napi_consume_skb() is the cleanest thing to do, as we just want to
> know if bulk freeing is possible or not.

Passing the budget down was Alex's design.  Alex, any thoughts?

Perhaps we can use another way to detect if bulk freeing is possible?
E.g. testing in_serving_softirq()?

   if (!in_serving_softirq())
       dev_consume_skb_any(skb); /* cannot bulk free */

Or maybe in_softirq() is enough? (to also allow callers that have BH disabled).

I do wonder how expensive this check is, as it sits in a code hotpath
even though the branch itself is very unlikely.  The good thing would
be that we handle buggy drivers calling this function from a
non-softirq context (as these bugs could be hard to catch).
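
For reference, the difference between the two checks comes down to
which preempt_count bits they test.  A userspace model (constants
mirror include/linux/preempt.h of this era; these are illustrative
stand-ins, not the actual kernel macros):

```c
#include <assert.h>

/* local_bh_disable() adds SOFTIRQ_DISABLE_OFFSET to preempt_count,
 * while actually serving a softirq adds SOFTIRQ_OFFSET.  in_softirq()
 * tests the whole softirq field; in_serving_softirq() tests only the
 * "serving" bit. */
#define SOFTIRQ_OFFSET         (1 << 8)
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)
#define SOFTIRQ_MASK           (0xff << 8)

static int in_softirq_model(unsigned int preempt_count)
{
	return preempt_count & SOFTIRQ_MASK;	/* serving OR merely BH-disabled */
}

static int in_serving_softirq_model(unsigned int preempt_count)
{
	return preempt_count & SOFTIRQ_OFFSET;	/* only while serving a softirq */
}
```

So a context that has only done local_bh_disable() satisfies the first
test but not the second, which is exactly the ambiguity under
discussion here.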

Can netpoll ever be called from softirq or with BH disabled? (It
disables IRQs, which would break calling kmem_cache_free_bulk).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-09 11:00     ` Jesper Dangaard Brouer
@ 2016-03-09 16:47       ` Alexander Duyck
  2016-03-09 21:03         ` David Miller
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2016-03-09 16:47 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, Netdev, eugenia, Alexei Starovoitov,
	Saeed Mahameed, Or Gerlitz

On Wed, Mar 9, 2016 at 3:00 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Tue, 08 Mar 2016 14:24:22 -0500 (EST)
> David Miller <davem@davemloft.net> wrote:
>
>> From: Jesper Dangaard Brouer <brouer@redhat.com>
>> Date: Fri, 04 Mar 2016 14:01:33 +0100
>>
>> > @@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
>> >
>> >  static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
>> >                             struct mlx4_en_tx_ring *ring,
>> > -                           int index, u8 owner, u64 timestamp)
>> > +                           int index, u8 owner, u64 timestamp,
>> > +                           int napi_mode)
>> >  {
>> >     struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
>> >     struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
>> > @@ -347,7 +348,11 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
>> >                     }
>> >             }
>> >     }
>> > -   dev_consume_skb_any(skb);
>> > +   if (unlikely(napi_mode < 0))
>> > +           dev_consume_skb_any(skb); /* none-NAPI via mlx4_en_stop_port */
>> > +   else
>> > +           napi_consume_skb(skb, napi_mode);
>> > +
>> >     return tx_info->nr_txbb;
>> >  }
>>
>> If '0' is the signal that napi_consume_skb() uses to detect the case where we
>> can't bulk, just pass that instead of having a special test here on yet another
>> special value "-1".
>
> Cannot use '0' to signal this, because napi_consume_skb() invoke
> dev_consume_skb_irq(), and here we need dev_consume_skb_any(), as
> mlx4_en_stop_port() assume memory is released immediately.
>
> I guess, we can (in napi_consume_skb) just replace the
> dev_consume_skb_irq() with dev_consume_skb_any(), and then use '0' to
> signal both situations?
>
>
>> If it makes any nicer, you can define a NAPI_BUDGET_FROM_NETPOLL macro
>> or similar.
>>
>> I also wonder if passing the budget around all the way down to
>> napi_consume_skb() is the cleanest thing to do, as we just want to
>> know if bulk freeing is possible or not.
>
> Passing the budget down was Alex'es design.  Axel any thoughts?

I'd say just use dev_consume_skb_any in the bulk free instead of
dev_consume_skb_irq.  This is slow path, as you said, so it shouldn't
come up often.

> Perhaps we can use another way to detect if bulk freeing is possible?
> E.g. using test in_serving_softirq() ?
>
>    if (!in_serving_softirq())
>        dev_consume_skb_any(skb); /* cannot bulk free */
>
> Or maybe in_softirq() is enough? (to also allows callers having bh disabled).

I wasn't so much concerned about the cost of the check as about us
getting it mixed up.  For example, the Tx path has softirqs disabled
if I recall correctly.  Is this called from there?  At least with the
"any" approach you can guarantee you don't leave stale buffers sitting
in the lists until someone wakes up NAPI.

> I do wonder how expensive this check is... as it goes into a code
> hotpath, which is very unlikely.  The good thing would be, that we
> handle if buggy drivers call this function from a none softirq context
> (as these bugs could be hard to catch).
>
> Can netpoll ever be called from softirq or with BH disabled? (It
> disables IRQs, which would break calling kmem_cache_free_bulk).

It is better for us to switch things out so that the napi_consume_skb
is the fast path with dev_consume_skb_any as the slow.  There are too
many scenarios where we could be invoking something that makes use of
this within the Tx path so it is probably easiest to just solve it
that way so we don't have to deal with it again in the future.

- Alex


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-09 16:47       ` Alexander Duyck
@ 2016-03-09 21:03         ` David Miller
  2016-03-09 21:36           ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: David Miller @ 2016-03-09 21:03 UTC (permalink / raw)
  To: alexander.duyck
  Cc: brouer, netdev, eugenia, alexei.starovoitov, saeedm, gerlitz.or

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Wed, 9 Mar 2016 08:47:58 -0800

> On Wed, Mar 9, 2016 at 3:00 AM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>> Passing the budget down was Alex'es design.  Axel any thoughts?
> 
> I'd say just use dev_consume_skb_any in the bulk free instead of
> dev_consume_skb_irq.  This is slow path, as you said, so it shouldn't
> come up often.

Agreed.

>> I do wonder how expensive this check is... as it goes into a code
>> hotpath, which is very unlikely.  The good thing would be, that we
>> handle if buggy drivers call this function from a none softirq context
>> (as these bugs could be hard to catch).
>>
>> Can netpoll ever be called from softirq or with BH disabled? (It
>> disables IRQs, which would break calling kmem_cache_free_bulk).
> 
> It is better for us to switch things out so that the napi_consume_skb
> is the fast path with dev_consume_skb_any as the slow.  There are too
> many scenarios where we could be invoking something that makes use of
> this within the Tx path so it is probably easiest to just solve it
> that way so we don't have to deal with it again in the future.

Indeed.


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-09 21:03         ` David Miller
@ 2016-03-09 21:36           ` Jesper Dangaard Brouer
  2016-03-09 21:43             ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-09 21:36 UTC (permalink / raw)
  To: David Miller
  Cc: alexander.duyck, netdev, eugenia, alexei.starovoitov, saeedm,
	gerlitz.or, brouer

On Wed, 09 Mar 2016 16:03:20 -0500 (EST)
David Miller <davem@davemloft.net> wrote:

> From: Alexander Duyck <alexander.duyck@gmail.com>
> Date: Wed, 9 Mar 2016 08:47:58 -0800
> 
> > On Wed, Mar 9, 2016 at 3:00 AM, Jesper Dangaard Brouer
> > <brouer@redhat.com> wrote:  
> >> Passing the budget down was Alex'es design.  Axel any thoughts?  
> > 
> > I'd say just use dev_consume_skb_any in the bulk free instead of
> > dev_consume_skb_irq.  This is slow path, as you said, so it shouldn't
> > come up often.  
> 
> Agreed.
> 
> >> I do wonder how expensive this check is... as it goes into a code
> >> hotpath, which is very unlikely.  The good thing would be, that we
> >> handle if buggy drivers call this function from a none softirq context
> >> (as these bugs could be hard to catch).
> >>
> >> Can netpoll ever be called from softirq or with BH disabled? (It
> >> disables IRQs, which would break calling kmem_cache_free_bulk).  
> > 
> > It is better for us to switch things out so that the napi_consume_skb
> > is the fast path with dev_consume_skb_any as the slow.  There are too
> > many scenarios where we could be invoking something that makes use of
> > this within the Tx path so it is probably easiest to just solve it
> > that way so we don't have to deal with it again in the future.  
> 
> Indeed.

So, if I understand you correctly, then we drop the budget parameter
and check for in_softirq(), like:

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7af7ec635d90..a3c61a9b65d2 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -796,14 +796,14 @@ void __kfree_skb_defer(struct sk_buff *skb)
        _kfree_skb_defer(skb);
 }
 
-void napi_consume_skb(struct sk_buff *skb, int budget)
+void napi_consume_skb(struct sk_buff *skb)
 {
        if (unlikely(!skb))
                return;
 
-       /* if budget is 0 assume netpoll w/ IRQs disabled */
-       if (unlikely(!budget)) {
-               dev_consume_skb_irq(skb);
+       /* Handle if not called from NAPI context, and netpoll invocation */
+       if (unlikely(!in_softirq())) {
+               dev_consume_skb_any(skb);
                return;
        }


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-09 21:36           ` Jesper Dangaard Brouer
@ 2016-03-09 21:43             ` Alexander Duyck
  2016-03-09 21:47               ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2016-03-09 21:43 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, Netdev, eugenia, Alexei Starovoitov,
	Saeed Mahameed, Or Gerlitz

On Wed, Mar 9, 2016 at 1:36 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Wed, 09 Mar 2016 16:03:20 -0500 (EST)
> David Miller <davem@davemloft.net> wrote:
>
>> From: Alexander Duyck <alexander.duyck@gmail.com>
>> Date: Wed, 9 Mar 2016 08:47:58 -0800
>>
>> > On Wed, Mar 9, 2016 at 3:00 AM, Jesper Dangaard Brouer
>> > <brouer@redhat.com> wrote:
>> >> Passing the budget down was Alex'es design.  Axel any thoughts?
>> >
>> > I'd say just use dev_consume_skb_any in the bulk free instead of
>> > dev_consume_skb_irq.  This is slow path, as you said, so it shouldn't
>> > come up often.
>>
>> Agreed.
>>
>> >> I do wonder how expensive this check is... as it goes into a code
>> >> hotpath, which is very unlikely.  The good thing would be, that we
>> >> handle if buggy drivers call this function from a none softirq context
>> >> (as these bugs could be hard to catch).
>> >>
>> >> Can netpoll ever be called from softirq or with BH disabled? (It
>> >> disables IRQs, which would break calling kmem_cache_free_bulk).
>> >
>> > It is better for us to switch things out so that the napi_consume_skb
>> > is the fast path with dev_consume_skb_any as the slow.  There are too
>> > many scenarios where we could be invoking something that makes use of
>> > this within the Tx path so it is probably easiest to just solve it
>> > that way so we don't have to deal with it again in the future.
>>
>> Indeed.
>
> So, if I understand you correctly, then we drop the budget parameter
> and check for in_softirq(), like:
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7af7ec635d90..a3c61a9b65d2 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -796,14 +796,14 @@ void __kfree_skb_defer(struct sk_buff *skb)
>         _kfree_skb_defer(skb);
>  }
>
> -void napi_consume_skb(struct sk_buff *skb, int budget)
> +void napi_consume_skb(struct sk_buff *skb)
>  {
>         if (unlikely(!skb))
>                 return;
>
> -       /* if budget is 0 assume netpoll w/ IRQs disabled */
> -       if (unlikely(!budget)) {
> -               dev_consume_skb_irq(skb);
> +       /* Handle if not called from NAPI context, and netpoll invocation */
> +       if (unlikely(!in_softirq())) {
> +               dev_consume_skb_any(skb);
>                 return;
>         }
>

No.  We still need to have the budget value.  What we do, though, is
have that feed into dev_consume_skb_any.

The problem with using in_softirq is that it also triggers when
softirqs are merely disabled, so more paths can hit it.  For example,
the transmit path runs with bottom halves disabled, so I am pretty
sure it would trigger this as well.  We want this to only execute when
we are running from a NAPI polling routine with a non-zero budget.
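
The rule described here can be modeled in a few lines of plain C (a
userspace sketch; the enum and function names are invented for
illustration and are not the kernel API):

```c
#include <assert.h>

/* Illustrative model of the dispatch rule: a zero budget marks a
 * non-NAPI caller (e.g. netpoll), which must free immediately; only
 * a NAPI poll routine with a non-zero budget may defer SKBs into the
 * per-cpu list for bulk freeing. */
enum free_path { FREE_IMMEDIATE, FREE_BULK_DEFER };

static enum free_path pick_free_path(int budget)
{
	if (budget == 0)	/* netpoll or other non-NAPI context */
		return FREE_IMMEDIATE;
	return FREE_BULK_DEFER;	/* safe: the NAPI poll will flush the list */
}
```

The point is that the decision keys on an explicit caller-supplied
value rather than on a context test like in_softirq(), which would
also fire where bottom halves are merely disabled.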

- Alex


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-09 21:43             ` Alexander Duyck
@ 2016-03-09 21:47               ` Jesper Dangaard Brouer
  2016-03-09 22:07                 ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-09 21:47 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, eugenia, Alexei Starovoitov,
	Saeed Mahameed, Or Gerlitz, brouer

On Wed, 9 Mar 2016 13:43:59 -0800
Alexander Duyck <alexander.duyck@gmail.com> wrote:

> On Wed, Mar 9, 2016 at 1:36 PM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > On Wed, 09 Mar 2016 16:03:20 -0500 (EST)
> > David Miller <davem@davemloft.net> wrote:
> >  
> >> From: Alexander Duyck <alexander.duyck@gmail.com>
> >> Date: Wed, 9 Mar 2016 08:47:58 -0800
> >>  
> >> > On Wed, Mar 9, 2016 at 3:00 AM, Jesper Dangaard Brouer
> >> > <brouer@redhat.com> wrote:  
> >> >> Passing the budget down was Alex'es design.  Axel any thoughts?  
> >> >
> >> > I'd say just use dev_consume_skb_any in the bulk free instead of
> >> > dev_consume_skb_irq.  This is slow path, as you said, so it shouldn't
> >> > come up often.  
> >>
> >> Agreed.
> >>  
> >> >> I do wonder how expensive this check is... as it goes into a code
> >> >> hotpath, which is very unlikely.  The good thing would be, that we
> >> >> handle if buggy drivers call this function from a none softirq context
> >> >> (as these bugs could be hard to catch).
> >> >>
> >> >> Can netpoll ever be called from softirq or with BH disabled? (It
> >> >> disables IRQs, which would break calling kmem_cache_free_bulk).  
> >> >
> >> > It is better for us to switch things out so that the napi_consume_skb
> >> > is the fast path with dev_consume_skb_any as the slow.  There are too
> >> > many scenarios where we could be invoking something that makes use of
> >> > this within the Tx path so it is probably easiest to just solve it
> >> > that way so we don't have to deal with it again in the future.  
> >>
> >> Indeed.  
> >
> > So, if I understand you correctly, then we drop the budget parameter
> > and check for in_softirq(), like:
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 7af7ec635d90..a3c61a9b65d2 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -796,14 +796,14 @@ void __kfree_skb_defer(struct sk_buff *skb)
> >         _kfree_skb_defer(skb);
> >  }
> >
> > -void napi_consume_skb(struct sk_buff *skb, int budget)
> > +void napi_consume_skb(struct sk_buff *skb)
> >  {
> >         if (unlikely(!skb))
> >                 return;
> >
> > -       /* if budget is 0 assume netpoll w/ IRQs disabled */
> > -       if (unlikely(!budget)) {
> > -               dev_consume_skb_irq(skb);
> > +       /* Handle if not called from NAPI context, and netpoll invocation */
> > +       if (unlikely(!in_softirq())) {
> > +               dev_consume_skb_any(skb);
> >                 return;
> >         }
> >  
> 
> No.  We still need to have the budget value.  What we do though is
> have that feed into dev_consume_skb_any.
> 
> The problem with using in_softirq is that it will trigger if softirqs
> are just disabled so there are more possible paths where it is
> possible.  For example the transmit path has bottom halves disabled so
> I am pretty sure it might trigger this as well.  We want this to only
> execute when we are running from a NAPI polling routine with a
> non-zero budget.

What about using in_serving_softirq() instead of in_softirq() ?
(would that allow us to drop the budget parameter?)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [net-next PATCH 2/7] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-09 21:47               ` Jesper Dangaard Brouer
@ 2016-03-09 22:07                 ` Alexander Duyck
  2016-03-10 12:15                   ` [net-next PATCH V2 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2016-03-09 22:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, Netdev, eugenia, Alexei Starovoitov,
	Saeed Mahameed, Or Gerlitz

On Wed, Mar 9, 2016 at 1:47 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Wed, 9 Mar 2016 13:43:59 -0800
> Alexander Duyck <alexander.duyck@gmail.com> wrote:
>
>> On Wed, Mar 9, 2016 at 1:36 PM, Jesper Dangaard Brouer
>> <brouer@redhat.com> wrote:
>> > On Wed, 09 Mar 2016 16:03:20 -0500 (EST)
>> > David Miller <davem@davemloft.net> wrote:
>> >
>> >> From: Alexander Duyck <alexander.duyck@gmail.com>
>> >> Date: Wed, 9 Mar 2016 08:47:58 -0800
>> >>
>> >> > On Wed, Mar 9, 2016 at 3:00 AM, Jesper Dangaard Brouer
>> >> > <brouer@redhat.com> wrote:
>> >> >> Passing the budget down was Alex'es design.  Axel any thoughts?
>> >> >
>> >> > I'd say just use dev_consume_skb_any in the bulk free instead of
>> >> > dev_consume_skb_irq.  This is slow path, as you said, so it shouldn't
>> >> > come up often.
>> >>
>> >> Agreed.
>> >>
>> >> >> I do wonder how expensive this check is... as it goes into a code
>> >> >> hotpath, which is very unlikely.  The good thing would be, that we
>> >> >> handle if buggy drivers call this function from a none softirq context
>> >> >> (as these bugs could be hard to catch).
>> >> >>
>> >> >> Can netpoll ever be called from softirq or with BH disabled? (It
>> >> >> disables IRQs, which would break calling kmem_cache_free_bulk).
>> >> >
>> >> > It is better for us to switch things out so that the napi_consume_skb
>> >> > is the fast path with dev_consume_skb_any as the slow.  There are too
>> >> > many scenarios where we could be invoking something that makes use of
>> >> > this within the Tx path so it is probably easiest to just solve it
>> >> > that way so we don't have to deal with it again in the future.
>> >>
>> >> Indeed.
>> >
>> > So, if I understand you correctly, then we drop the budget parameter
>> > and check for in_softirq(), like:
>> >
>> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> > index 7af7ec635d90..a3c61a9b65d2 100644
>> > --- a/net/core/skbuff.c
>> > +++ b/net/core/skbuff.c
>> > @@ -796,14 +796,14 @@ void __kfree_skb_defer(struct sk_buff *skb)
>> >         _kfree_skb_defer(skb);
>> >  }
>> >
>> > -void napi_consume_skb(struct sk_buff *skb, int budget)
>> > +void napi_consume_skb(struct sk_buff *skb)
>> >  {
>> >         if (unlikely(!skb))
>> >                 return;
>> >
>> > -       /* if budget is 0 assume netpoll w/ IRQs disabled */
>> > -       if (unlikely(!budget)) {
>> > -               dev_consume_skb_irq(skb);
>> > +       /* Handle if not called from NAPI context, and netpoll invocation */
>> > +       if (unlikely(!in_softirq())) {
>> > +               dev_consume_skb_any(skb);
>> >                 return;
>> >         }
>> >
>>
>> No.  We still need to have the budget value.  What we do though is
>> have that feed into dev_consume_skb_any.
>>
>> The problem with using in_softirq is that it will trigger if softirqs
>> are just disabled so there are more possible paths where it is
>> possible.  For example the transmit path has bottom halves disabled so
>> I am pretty sure it might trigger this as well.  We want this to only
>> execute when we are running from a NAPI polling routine with a
>> non-zero budget.
>
> What about using in_serving_softirq() instead of in_softirq() ?
> (would that allow us to drop the budget parameter?)

The problem is there are multiple softirq handlers you could be
referring to.  NAPI is just one.  What if, for example, someone sets up
a tasklet that has to perform some sort of reset with RTNL held?  I am
pretty sure we don't want the tasklet using the NAPI free context,
since there will be nothing to actually free the buffers at the end of
it.  We want to avoid that.  That is why I was using the budget value.

- Alex


* [net-next PATCH V2 0/3] net: bulk free adjustment and two driver use-cases
  2016-03-09 22:07                 ` Alexander Duyck
@ 2016-03-10 12:15                   ` Jesper Dangaard Brouer
  2016-03-10 12:15                     ` [net-next PATCH V2 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
                                       ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 12:15 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

I've split the bulk free adjustments out from the bulk alloc patches,
as I want the adjustment to napi_consume_skb to land in the same
kernel cycle the API was introduced in.

Adjustments based on discussion:
 Subj: "mlx4: use napi_consume_skb API to get bulk free operations"
 http://thread.gmane.org/gmane.linux.network/402503/focus=403386

Patchset based on net-next at commit 3ebeac1d0295

---

Jesper Dangaard Brouer (3):
      net: adjust napi_consume_skb to handle none-NAPI callers
      mlx4: use napi_consume_skb API to get bulk free operations
      mlx5: use napi_consume_skb API to get bulk free operations


 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |   16 ++++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 net/core/skbuff.c                                 |    4 ++--
 5 files changed, 16 insertions(+), 12 deletions(-)


* [net-next PATCH V2 1/3] net: adjust napi_consume_skb to handle none-NAPI callers
  2016-03-10 12:15                   ` [net-next PATCH V2 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
@ 2016-03-10 12:15                     ` Jesper Dangaard Brouer
  2016-03-10 12:15                     ` [net-next PATCH V2 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
  2016-03-10 12:15                     ` [net-next PATCH V2 " Jesper Dangaard Brouer
  2 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 12:15 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Some drivers reuse/share code paths that free SKBs between NAPI
and non-NAPI calls. Adjust napi_consume_skb to handle this
use-case.

Before, calls from netpoll (w/ IRQs disabled) were handled and
indicated with a zero budget.  Use the same zero indication to
handle calls not originating from NAPI/softirq context, simply
by using dev_consume_skb_any().

This adds an extra branch+call for the netpoll case (checking
in_irq() + irqs_disabled()), but that is okay as this is a slowpath.

Suggested-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/skbuff.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7af7ec635d90..bc62baa54ceb 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
 	if (unlikely(!skb))
 		return;
 
-	/* if budget is 0 assume netpoll w/ IRQs disabled */
+	/* Zero budget indicates a non-NAPI caller, like netpoll */
 	if (unlikely(!budget)) {
-		dev_consume_skb_irq(skb);
+		dev_consume_skb_any(skb);
 		return;
 	}
 


* [net-next PATCH V2 2/3] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-10 12:15                   ` [net-next PATCH V2 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  2016-03-10 12:15                     ` [net-next PATCH V2 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
@ 2016-03-10 12:15                     ` Jesper Dangaard Brouer
  2016-03-10 13:59                       ` Sergei Shtylyov
  2016-03-10 12:15                     ` [net-next PATCH V2 " Jesper Dangaard Brouer
  2 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 12:15 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Bulk free of SKBs happens transparently via the API call napi_consume_skb().
The napi budget parameter is usually needed by napi_consume_skb()
to detect if called from netpoll.  In this patch it have an extra meaning.

For the mlx4 driver, the mlx4_en_stop_port() call is done outside
NAPI/softirq context, and cleans up the entire TX ring via
mlx4_en_free_tx_buf().  The code in mlx4_en_free_tx_desc() for
freeing SKBs is shared with the NAPI calls.

To handle this shared use the zero budget indication is reused,
and handled appropiately in napi_consume_skb(). To reflect this,
variable is called napi_mode for the function call that needed
this distinction.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |   16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index e0946ab22010..1b41feafce9e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
 
 static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 				struct mlx4_en_tx_ring *ring,
-				int index, u8 owner, u64 timestamp)
+				int index, u8 owner, u64 timestamp,
+				int napi_mode)
 {
 	struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
 	struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
@@ -347,7 +348,8 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 			}
 		}
 	}
-	dev_consume_skb_any(skb);
+	napi_consume_skb(skb, napi_mode);
+
 	return tx_info->nr_txbb;
 }
 
@@ -371,7 +373,9 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 	while (ring->cons != ring->prod) {
 		ring->last_nr_txbb = mlx4_en_free_tx_desc(priv, ring,
 						ring->cons & ring->size_mask,
-						!!(ring->cons & ring->size), 0);
+						!!(ring->cons & ring->size), 0,
+						0 /* none-NAPI caller */
+			);
 		ring->cons += ring->last_nr_txbb;
 		cnt++;
 	}
@@ -385,7 +389,7 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 }
 
 static bool mlx4_en_process_tx_cq(struct net_device *dev,
-				 struct mlx4_en_cq *cq)
+				  struct mlx4_en_cq *cq, int napi_budget)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_cq *mcq = &cq->mcq;
@@ -451,7 +455,7 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 			last_nr_txbb = mlx4_en_free_tx_desc(
 					priv, ring, ring_index,
 					!!((ring_cons + txbbs_skipped) &
-					ring->size), timestamp);
+					ring->size), timestamp, napi_budget);
 
 			mlx4_en_stamp_wqe(priv, ring, stamp_index,
 					  !!((ring_cons + txbbs_stamp) &
@@ -511,7 +515,7 @@ int mlx4_en_poll_tx_cq(struct napi_struct *napi, int budget)
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	int clean_complete;
 
-	clean_complete = mlx4_en_process_tx_cq(dev, cq);
+	clean_complete = mlx4_en_process_tx_cq(dev, cq, budget);
 	if (!clean_complete)
 		return budget;
 


* [net-next PATCH V2 3/3] mlx5: use napi_consume_skb API to get bulk free operations
  2016-03-10 12:15                   ` [net-next PATCH V2 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  2016-03-10 12:15                     ` [net-next PATCH V2 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
  2016-03-10 12:15                     ` [net-next PATCH V2 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
@ 2016-03-10 12:15                     ` Jesper Dangaard Brouer
  2 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 12:15 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm,
	Jesper Dangaard Brouer, gerlitz.or

Bulk free of SKBs happens transparently via the API call napi_consume_skb().
The napi budget parameter is needed by napi_consume_skb() to detect
if called from netpoll.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 9c0e80e64b43..a0708782eb78 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -619,7 +619,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq);
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
 bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq);
 struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c34f4f3e9537..996b13a5042f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -336,7 +336,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
 	return mlx5e_sq_xmit(sq, skb);
 }
 
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 {
 	struct mlx5e_sq *sq;
 	u32 dma_fifo_cc;
@@ -412,7 +412,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 			npkts++;
 			nbytes += wi->num_bytes;
 			sqcc += wi->num_wqebbs;
-			dev_kfree_skb(skb);
+			napi_consume_skb(skb, napi_budget);
 		} while (!last_wqe);
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 4ac8d716dbdd..d244acee63a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -60,7 +60,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 	clear_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
 
 	for (i = 0; i < c->num_tc; i++)
-		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
+		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq, budget);
 
 	work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
 	busy |= work_done == budget;


* Re: [net-next PATCH V2 2/3] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-10 12:15                     ` [net-next PATCH V2 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
@ 2016-03-10 13:59                       ` Sergei Shtylyov
  2016-03-10 14:59                         ` [net-next PATCH V3 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Sergei Shtylyov @ 2016-03-10 13:59 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm, gerlitz.or

Hello.

On 3/10/2016 3:15 PM, Jesper Dangaard Brouer wrote:

> Bulk free of SKBs happens transparently via the API call napi_consume_skb().
> The napi budget parameter is usually needed by napi_consume_skb()
> to detect if called from netpoll.  In this patch it have an extra meaning.

    It has.

> For the mlx4 driver, the mlx4_en_stop_port() call is done outside
> NAPI/softirq context, and cleans up the entire TX ring via
> mlx4_en_free_tx_buf().  The code in mlx4_en_free_tx_desc() for
> freeing SKBs is shared with the NAPI calls.
>
> To handle this shared use the zero budget indication is reused,
> and handled appropiately in napi_consume_skb(). To reflect this,

    Appropriately.

> variable is called napi_mode for the function call that needed
> this distinction.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_tx.c |   16 ++++++++++------
>   1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index e0946ab22010..1b41feafce9e 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
[...]
> @@ -371,7 +373,9 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
>   	while (ring->cons != ring->prod) {
>   		ring->last_nr_txbb = mlx4_en_free_tx_desc(priv, ring,
>   						ring->cons & ring->size_mask,
> -						!!(ring->cons & ring->size), 0);
> +						!!(ring->cons & ring->size), 0,
> +						0 /* none-NAPI caller */

    Non-NAPI, perhaps?

[...]

MBR, Sergei


* [net-next PATCH V3 0/3] net: bulk free adjustment and two driver use-cases
  2016-03-10 13:59                       ` Sergei Shtylyov
@ 2016-03-10 14:59                         ` Jesper Dangaard Brouer
  2016-03-10 14:59                           ` [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
                                             ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 14:59 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

I've split out the bulk free adjustments from the bulk alloc patches,
as I want the adjustment to napi_consume_skb to land in the same
kernel cycle as the API was introduced.

Adjustments based on discussion:
 Subj: "mlx4: use napi_consume_skb API to get bulk free operations"
 http://thread.gmane.org/gmane.linux.network/402503/focus=403386

Patchset based on net-next at commit 3ebeac1d0295

V3: spelling fixes from Sergei
---

Jesper Dangaard Brouer (3):
      net: adjust napi_consume_skb to handle none-NAPI callers
      mlx4: use napi_consume_skb API to get bulk free operations
      mlx5: use napi_consume_skb API to get bulk free operations


 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |   16 ++++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 net/core/skbuff.c                                 |    4 ++--
 5 files changed, 16 insertions(+), 12 deletions(-)


* [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers
  2016-03-10 14:59                         ` [net-next PATCH V3 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
@ 2016-03-10 14:59                           ` Jesper Dangaard Brouer
  2016-03-10 17:21                             ` Sergei Shtylyov
  2016-03-10 14:59                           ` [net-next PATCH V3 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
  2016-03-10 14:59                           ` [net-next PATCH V3 3/3] mlx5: " Jesper Dangaard Brouer
  2 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 14:59 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

Some drivers reuse/share code paths that free SKBs between NAPI
and none-NAPI calls. Adjust napi_consume_skb to handle this
use-case.

Before, calls from netpoll (w/ IRQs disabled) were handled and
indicated with a budget of zero.  Use the same zero indication
to handle calls not originating from NAPI/softirq; these are
simply handled by using dev_consume_skb_any().

This adds an extra branch+call for the netpoll case (checking
in_irq() + irqs_disabled()), but that is okay as this is a slowpath.

Suggested-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/skbuff.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7af7ec635d90..bc62baa54ceb 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
 	if (unlikely(!skb))
 		return;
 
-	/* if budget is 0 assume netpoll w/ IRQs disabled */
+	/* Zero budget indicate none-NAPI context called us, like netpoll */
 	if (unlikely(!budget)) {
-		dev_consume_skb_irq(skb);
+		dev_consume_skb_any(skb);
 		return;
 	}
 


* [net-next PATCH V3 2/3] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-10 14:59                         ` [net-next PATCH V3 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  2016-03-10 14:59                           ` [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
@ 2016-03-10 14:59                           ` Jesper Dangaard Brouer
  2016-03-10 14:59                           ` [net-next PATCH V3 3/3] mlx5: " Jesper Dangaard Brouer
  2 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 14:59 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

Bulk free of SKBs happens transparently via the API call napi_consume_skb().
The napi budget parameter is usually needed by napi_consume_skb()
to detect if called from netpoll.  In this patch it has an extra meaning.

For the mlx4 driver, the mlx4_en_stop_port() call is done outside
NAPI/softirq context, and cleans up the entire TX ring via
mlx4_en_free_tx_buf().  The code in mlx4_en_free_tx_desc() for
freeing SKBs is shared with the NAPI calls.

To handle this shared use, the zero budget indication is reused,
and handled appropriately in napi_consume_skb(). To reflect this,
the variable is called napi_mode for the function call that needed
this distinction.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |   16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index e0946ab22010..001a3d78168e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
 
 static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 				struct mlx4_en_tx_ring *ring,
-				int index, u8 owner, u64 timestamp)
+				int index, u8 owner, u64 timestamp,
+				int napi_mode)
 {
 	struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
 	struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
@@ -347,7 +348,8 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 			}
 		}
 	}
-	dev_consume_skb_any(skb);
+	napi_consume_skb(skb, napi_mode);
+
 	return tx_info->nr_txbb;
 }
 
@@ -371,7 +373,9 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 	while (ring->cons != ring->prod) {
 		ring->last_nr_txbb = mlx4_en_free_tx_desc(priv, ring,
 						ring->cons & ring->size_mask,
-						!!(ring->cons & ring->size), 0);
+						!!(ring->cons & ring->size), 0,
+						0 /* Non-NAPI caller */
+			);
 		ring->cons += ring->last_nr_txbb;
 		cnt++;
 	}
@@ -385,7 +389,7 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 }
 
 static bool mlx4_en_process_tx_cq(struct net_device *dev,
-				 struct mlx4_en_cq *cq)
+				  struct mlx4_en_cq *cq, int napi_budget)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_cq *mcq = &cq->mcq;
@@ -451,7 +455,7 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 			last_nr_txbb = mlx4_en_free_tx_desc(
 					priv, ring, ring_index,
 					!!((ring_cons + txbbs_skipped) &
-					ring->size), timestamp);
+					ring->size), timestamp, napi_budget);
 
 			mlx4_en_stamp_wqe(priv, ring, stamp_index,
 					  !!((ring_cons + txbbs_stamp) &
@@ -511,7 +515,7 @@ int mlx4_en_poll_tx_cq(struct napi_struct *napi, int budget)
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	int clean_complete;
 
-	clean_complete = mlx4_en_process_tx_cq(dev, cq);
+	clean_complete = mlx4_en_process_tx_cq(dev, cq, budget);
 	if (!clean_complete)
 		return budget;
 


* [net-next PATCH V3 3/3] mlx5: use napi_consume_skb API to get bulk free operations
  2016-03-10 14:59                         ` [net-next PATCH V3 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  2016-03-10 14:59                           ` [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
  2016-03-10 14:59                           ` [net-next PATCH V3 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
@ 2016-03-10 14:59                           ` Jesper Dangaard Brouer
  2 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-10 14:59 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

Bulk free of SKBs happens transparently via the API call napi_consume_skb().
The napi budget parameter is needed by napi_consume_skb() to detect
if called from netpoll.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 9c0e80e64b43..a0708782eb78 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -619,7 +619,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq);
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
 bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq);
 struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c34f4f3e9537..996b13a5042f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -336,7 +336,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
 	return mlx5e_sq_xmit(sq, skb);
 }
 
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 {
 	struct mlx5e_sq *sq;
 	u32 dma_fifo_cc;
@@ -412,7 +412,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 			npkts++;
 			nbytes += wi->num_bytes;
 			sqcc += wi->num_wqebbs;
-			dev_kfree_skb(skb);
+			napi_consume_skb(skb, napi_budget);
 		} while (!last_wqe);
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 4ac8d716dbdd..d244acee63a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -60,7 +60,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 	clear_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
 
 	for (i = 0; i < c->num_tc; i++)
-		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
+		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq, budget);
 
 	work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
 	busy |= work_done == budget;


* Re: [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers
  2016-03-10 14:59                           ` [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
@ 2016-03-10 17:21                             ` Sergei Shtylyov
  2016-03-11  7:45                               ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Sergei Shtylyov @ 2016-03-10 17:21 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev, David S. Miller
  Cc: eugenia, Alexander Duyck, alexei.starovoitov, saeedm, gerlitz.or

Hello.

On 03/10/2016 05:59 PM, Jesper Dangaard Brouer wrote:

> Some drivers reuse/share code paths that free SKBs between NAPI
> and none-NAPI calls. Adjust napi_consume_skb to handle this
> use-case.
>
> Before, calls from netpoll (w/ IRQs disabled) were handled and
> indicated with a budget of zero.  Use the same zero indication
> to handle calls not originating from NAPI/softirq; these are
> simply handled by using dev_consume_skb_any().
>
> This adds an extra branch+call for the netpoll case (checking
> in_irq() + irqs_disabled()), but that is okay as this is a slowpath.
>
> Suggested-by: Alexander Duyck <aduyck@mirantis.com>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>   net/core/skbuff.c |    4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7af7ec635d90..bc62baa54ceb 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
>   	if (unlikely(!skb))
>   		return;
>
> -	/* if budget is 0 assume netpoll w/ IRQs disabled */
> +	/* Zero budget indicate none-NAPI context called us, like netpoll */

    Non-NAPI?

[...]

MBR, Sergei


* Re: [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers
  2016-03-10 17:21                             ` Sergei Shtylyov
@ 2016-03-11  7:45                               ` Jesper Dangaard Brouer
  2016-03-11  8:43                                 ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-11  7:45 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: netdev, David S. Miller, eugenia, Alexander Duyck,
	alexei.starovoitov, saeedm, gerlitz.or, brouer


On Thu, 10 Mar 2016 20:21:55 +0300
Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> wrote:

> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
> >   	if (unlikely(!skb))
> >   		return;
> >
> > -	/* if budget is 0 assume netpoll w/ IRQs disabled */
> > +	/* Zero budget indicate none-NAPI context called us, like netpoll */  
> 
>     Non-NAPI?

Okay, I'll send a V4.  Hope there are no more nitpicking changes...
I'll also adjust the subj none-NAPI -> non-NAPI, and hope that does not
disturb patchwork.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases
  2016-03-11  7:45                               ` Jesper Dangaard Brouer
@ 2016-03-11  8:43                                 ` Jesper Dangaard Brouer
  2016-03-11  8:43                                   ` [net-next PATCH V4 1/3] net: adjust napi_consume_skb to handle non-NAPI callers Jesper Dangaard Brouer
                                                     ` (3 more replies)
  0 siblings, 4 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-11  8:43 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

I've split out the bulk free adjustments from the bulk alloc patches,
as I want the adjustment to napi_consume_skb to land in the same
kernel cycle as the API was introduced.

Adjustments based on discussion:
 Subj: "mlx4: use napi_consume_skb API to get bulk free operations"
 http://thread.gmane.org/gmane.linux.network/402503/focus=403386

Patchset based on net-next at commit 3ebeac1d0295

V4: more nitpicks from Sergei
V3: spelling fixes from Sergei

---

Jesper Dangaard Brouer (3):
      net: adjust napi_consume_skb to handle non-NAPI callers
      mlx4: use napi_consume_skb API to get bulk free operations
      mlx5: use napi_consume_skb API to get bulk free operations


 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |   15 +++++++++------
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 net/core/skbuff.c                                 |    4 ++--
 5 files changed, 15 insertions(+), 12 deletions(-)


* [net-next PATCH V4 1/3] net: adjust napi_consume_skb to handle non-NAPI callers
  2016-03-11  8:43                                 ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
@ 2016-03-11  8:43                                   ` Jesper Dangaard Brouer
  2016-03-11  8:44                                   ` [net-next PATCH V4 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-11  8:43 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

Some drivers reuse/share code paths that free SKBs between NAPI
and non-NAPI calls. Adjust napi_consume_skb to handle this
use-case.

Before, calls from netpoll (w/ IRQs disabled) were handled and
indicated with a budget of zero.  Use the same zero indication
to handle calls not originating from NAPI/softirq; these are
simply handled by using dev_consume_skb_any().

This adds an extra branch+call for the netpoll case (checking
in_irq() + irqs_disabled()), but that is okay as this is a slowpath.

Suggested-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/skbuff.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7af7ec635d90..cf63a6000405 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
 	if (unlikely(!skb))
 		return;
 
-	/* if budget is 0 assume netpoll w/ IRQs disabled */
+	/* Zero budget indicate non-NAPI context called us, like netpoll */
 	if (unlikely(!budget)) {
-		dev_consume_skb_irq(skb);
+		dev_consume_skb_any(skb);
 		return;
 	}
 


* [net-next PATCH V4 2/3] mlx4: use napi_consume_skb API to get bulk free operations
  2016-03-11  8:43                                 ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  2016-03-11  8:43                                   ` [net-next PATCH V4 1/3] net: adjust napi_consume_skb to handle non-NAPI callers Jesper Dangaard Brouer
@ 2016-03-11  8:44                                   ` Jesper Dangaard Brouer
  2016-03-11  8:44                                   ` [net-next PATCH V4 3/3] mlx5: " Jesper Dangaard Brouer
  2016-03-14  2:35                                   ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases David Miller
  3 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-11  8:44 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

Bulk free of SKBs happens transparently via the API call napi_consume_skb().
The napi budget parameter is usually needed by napi_consume_skb()
to detect if called from netpoll.  In this patch it has an extra meaning.

For the mlx4 driver, the mlx4_en_stop_port() call is done outside
NAPI/softirq context, and cleans up the entire TX ring via
mlx4_en_free_tx_buf().  The code in mlx4_en_free_tx_desc() for
freeing SKBs is shared with the NAPI calls.

To handle this shared use, the zero budget indication is reused,
and handled appropriately in napi_consume_skb(). To reflect this,
the variable is called napi_mode for the function call that needed
this distinction.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index e0946ab22010..c0d7b7296236 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -276,7 +276,8 @@ static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
 
 static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 				struct mlx4_en_tx_ring *ring,
-				int index, u8 owner, u64 timestamp)
+				int index, u8 owner, u64 timestamp,
+				int napi_mode)
 {
 	struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
 	struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
@@ -347,7 +348,8 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 			}
 		}
 	}
-	dev_consume_skb_any(skb);
+	napi_consume_skb(skb, napi_mode);
+
 	return tx_info->nr_txbb;
 }
 
@@ -371,7 +373,8 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 	while (ring->cons != ring->prod) {
 		ring->last_nr_txbb = mlx4_en_free_tx_desc(priv, ring,
 						ring->cons & ring->size_mask,
-						!!(ring->cons & ring->size), 0);
+						!!(ring->cons & ring->size), 0,
+						0 /* Non-NAPI caller */);
 		ring->cons += ring->last_nr_txbb;
 		cnt++;
 	}
@@ -385,7 +388,7 @@ int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
 }
 
 static bool mlx4_en_process_tx_cq(struct net_device *dev,
-				 struct mlx4_en_cq *cq)
+				  struct mlx4_en_cq *cq, int napi_budget)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_cq *mcq = &cq->mcq;
@@ -451,7 +454,7 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 			last_nr_txbb = mlx4_en_free_tx_desc(
 					priv, ring, ring_index,
 					!!((ring_cons + txbbs_skipped) &
-					ring->size), timestamp);
+					ring->size), timestamp, napi_budget);
 
 			mlx4_en_stamp_wqe(priv, ring, stamp_index,
 					  !!((ring_cons + txbbs_stamp) &
@@ -511,7 +514,7 @@ int mlx4_en_poll_tx_cq(struct napi_struct *napi, int budget)
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	int clean_complete;
 
-	clean_complete = mlx4_en_process_tx_cq(dev, cq);
+	clean_complete = mlx4_en_process_tx_cq(dev, cq, budget);
 	if (!clean_complete)
 		return budget;
 


* [net-next PATCH V4 3/3] mlx5: use napi_consume_skb API to get bulk free operations
  2016-03-11  8:43                                 ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
  2016-03-11  8:43                                   ` [net-next PATCH V4 1/3] net: adjust napi_consume_skb to handle non-NAPI callers Jesper Dangaard Brouer
  2016-03-11  8:44                                   ` [net-next PATCH V4 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
@ 2016-03-11  8:44                                   ` Jesper Dangaard Brouer
  2016-03-14  2:35                                   ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases David Miller
  3 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-11  8:44 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: sergei.shtylyov, eugenia, Alexander Duyck, alexei.starovoitov,
	saeedm, Jesper Dangaard Brouer, gerlitz.or

Bulk free of SKBs happens transparently via the API call napi_consume_skb().
The NAPI budget parameter is needed by napi_consume_skb() to detect
whether it is called from netpoll.
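The dispatch rule above can be sketched in userspace C. This is a hedged model, not the kernel implementation: fake_skb, fake_napi_consume_skb and flush_bulk_cache are invented names, and the real per-CPU bulk-free cache has more machinery.

```c
#include <assert.h>
#include <stdlib.h>

#define BULK_FREE_MAX 32

struct fake_skb { int id; };

static struct fake_skb *bulk_cache[BULK_FREE_MAX];
static int bulk_count;
static int immediate_frees;

static void flush_bulk_cache(void)
{
	while (bulk_count > 0)
		free(bulk_cache[--bulk_count]);
}

/* Mirrors the rule in the commit message: a budget of 0 means the
 * caller is not in NAPI context (e.g. netpoll), so free immediately;
 * otherwise defer the skb into a small cache for bulk freeing. */
static void fake_napi_consume_skb(struct fake_skb *skb, int budget)
{
	if (budget == 0) {		/* non-NAPI caller, e.g. netpoll */
		free(skb);
		immediate_frees++;
		return;
	}
	bulk_cache[bulk_count++] = skb;	/* defer for bulk free */
	if (bulk_count == BULK_FREE_MAX)
		flush_bulk_cache();
}
```

This is why the driver patches below thread the NAPI budget down into the TX completion path: it is the only signal distinguishing NAPI polling from a netpoll caller.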

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 9c0e80e64b43..a0708782eb78 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -619,7 +619,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 void mlx5e_completion_event(struct mlx5_core_cq *mcq);
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
 int mlx5e_napi_poll(struct napi_struct *napi, int budget);
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq);
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
 int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
 bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq);
 struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c34f4f3e9537..996b13a5042f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -336,7 +336,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
 	return mlx5e_sq_xmit(sq, skb);
 }
 
-bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 {
 	struct mlx5e_sq *sq;
 	u32 dma_fifo_cc;
@@ -412,7 +412,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
 			npkts++;
 			nbytes += wi->num_bytes;
 			sqcc += wi->num_wqebbs;
-			dev_kfree_skb(skb);
+			napi_consume_skb(skb, napi_budget);
 		} while (!last_wqe);
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 4ac8d716dbdd..d244acee63a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -60,7 +60,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 	clear_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
 
 	for (i = 0; i < c->num_tc; i++)
-		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
+		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq, budget);
 
 	work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
 	busy |= work_done == budget;


* Re: [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context
  2016-03-04 13:01 ` [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context Jesper Dangaard Brouer
@ 2016-03-13 14:06   ` Rana Shahout
  2016-03-14  6:55     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 36+ messages in thread
From: Rana Shahout @ 2016-03-13 14:06 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, David S. Miller, eugenia, Alexander Duyck,
	alexei.starovoitov, saeedm, gerlitz.or

On Fri, Mar 4, 2016 at 3:01 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:

>  /* build_skb() is wrapper over __build_skb(), that specifically
>   * takes care of skb->head and skb->pfmemalloc
>   * This means that if @frag_size is not zero, then @data must be backed
> @@ -490,8 +500,8 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
>
>         len += NET_SKB_PAD + NET_IP_ALIGN;
>
> -       if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
> -           (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> +       if (unlikely((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
> +                    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA)))) {

Why unlikely? I know it is better for the common case, where linear
SKBs are most likely << SKB_WITH_OVERHEAD(PAGE_SIZE), but what about
the case of hardware LRO, where the linear SKB is likely to be >>
SKB_WITH_OVERHEAD(PAGE_SIZE)?


>                 skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
>                 if (!skb)
>                         goto skb_fail;
> @@ -508,11 +518,20 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
>         if (unlikely(!data))
>                 return NULL;
>
> -       skb = __build_skb(data, len);
> -       if (unlikely(!skb)) {
> +#define BULK_ALLOC_SIZE 8
> +       if (!nc->skb_count) {
> +               nc->skb_count = kmem_cache_alloc_bulk(skbuff_head_cache,
> +                                                     gfp_mask, BULK_ALLOC_SIZE,
> +                                                     nc->skb_cache);
> +       }
> +       if (likely(nc->skb_count)) {
> +               skb = (struct sk_buff *)nc->skb_cache[--nc->skb_count];
> +       } else {
> +               /* alloc bulk failed */
>                 skb_free_frag(data);
>                 return NULL;
>         }
> +       skb = ___build_skb(data, len, skb);
>
>         /* use OR instead of assignment to avoid clearing of bits in mask */
>         if (nc->page.pfmemalloc)
>
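The refill-then-pop pattern in the quoted hunk can be sketched in userspace. Hedged: fake_bulk_alloc stands in for kmem_cache_alloc_bulk(), which likewise may return fewer objects than requested, and the per-CPU napi_alloc_cache is reduced to two globals here.

```c
#include <assert.h>
#include <stdlib.h>

#define BULK_ALLOC_SIZE 8

struct fake_skb { int id; };

static void *skb_cache[BULK_ALLOC_SIZE];
static unsigned int skb_count;

/* Stand-in for kmem_cache_alloc_bulk(): fills 'cache' with up to 'n'
 * objects and returns how many were actually allocated (may be < n). */
static unsigned int fake_bulk_alloc(void **cache, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		cache[i] = malloc(sizeof(struct fake_skb));
		if (!cache[i])
			break;
	}
	return i;
}

/* Mirrors the quoted hunk: refill the cache only when it is empty,
 * then pop one object; NULL signals that the bulk alloc failed. */
static struct fake_skb *fake_napi_alloc(void)
{
	if (!skb_count)
		skb_count = fake_bulk_alloc(skb_cache, BULK_ALLOC_SIZE);
	if (!skb_count)
		return NULL;	/* alloc bulk failed */
	return skb_cache[--skb_count];
}
```

The point of the design is that the slab slowpath is paid once per BULK_ALLOC_SIZE allocations; the following seven calls are a simple array pop.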


* Re: [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases
  2016-03-11  8:43                                 ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
                                                     ` (2 preceding siblings ...)
  2016-03-11  8:44                                   ` [net-next PATCH V4 3/3] mlx5: " Jesper Dangaard Brouer
@ 2016-03-14  2:35                                   ` David Miller
  3 siblings, 0 replies; 36+ messages in thread
From: David Miller @ 2016-03-14  2:35 UTC (permalink / raw)
  To: brouer
  Cc: netdev, sergei.shtylyov, eugenia, alexander.duyck,
	alexei.starovoitov, saeedm, gerlitz.or

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Fri, 11 Mar 2016 09:43:48 +0100

> I've split out the bulk free adjustments, from the bulk alloc patches,
> as I want the adjustment to napi_consume_skb be in same kernel cycle
> the API was introduced.
> 
> Adjustments based on discussion:
>  Subj: "mlx4: use napi_consume_skb API to get bulk free operations"
>  http://thread.gmane.org/gmane.linux.network/402503/focus=403386
> 
> Patchset based on net-next at commit 3ebeac1d0295
> 
> V4: more nitpicks from Sergei
> V3: spelling fixes from Sergei

Series applied, thanks Jesper.


* Re: [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context
  2016-03-13 14:06   ` Rana Shahout
@ 2016-03-14  6:55     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 36+ messages in thread
From: Jesper Dangaard Brouer @ 2016-03-14  6:55 UTC (permalink / raw)
  To: Rana Shahout
  Cc: netdev, David S. Miller, eugenia, Alexander Duyck,
	alexei.starovoitov, saeedm, gerlitz.or, brouer

On Sun, 13 Mar 2016 16:06:17 +0200
Rana Shahout <rana.shahot@gmail.com> wrote:

> On Fri, Mar 4, 2016 at 3:01 PM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> 
> >  /* build_skb() is wrapper over __build_skb(), that specifically
> >   * takes care of skb->head and skb->pfmemalloc
> >   * This means that if @frag_size is not zero, then @data must be backed
> > @@ -490,8 +500,8 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> >
> >         len += NET_SKB_PAD + NET_IP_ALIGN;
> >
> > -       if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
> > -           (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> > +       if (unlikely((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
> > +                    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA)))) {  
> 
> > Why unlikely? I know it is better for the common case, where linear
> > SKBs are most likely << SKB_WITH_OVERHEAD(PAGE_SIZE), but what about
> > the case of hardware LRO, where the linear SKB is likely to be >>
> > SKB_WITH_OVERHEAD(PAGE_SIZE)?

You said it yourself: this is better for the common case.  With
unlikely() I'm asking the compiler to lay out the code for the common
case, which helps the CPU instruction-cache prefetcher.
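As a hedged illustration of that answer: in the kernel, unlikely() boils down to GCC's __builtin_expect(), a pure layout/branch-prediction hint that never changes the computed result. The helper below is a toy reduction of the __napi_alloc_skb() check, not the kernel code.

```c
#include <assert.h>

/* Simplified kernel-style macros: tell the compiler which branch is
 * the common case, so the hot path can be laid out linearly. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Toy version of the check quoted above: take the slow path only for
 * oversized allocations.  The hint affects code layout, not the
 * value returned. */
static int needs_slow_path(unsigned int len, unsigned int limit)
{
	if (unlikely(len > limit))
		return 1;	/* rare: fall back to __alloc_skb() */
	return 0;		/* common: page-fragment fast path */
}
```

For the LRO case Rana raises, the branch would simply be mispredicted more often; correctness is unaffected, only the cost of the taken branch.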

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


end of thread, other threads:[~2016-03-14  6:55 UTC | newest]

Thread overview: 36+ messages
2016-03-04 13:01 [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Jesper Dangaard Brouer
2016-03-04 13:01 ` [net-next PATCH 1/7] mlx5: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
2016-03-04 13:01 ` [net-next PATCH 2/7] mlx4: " Jesper Dangaard Brouer
2016-03-08 19:24   ` David Miller
2016-03-09 11:00     ` Jesper Dangaard Brouer
2016-03-09 16:47       ` Alexander Duyck
2016-03-09 21:03         ` David Miller
2016-03-09 21:36           ` Jesper Dangaard Brouer
2016-03-09 21:43             ` Alexander Duyck
2016-03-09 21:47               ` Jesper Dangaard Brouer
2016-03-09 22:07                 ` Alexander Duyck
2016-03-10 12:15                   ` [net-next PATCH V2 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
2016-03-10 12:15                     ` [net-next PATCH V2 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
2016-03-10 12:15                     ` [net-next PATCH V2 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
2016-03-10 13:59                       ` Sergei Shtylyov
2016-03-10 14:59                         ` [net-next PATCH V3 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
2016-03-10 14:59                           ` [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers Jesper Dangaard Brouer
2016-03-10 17:21                             ` Sergei Shtylyov
2016-03-11  7:45                               ` Jesper Dangaard Brouer
2016-03-11  8:43                                 ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases Jesper Dangaard Brouer
2016-03-11  8:43                                   ` [net-next PATCH V4 1/3] net: adjust napi_consume_skb to handle non-NAPI callers Jesper Dangaard Brouer
2016-03-11  8:44                                   ` [net-next PATCH V4 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
2016-03-11  8:44                                   ` [net-next PATCH V4 3/3] mlx5: " Jesper Dangaard Brouer
2016-03-14  2:35                                   ` [net-next PATCH V4 0/3] net: bulk free adjustment and two driver use-cases David Miller
2016-03-10 14:59                           ` [net-next PATCH V3 2/3] mlx4: use napi_consume_skb API to get bulk free operations Jesper Dangaard Brouer
2016-03-10 14:59                           ` [net-next PATCH V3 3/3] mlx5: " Jesper Dangaard Brouer
2016-03-10 12:15                     ` [net-next PATCH V2 " Jesper Dangaard Brouer
2016-03-04 13:01 ` [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context Jesper Dangaard Brouer
2016-03-13 14:06   ` Rana Shahout
2016-03-14  6:55     ` Jesper Dangaard Brouer
2016-03-04 13:01 ` [net-next PATCH 4/7] mlx5: use napi_alloc_skb API to get SKB bulk allocations Jesper Dangaard Brouer
2016-03-04 13:02 ` [net-next PATCH 5/7] mlx4: " Jesper Dangaard Brouer
2016-03-04 13:02 ` [net-next PATCH 6/7] net: introduce napi_alloc_skb_hint() for more use-cases Jesper Dangaard Brouer
2016-03-04 13:02 ` [net-next PATCH 7/7] mlx5: hint the NAPI alloc skb API about the expected bulk size Jesper Dangaard Brouer
2016-03-04 16:36 ` [net-next PATCH 0/7] net: bulk alloc side and more bulk free drivers Alexei Starovoitov
2016-03-04 19:15   ` Jesper Dangaard Brouer
