All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: netdev@vger.kernel.org
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	linux-mm@kvack.org, Jesper Dangaard Brouer <brouer@redhat.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>
Subject: [PATCH 4/4] net: bulk alloc and reuse of SKBs in NAPI context
Date: Fri, 23 Oct 2015 14:46:22 +0200	[thread overview]
Message-ID: <20151023124621.17364.15109.stgit@firesoul> (raw)
In-Reply-To: <20151023124451.17364.14594.stgit@firesoul>

Think twice before applying
 - This patch can potentially introduce added latency in some workloads

This patch introduce bulk alloc of SKBs and allow reuse of SKBs
free'ed in same softirq cycle.  SKBs are normally free'ed during TX
completion, but most high speed drivers also cleanup TX ring during
NAPI RX poll cycle.  Thus, if using napi_consume_skb/__kfree_skb_defer,
SKBs will be avail in the napi_alloc_cache->skb_cache.

If no SKBs are avail for reuse, then only bulk alloc 8 SKBs, to limit
the potential overshooting unused SKBs needed to free'ed when NAPI
cycle ends (flushed in net_rx_action via __kfree_skb_flush()).

Benchmarking IPv4-forwarding, on CPU i7-4790K @4.2GHz (no turbo boost)
(GCC version 5.1.1 20150618 (Red Hat 5.1.1-4))
 Allocator SLUB:
  Single CPU/flow numbers: before: 2064446 pps -> after: 2083031 pps
  Improvement: +18585 pps, -4.3 nanosec, +0.9%
 Allocator SLAB:
  Single CPU/flow numbers: before: 2035949 pps -> after: 2033567 pps
  Regression: -2382 pps, +0.57 nanosec, -0.1 %

Even-though benchmarking does show an improvement for SLUB(+0.9%), I'm
not convinced bulk alloc will be a win in all situations:
 * I see stalls on walking the SLUB freelist (normal hidden by prefetch)
 * In case RX queue is not full, alloc and free more SKBs than needed

More testing is needed with more real life benchmarks.

Joint work with Alexander Duyck.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 net/core/skbuff.c |   35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2ffcb014e00b..d0e0ccccfb11 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -483,13 +483,14 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 				 gfp_t gfp_mask)
 {
 	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+	struct skb_shared_info *shinfo;
 	struct sk_buff *skb;
 	void *data;
 
 	len += NET_SKB_PAD + NET_IP_ALIGN;
 
-	if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
-	    (gfp_mask & (__GFP_WAIT | GFP_DMA))) {
+	if (unlikely((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
+		     (gfp_mask & (__GFP_WAIT | GFP_DMA)))) {
 		skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
 		if (!skb)
 			goto skb_fail;
@@ -506,12 +507,38 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 	if (unlikely(!data))
 		return NULL;
 
-	skb = __build_skb(data, len);
-	if (unlikely(!skb)) {
+#define BULK_ALLOC_SIZE 8
+	if (!nc->skb_count &&
+	    kmem_cache_alloc_bulk(skbuff_head_cache, gfp_mask,
+				  BULK_ALLOC_SIZE, nc->skb_cache)) {
+		nc->skb_count = BULK_ALLOC_SIZE;
+	}
+	if (likely(nc->skb_count)) {
+		skb = (struct sk_buff *)nc->skb_cache[--nc->skb_count];
+	} else {
+		/* alloc bulk failed */
 		skb_free_frag(data);
 		return NULL;
 	}
 
+	len -= SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
+	memset(skb, 0, offsetof(struct sk_buff, tail));
+	skb->truesize = SKB_TRUESIZE(len);
+	atomic_set(&skb->users, 1);
+	skb->head = data;
+	skb->data = data;
+	skb_reset_tail_pointer(skb);
+	skb->end = skb->tail + len;
+	skb->mac_header = (typeof(skb->mac_header))~0U;
+	skb->transport_header = (typeof(skb->transport_header))~0U;
+
+	/* make sure we initialize shinfo sequentially */
+	shinfo = skb_shinfo(skb);
+	memset(shinfo, 0, offsetof(struct skb_shared_info, dataref));
+	atomic_set(&shinfo->dataref, 1);
+	kmemcheck_annotate_variable(shinfo->destructor_arg);
+
 	/* use OR instead of assignment to avoid clearing of bits in mask */
 	if (nc->page.pfmemalloc)
 		skb->pfmemalloc = 1;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2015-10-23 12:46 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-23 12:46 [PATCH 0/4] net: mitigating kmem_cache slowpath for network stack in NAPI context Jesper Dangaard Brouer
2015-10-23 12:46 ` Jesper Dangaard Brouer
2015-10-23 12:46 ` [PATCH 1/4] net: bulk free infrastructure for NAPI context, use napi_consume_skb Jesper Dangaard Brouer
2015-10-23 12:46   ` Jesper Dangaard Brouer
2015-10-23 12:46 ` [PATCH 2/4] net: bulk free SKBs that were delay free'ed due to IRQ context Jesper Dangaard Brouer
2015-10-23 12:46 ` [PATCH 3/4] ixgbe: bulk free SKBs during TX completion cleanup cycle Jesper Dangaard Brouer
2015-10-23 12:46   ` Jesper Dangaard Brouer
2015-10-23 12:46 ` Jesper Dangaard Brouer [this message]
2015-10-27  1:09 ` [PATCH 0/4] net: mitigating kmem_cache slowpath for network stack in NAPI context David Miller
2016-02-02 21:11 ` [net-next PATCH 00/11] net: mitigating kmem_cache slowpath and BoF discussion patches Jesper Dangaard Brouer
2016-02-02 21:11   ` [net-next PATCH 01/11] net: bulk free infrastructure for NAPI context, use napi_consume_skb Jesper Dangaard Brouer
2016-02-02 21:11   ` [net-next PATCH 02/11] net: bulk free SKBs that were delay free'ed due to IRQ context Jesper Dangaard Brouer
2016-02-02 21:11   ` [net-next PATCH 03/11] ixgbe: bulk free SKBs during TX completion cleanup cycle Jesper Dangaard Brouer
2016-02-02 21:12   ` [net-next PATCH 04/11] net: bulk alloc and reuse of SKBs in NAPI context Jesper Dangaard Brouer
2016-02-03  0:52     ` Alexei Starovoitov
2016-02-03 10:38       ` Jesper Dangaard Brouer
2016-02-02 21:12   ` [net-next PATCH 05/11] mlx5: use napi_*_skb APIs to get bulk alloc and free Jesper Dangaard Brouer
2016-02-02 21:13   ` [net-next PATCH 06/11] RFC: mlx5: RX bulking or bundling of packets before calling network stack Jesper Dangaard Brouer
2016-02-09 11:57     ` Saeed Mahameed
2016-02-10 20:26       ` Jesper Dangaard Brouer
2016-02-16  0:01         ` Saeed Mahameed
2016-02-02 21:13   ` [net-next PATCH 07/11] net: introduce napi_alloc_skb_hint() for more use-cases Jesper Dangaard Brouer
2016-02-02 22:29     ` kbuild test robot
2016-02-02 21:14   ` [net-next PATCH 08/11] mlx5: hint the NAPI alloc skb API about the expected bulk size Jesper Dangaard Brouer
2016-02-02 21:14   ` [net-next PATCH 09/11] RFC: dummy: bulk free SKBs Jesper Dangaard Brouer
2016-02-02 21:15   ` [net-next PATCH 10/11] RFC: net: API for RX handover of multiple SKBs to stack Jesper Dangaard Brouer
2016-02-02 21:15   ` [net-next PATCH 11/11] RFC: net: RPS bulk enqueue to backlog Jesper Dangaard Brouer
2016-02-07 19:25   ` [net-next PATCH 00/11] net: mitigating kmem_cache slowpath and BoF discussion patches David Miller
2016-02-08 12:14     ` [net-next PATCH V2 0/3] net: mitigating kmem_cache free slowpath Jesper Dangaard Brouer
2016-02-08 12:14       ` Jesper Dangaard Brouer
2016-02-08 12:14       ` [net-next PATCH V2 1/3] net: bulk free infrastructure for NAPI context, use napi_consume_skb Jesper Dangaard Brouer
2016-02-08 12:15       ` [net-next PATCH V2 2/3] net: bulk free SKBs that were delay free'ed due to IRQ context Jesper Dangaard Brouer
2016-02-08 12:15       ` [net-next PATCH V2 3/3] ixgbe: bulk free SKBs during TX completion cleanup cycle Jesper Dangaard Brouer
2016-02-11 16:59       ` [net-next PATCH V2 0/3] net: mitigating kmem_cache free slowpath David Miller
2016-02-13 11:12       ` Tilman Schmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151023124621.17364.15109.stgit@firesoul \
    --to=brouer@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.duyck@gmail.com \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.