netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Lobakin <alobakin@pm.me>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "Alexander Lobakin" <alobakin@pm.me>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"Eric Dumazet" <edumazet@google.com>,
	"Dmitry Vyukov" <dvyukov@google.com>,
	"Willem de Bruijn" <willemb@google.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Kevin Hao" <haokexin@gmail.com>,
	"Pablo Neira Ayuso" <pablo@netfilter.org>,
	"Jakub Sitnicki" <jakub@cloudflare.com>,
	"Marco Elver" <elver@google.com>,
	"Dexuan Cui" <decui@microsoft.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andriin@fb.com>,
	"Taehee Yoo" <ap420073@gmail.com>,
	"Cong Wang" <xiyou.wangcong@gmail.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Miaohe Lin" <linmiaohe@huawei.com>,
	"Guillaume Nault" <gnault@redhat.com>,
	"Yonghong Song" <yhs@fb.com>, zhudi <zhudi21@huawei.com>,
	"Michal Kubecek" <mkubecek@suse.cz>,
	"Marcelo Ricardo Leitner" <marcelo.leitner@gmail.com>,
	"Dmitry Safonov" <0x7f454c46@gmail.com>,
	"Yang Yingliang" <yangyingliang@huawei.com>,
	"Florian Westphal" <fw@strlen.de>,
	"Edward Cree" <ecree.xilinx@gmail.com>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH v4 net-next 08/11] skbuff: introduce {,__}napi_build_skb() which reuses NAPI cache heads
Date: Thu, 11 Feb 2021 14:38:34 +0000	[thread overview]
Message-ID: <20210211143818.2078-1-alobakin@pm.me> (raw)
In-Reply-To: <20210211135459.075d954b@carbon>

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Thu, 11 Feb 2021 13:54:59 +0100

> On Wed, 10 Feb 2021 16:30:23 +0000
> Alexander Lobakin <alobakin@pm.me> wrote:
> 
> > Instead of just bulk-flushing skbuff_heads queued up through
> > napi_consume_skb() or __kfree_skb_defer(), try to reuse them
> > on allocation path.
> 
> Maybe you are already aware of this dynamics, but high speed NICs will
> usually run the TX "cleanup" (opportunistic DMA-completion) in the napi
> poll function call, and often before processing RX packets. Like
> ixgbe_poll[1] calls ixgbe_clean_tx_irq() before ixgbe_clean_rx_irq().

Sure. 1G MIPS is my home project (I'll likely migrate to ARM64 cluster
in 2-3 months). I mostly work with 10-100G NICs at work.

> If traffic is symmetric (or is routed-back same interface) then this
> SKB recycle scheme will be highly efficient. (I had this part of my
> initial patchset and tested it on ixgbe).
> 
> [1] https://elixir.bootlin.com/linux/v5.11-rc7/source/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c#L3149

That's exactly why I introduced this feature. Firstly driver enriches
the cache with the consumed skbs from Tx completion queue, and then
it just decaches them back on Rx completion cycle. That's how things
worked most of the time on my test setup.

The reason why Paolo proposed this as an option, and why I agreed
it's safer to do instead of unconditional switching, is that
different platforms and setup may react differently on this.
We don't have an ability to test the entire zoo, so we propose
an option for driver and network core developers to test and use
"on demand".
As I wrote in reply to Paolo, there might be cases when even the
core networking code may benefit from this.

> > If the cache is empty on allocation, bulk-allocate the first
> > 16 elements, which is more efficient than per-skb allocation.
> > If the cache is full on freeing, bulk-wipe the second half of
> > the cache (32 elements).
> > This also includes custom KASAN poisoning/unpoisoning to be
> > double sure there are no use-after-free cases.
> > 
> > To not change current behaviour, introduce a new function,
> > napi_build_skb(), to optionally use a new approach later
> > in drivers.
> > 
> > Note on selected bulk size, 16:
> >  - this equals to XDP_BULK_QUEUE_SIZE, DEV_MAP_BULK_SIZE
> >    and especially VETH_XDP_BATCH, which is also used to
> >    bulk-allocate skbuff_heads and was tested on powerful
> >    setups;
> >  - this also showed the best performance in the actual
> >    test series (from the array of {8, 16, 32}).
> > 
> > Suggested-by: Edward Cree <ecree.xilinx@gmail.com> # Divide on two halves
> > Suggested-by: Eric Dumazet <edumazet@google.com>   # KASAN poisoning
> > Cc: Dmitry Vyukov <dvyukov@google.com>             # Help with KASAN
> > Cc: Paolo Abeni <pabeni@redhat.com>                # Reduced batch size
> > Signed-off-by: Alexander Lobakin <alobakin@pm.me>
> > ---
> >  include/linux/skbuff.h |  2 +
> >  net/core/skbuff.c      | 94 ++++++++++++++++++++++++++++++++++++------
> >  2 files changed, 83 insertions(+), 13 deletions(-)
> > 
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 0e0707296098..906122eac82a 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -1087,6 +1087,8 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size);
> >  struct sk_buff *build_skb_around(struct sk_buff *skb,
> >  				 void *data, unsigned int frag_size);
> >  
> > +struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
> > +
> >  /**
> >   * alloc_skb - allocate a network buffer
> >   * @size: size to allocate
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 860a9d4f752f..9e1a8ded4acc 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -120,6 +120,8 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr)
> >  }
> >  
> >  #define NAPI_SKB_CACHE_SIZE	64
> > +#define NAPI_SKB_CACHE_BULK	16
> > +#define NAPI_SKB_CACHE_HALF	(NAPI_SKB_CACHE_SIZE / 2)
> >  
> 
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

Thanks,
Al


  reply	other threads:[~2021-02-11 14:48 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-10 16:28 [PATCH v4 net-next 00/11] skbuff: introduce skbuff_heads bulking and reusing Alexander Lobakin
2021-02-10 16:28 ` [PATCH v4 net-next 01/11] skbuff: move __alloc_skb() next to the other skb allocation functions Alexander Lobakin
2021-02-10 16:28 ` [PATCH v4 net-next 02/11] skbuff: simplify kmalloc_reserve() Alexander Lobakin
2021-02-10 16:29 ` [PATCH v4 net-next 03/11] skbuff: make __build_skb_around() return void Alexander Lobakin
2021-02-10 16:29 ` [PATCH v4 net-next 04/11] skbuff: simplify __alloc_skb() a bit Alexander Lobakin
2021-02-10 16:29 ` [PATCH v4 net-next 05/11] skbuff: use __build_skb_around() in __alloc_skb() Alexander Lobakin
2021-02-10 16:29 ` [PATCH v4 net-next 06/11] skbuff: remove __kfree_skb_flush() Alexander Lobakin
2021-02-10 20:22   ` kernel test robot
2021-02-11  0:19   ` kernel test robot
2021-02-10 16:30 ` [PATCH v4 net-next 07/11] skbuff: move NAPI cache declarations upper in the file Alexander Lobakin
2021-02-10 16:30 ` [PATCH v4 net-next 08/11] skbuff: introduce {,__}napi_build_skb() which reuses NAPI cache heads Alexander Lobakin
2021-02-11 12:54   ` Jesper Dangaard Brouer
2021-02-11 14:38     ` Alexander Lobakin [this message]
2021-02-10 16:30 ` [PATCH v4 net-next 09/11] skbuff: allow to optionally use NAPI cache from __alloc_skb() Alexander Lobakin
2021-02-11 10:16   ` Paolo Abeni
2021-02-11 14:28     ` Alexander Lobakin
2021-02-11 14:55       ` Paolo Abeni
2021-02-11 15:26         ` Alexander Lobakin
2021-02-10 16:30 ` [PATCH v4 net-next 10/11] skbuff: allow to use NAPI cache from __napi_alloc_skb() Alexander Lobakin
2021-02-10 16:31 ` [PATCH v4 net-next 11/11] skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing Alexander Lobakin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210211143818.2078-1-alobakin@pm.me \
    --to=alobakin@pm.me \
    --cc=0x7f454c46@gmail.com \
    --cc=andriin@fb.com \
    --cc=ap420073@gmail.com \
    --cc=ast@kernel.org \
    --cc=bjorn@kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=dvyukov@google.com \
    --cc=ecree.xilinx@gmail.com \
    --cc=edumazet@google.com \
    --cc=elver@google.com \
    --cc=fw@strlen.de \
    --cc=gnault@redhat.com \
    --cc=haokexin@gmail.com \
    --cc=jakub@cloudflare.com \
    --cc=jonathan.lemon@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcelo.leitner@gmail.com \
    --cc=mkubecek@suse.cz \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    --cc=rdunlap@infradead.org \
    --cc=willemb@google.com \
    --cc=xiyou.wangcong@gmail.com \
    --cc=yangyingliang@huawei.com \
    --cc=yhs@fb.com \
    --cc=zhudi21@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).