All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Jesper Dangaard Brouer <brouer@redhat.com>, netdev@vger.kernel.org
Cc: jakub.kicinski@netronome.com,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	mchan@broadcom.com, John Fastabend <john.fastabend@gmail.com>,
	peter.waskiewicz.jr@intel.com,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	edumazet@google.com
Subject: Re: [net-next PATCH 0/5] New bpf cpumap type for XDP_REDIRECT
Date: Fri, 29 Sep 2017 00:45:40 +0200	[thread overview]
Message-ID: <59CD7B94.8010103@iogearbox.net> (raw)
In-Reply-To: <150660339205.2808.7084136789768233829.stgit@firesoul>

On 09/28/2017 02:57 PM, Jesper Dangaard Brouer wrote:
> Introducing a new way to redirect XDP frames.  Notice how no driver
> changes are necessary given the design of XDP_REDIRECT.
>
> This redirect map type is called 'cpumap', as it allows redirection
> XDP frames to remote CPUs.  The remote CPU will do the SKB allocation
> and start the network stack invocation on that CPU.
>
> This is a scalability and isolation mechanism, that allow separating
> the early driver network XDP layer, from the rest of the netstack, and
> assigning dedicated CPUs for this stage.  The sysadm control/configure
> the RX-CPU to NIC-RX queue (as usual) via procfs smp_affinity and how
> many queues are configured via ethtool --set-channels.  Benchmarks
> show that a single CPU can handle approx 11Mpps.  Thus, only assigning
> two NIC RX-queues (and two CPUs) is sufficient for handling 10Gbit/s
> wirespeed smallest packet 14.88Mpps.  Reducing the number of queues
> have the advantage that more packets being "bulk" available per hard
> interrupt[1].
>
> [1] https://www.netdevconf.org/2.1/papers/BusyPollingNextGen.pdf
>
> Use-cases:
>
> 1. End-host based pre-filtering for DDoS mitigation.  This is fast
>     enough to allow software to see and filter all packets wirespeed.
>     Thus, no packets getting silently dropped by hardware.
>
> 2. Given NIC HW unevenly distributes packets across RX queue, this
>     mechanism can be used for redistribution load across CPUs.  This
>     usually happens when HW is unaware of a new protocol.  This
>     resembles RPS (Receive Packet Steering), just faster, but with more
>     responsibility placed on the BPF program for correct steering.
>
> 3. Auto-scaling or power saving via only activating the appropriate
>     number of remote CPUs for handling the current load.  The cpumap
>     tracepoints can function as a feedback loop for this purpose.

Interesting work, thanks! Still digesting the code a bit. I think
it pretty much goes into the direction that Eric describes in his
netdev paper quoted above; not on a generic level though but specific
to XDP at least; theoretically XDP could just run transparently on
the CPU doing the filtering, and raw buffers are handed to remote
CPU with similar batching, but it would need some different config
interface at minimum.

Shouldn't we take the CPU(s) running XDP on the RX queues out from
the normal process scheduler, so that we have a guarantee that user
space or unrelated kernel tasks cannot interfere with them anymore,
and we could then turn them into busy polling eventually (e.g. as
long as XDP is running there and once off could put them back into
normal scheduling domain transparently)?

What about RPS/RFS in the sense that once you punt them to remote
CPU, could we reuse application locality information so they'd end
up on the right CPU in the first place (w/o backlog detour), or is
the intent to rather disable it and have some own orchestration
with relation to the CPU map?

Cheers,
Daniel

  parent reply	other threads:[~2017-09-28 22:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-28 12:57 [net-next PATCH 0/5] New bpf cpumap type for XDP_REDIRECT Jesper Dangaard Brouer
2017-09-28 12:57 ` [net-next PATCH 1/5] bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP Jesper Dangaard Brouer
2017-09-29  3:21   ` Alexei Starovoitov
2017-09-29  7:56     ` Hannes Frederic Sowa
2017-09-29  9:37       ` Paolo Abeni
2017-09-29  9:40         ` Hannes Frederic Sowa
2017-09-29  9:14     ` Jesper Dangaard Brouer
2017-09-28 12:57 ` [net-next PATCH 2/5] bpf: XDP_REDIRECT enable use of cpumap Jesper Dangaard Brouer
2017-09-28 12:57 ` [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and allocation Jesper Dangaard Brouer
2017-09-28 23:21   ` Daniel Borkmann
2017-09-29  7:46     ` Jesper Dangaard Brouer
2017-09-29  9:49   ` Jason Wang
2017-09-29 13:05     ` Jesper Dangaard Brouer
2017-09-28 12:57 ` [net-next PATCH 4/5] bpf: cpumap add tracepoints Jesper Dangaard Brouer
2017-09-28 12:57 ` [net-next PATCH 5/5] samples/bpf: add cpumap sample program xdp_redirect_cpu Jesper Dangaard Brouer
2017-09-28 22:45 ` Daniel Borkmann [this message]
2017-09-29  6:53   ` [net-next PATCH 0/5] New bpf cpumap type for XDP_REDIRECT Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59CD7B94.8010103@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andy@greyhouse.net \
    --cc=borkmann@iogearbox.net \
    --cc=brouer@redhat.com \
    --cc=edumazet@google.com \
    --cc=jakub.kicinski@netronome.com \
    --cc=jasowang@redhat.com \
    --cc=john.fastabend@gmail.com \
    --cc=mchan@broadcom.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=peter.waskiewicz.jr@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.