netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Magnus Karlsson <magnus.karlsson@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "Björn Töpel" <bjorn.topel@intel.com>,
	"Björn Töpel" <bjorn.topel@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	ast@kernel.org, "Daniel Borkmann" <daniel@iogearbox.net>,
	"Network Development" <netdev@vger.kernel.org>,
	"William Tu" <u9012063@gmail.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	andrew@lunn.ch
Subject: Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
Date: Mon, 17 Dec 2018 15:55:02 +0100	[thread overview]
Message-ID: <CAJ8uoz1PVbEuBsWYi7xe+zmbkHfRi=Rp115pRfa5bnyq+j3Utg@mail.gmail.com> (raw)
In-Reply-To: <20181217151017.17d708a6@redhat.com>

On Mon, Dec 17, 2018 at 3:10 PM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Mon, 17 Dec 2018 14:39:57 +0100
> Björn Töpel <bjorn.topel@intel.com> wrote:
>
> > On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
> > > On Mon, 17 Dec 2018 11:24:54 +0100
> > > Björn Töpel <bjorn.topel@gmail.com> wrote:
> > >
> > >> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
> > >> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
> > >> helper is used. The bpf_xsk_redirect helper is also introduced in this
> > >> series.
> > >>
> > >> Many XDP socket users just need a simple way of creating/binding a
> > >> socket and receiving frames right away without a complicated XDP
> > >> program. "Attached" XDP sockets removes the need for the XSKMAP, and
> > >> allows for a trivial XDP program, e.g.:
> > >>
> > >>    SEC("xdp")
> > >>    int xdp_prog(struct xdp_md *ctx)
> > >>    {
> > >>          return bpf_xsk_redirect(ctx);
> > >>    }
> > >>
> > >> An attached XDP socket also has better performance than the XSKMAP
> > >> based sockets (performance numbers below).
> > >
> > > I still have a general problem with this approach.
> > >
> >
> > And I appreciate that you have comments on the design/code! :-) Thank you!
> >
> >
> > > The AF_XDP socket is build around (and gets its performance) from
> > > being tied to a specific RX-queue.  That design begs to have an XDP
> > > program per RX-queue.
> > >
> > > Patchset-v1 moved towards this goal.  But in this patchset-v2 you
> > > steer away from this again, and work-around the issue with the current
> > > limitations of 1-XDP program per netdev.  (Which result in; if a
> > > single AF_XDP socket is used in the system, which can ONLY be for a
> > > single RX-queue by design, then ALL other XDP_PASS traffic also have
> > > to take the overhead of indirect BPF call).
> > >
> >
> > I agree that a per-queue-program would be a good fit, but I think it
> > still makes sense to add XDP_ATTACH and the helper, to make it easier
> > for the XDP program authors to use AF_XDP. Would you prefer that this
> > functionality was help back, until per-queue programs are introduced?
>
> Yes, for the reasons you yourself listed in next section:
>
> > OTOH the implementation would probably look different if there was a
> > per-q program, because this would open up for more optimizations. One
> > could even imagine that the socket fd was part of the program (part of
> > relocation process) when loading the program. That could get rid of yet
> > another if-statement that check for socket existence. :-)
>
> Yes, exactly.  The implementation would probably look different, when
> we look at it from a more generic angle, with per-q programs.
>
> > > IMHO with this use-case, now is the time to introduce XDP programs per
> > > RX-queue.  Yes, it will be more work, but I will offer to helpout.
> > > This should be generalized as XDP programs per RX-queue can be used by
> > > other use-cases too:
> > >    In general terms: We can setup a NIC hardware filter to deliver
> > > frame matching some criteria, then we can avoid rechecking these
> > > criterias in on the (RX) CPU when/if we can attach an XDP prog to this
> > > specific RX-queue directly.  This *IS* exactly what AF_XDP does, but it
> > > is in general useful for others, like CPUMAP redirect.
> > >
> >
> > Fair enough, and thank you for offering to help out. And the fact that
> > *other than AF_XDP* can benefit from that is good. Before we dive down
> > this hole, what are the opinions of the BPF maintainers? Maybe it's
> > better to hack an RFC, and then take the discussion?
>
> As XDP grows, and more use-cases are added, then I fear that the single
> XDP program per netdev is going to be a performance bottleneck.  As the
> single XDP program, will have to perform a lot of common checks before
> it knows what use-case this packet match.  With an XDP program per
> RX-queue, we can instead leverage the hardware to pre-filter/sort
> packets, and thus simplify the XDP programs.
>   And this patchset already do shows a performance advantage of
> simplifying the XDP prog and allowing to store info per RX-queue (the
> xsk-sock) that allows you do take a more direct action (avoiding exec
> some of the redirect-core code).

Instead of introducing the XDP_ATTACH option to the bind call, can we
just make this association between rx queue id and xsk every single
time we bind? Then it is up to the user via the XDP program if he/she
wants to use this by calling xsk_redirect. No new option needed.

We could also make the setup of AF_XDP easier by just hiding the
loading and creation of the XSKMAP and the XDP program behind
xsk_socket__create in the patch set I am working on. It could
take a parameter in the configuration struct stating if libbpf
should load a predefined program (with xsk_redirect alternatively
XSKMAP that directs all traffic on a queue to a specific AF_XDP
socket), or if the user desires to set this up manually. The
built-in program would be supplied inside libbpf and would be
very small. We could start with the current XSKMAP and just setup
everything in the same way the sample program does. If people
think xsk_redirect is a good idea, then we could move over to
that, as it has higher performance.

Please let me know what you think.

BTW, I like the per RX queue XDP program idea. Could see a number of
performance optimizations that could be done given that that existed.

Thanks: Magnus


> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

  reply	other threads:[~2018-12-17 14:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock Björn Töpel
2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
2018-12-17 13:39   ` Björn Töpel
2018-12-17 14:10     ` Jesper Dangaard Brouer
2018-12-17 14:55       ` Magnus Karlsson [this message]
2018-12-17 15:30         ` Björn Töpel
2018-12-18 23:04           ` Alexei Starovoitov
2018-12-19  9:34             ` Björn Töpel
2018-12-19 16:23               ` Jesper Dangaard Brouer
2018-12-20  0:48               ` Alexei Starovoitov
2018-12-18  2:36     ` Jakub Kicinski
2018-12-18  8:59       ` Björn Töpel
2018-12-18 18:18         ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJ8uoz1PVbEuBsWYi7xe+zmbkHfRi=Rp115pRfa5bnyq+j3Utg@mail.gmail.com' \
    --to=magnus.karlsson@gmail.com \
    --cc=andrew@lunn.ch \
    --cc=ast@kernel.org \
    --cc=bjorn.topel@gmail.com \
    --cc=bjorn.topel@intel.com \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=qi.z.zhang@intel.com \
    --cc=u9012063@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).