netdev.vger.kernel.org archive mirror
From: Magnus Karlsson <magnus.karlsson@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: "Jesper Dangaard Brouer" <brouer@redhat.com>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	ast@kernel.org, "Network Development" <netdev@vger.kernel.org>,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	"Björn Töpel" <bjorn.topel@gmail.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>,
	xiaolong.ye@intel.com,
	"xdp-newbies@vger.kernel.org" <xdp-newbies@vger.kernel.org>
Subject: Re: [PATCH bpf-next v4 0/2] libbpf: adding AF_XDP support
Date: Mon, 18 Feb 2019 11:09:36 +0100
Message-ID: <CAJ8uoz1XUE_8vjO8foa16NN1GnMp8ME8VfKxv_ovdNaHGibLuA@mail.gmail.com>
In-Reply-To: <5ed22245-fe6b-14a9-9c93-f039828a02b6@iogearbox.net>

On Mon, Feb 18, 2019 at 10:38 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 02/18/2019 09:20 AM, Magnus Karlsson wrote:
> > On Fri, Feb 15, 2019 at 5:48 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>
> >> On 02/13/2019 12:55 PM, Jesper Dangaard Brouer wrote:
> >>> On Wed, 13 Feb 2019 12:32:47 +0100
> >>> Magnus Karlsson <magnus.karlsson@gmail.com> wrote:
> >>>> On Mon, Feb 11, 2019 at 9:44 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> >>>>> On 8 Feb 2019, at 5:05, Magnus Karlsson wrote:
> >>>>>
> >>>>>> This patch proposes to add AF_XDP support to libbpf. The main reason
> >>>>>> for this is to facilitate writing applications that use AF_XDP by
> >>>>>> offering higher-level APIs that hide many of the details of the AF_XDP
> >>>>>> uapi. This is in the same vein as libbpf facilitates XDP adoption by
> >>>>>> offering easy-to-use higher level interfaces of XDP
> >>>>>> functionality. Hopefully this will facilitate adoption of AF_XDP, make
> >>>>>> applications using it simpler and smaller, and finally also make it
> >>>>>> possible for applications to benefit from optimizations in the AF_XDP
> >>>>>> user space access code. Previously, people just copied and pasted the
> >>>>>> code from the sample application into their application, which is not
> >>>>>> desirable.
> >>>>>
> >>>>> I like the idea of encapsulating the boilerplate logic in a library.
> >>>>>
> >>>>> I do think there is an important missing piece though - there should be
> >>>>> some code which queries the netdev for how many queues are attached, and
> >>>>> create the appropriate number of umem/AF_XDP sockets.
> >>>>>
> >>>>> I ran into this issue when testing the current AF_XDP code - on my test
> >>>>> boxes, the mlx5 card has 55 channels (aka queues), so when the test program
> >>>>> binds only to channel 0, nothing works as expected, since not all traffic
> >>>>> is being intercepted.  While obvious in hindsight, this took a while to
> >>>>> track down.
> >>>>
> >>>> Yes, agreed. You are not the first one to stumble upon this problem
> >>>> :-). Let me think a little bit on how to solve this in a good way. We
> >>>> need this to be simple and intuitive, as you say.
> >>>
> >>> I see people hitting this with AF_XDP all the time... I had some
> >>> backup-slides[2] in our FOSDEM presentation[1] that describe the issue,
> >>> give the performance reason why and propose a workaround.
> >>
> >> Magnus, I presume you're going to address this for the initial libbpf merge
> >> since the plan is to make it easier to consume for users?
> >
> > I think the first thing we need is education and documentation. Have a
> > FAQ or "common mistakes" section in the Documentation. And of course,
> > sending Jesper around the world reminding people about this ;-).
> >
> > To address this on a libbpf interface level, I think the best way is
> > to reprogram the NIC to send all traffic to the queue that you
> > provided in the xsk_socket__create call. This "set up NIC routing"
> > behavior can then be disabled with a flag, just as the XDP program
> > loading can be disabled. The standard config of xsk_socket__create
> > will then set up as many things for the user as possible just to get
> > up and running quickly. More advanced users can then disable parts of
> > it to gain more flexibility. Does this sound OK? Do not want to go the
> > route of polling multiple sockets and aggregating the traffic as this
> > will have significant negative performance implications.
>
> I think that is fine, I would probably make this one a dedicated API call
> in order to have some more flexibility than just simple flag. E.g. once
> nfp AF_XDP support lands at some point, I could imagine that this call
> resp. a drop-in replacement API call for more advanced steering could
> also take an offloaded BPF prog fd, for example, which then would program
> the steering on the NIC [0]. Seems at least there's enough complexity on
> its own to have a dedicated API for it. Thoughts?

I agree that there is probably enough complexity to warrant adding a
higher level API to deal with this problem (flow steering). But there
are likely a number of cases we have not thought of that would
complicate it even further. This is why I suggest that this
functionality should be in its own patch set that I can devote some
time and thought to. IMO, the current patch set already lowers the
bar to entry significantly and has value even without hiding or
controlling the steering of traffic. What I would like to do in this
patch set is to add a FAQ section in
Documentation/networking/af_xdp.rst explaining this problem. Something
like: "Q: Why am I not seeing any traffic? A: Check these four
things...". I could also add some text in the libbpf README referring
to this document. Opinions?

Thanks: Magnus

> Thanks,
> Daniel
>
>   [0] https://patchwork.ozlabs.org/cover/910614/
