netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Björn Töpel" <bjorn.topel@intel.com>
To: Magnus Karlsson <magnus.karlsson@gmail.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "Björn Töpel" <bjorn.topel@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	ast@kernel.org, "Daniel Borkmann" <daniel@iogearbox.net>,
	"Network Development" <netdev@vger.kernel.org>,
	"William Tu" <u9012063@gmail.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	andrew@lunn.ch
Subject: Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
Date: Mon, 17 Dec 2018 16:30:05 +0100	[thread overview]
Message-ID: <cd952f99-6bad-e0c8-5bcd-f0010218238c@intel.com> (raw)
In-Reply-To: <CAJ8uoz1PVbEuBsWYi7xe+zmbkHfRi=Rp115pRfa5bnyq+j3Utg@mail.gmail.com>



On 2018-12-17 15:55, Magnus Karlsson wrote:
> On Mon, Dec 17, 2018 at 3:10 PM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>>
>> On Mon, 17 Dec 2018 14:39:57 +0100
>> Björn Töpel <bjorn.topel@intel.com> wrote:
>>
>>> On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
>>>> On Mon, 17 Dec 2018 11:24:54 +0100
>>>> Björn Töpel <bjorn.topel@gmail.com> wrote:
>>>>
>>>>> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
>>>>> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
>>>>> helper is used. The bpf_xsk_redirect helper is also introduced in this
>>>>> series.
>>>>>
>>>>> Many XDP socket users just need a simple way of creating/binding a
>>>>> socket and receiving frames right away without a complicated XDP
>>>>> program. "Attached" XDP sockets removes the need for the XSKMAP, and
>>>>> allows for a trivial XDP program, e.g.:
>>>>>
>>>>>     SEC("xdp")
>>>>>     int xdp_prog(struct xdp_md *ctx)
>>>>>     {
>>>>>           return bpf_xsk_redirect(ctx);
>>>>>     }
>>>>>
>>>>> An attached XDP socket also has better performance than the XSKMAP
>>>>> based sockets (performance numbers below).
>>>>
>>>> I still have a general problem with this approach.
>>>>
>>>
>>> And I appreciate that you have comments on the design/code! :-) Thank you!
>>>
>>>
>>>> The AF_XDP socket is build around (and gets its performance) from
>>>> being tied to a specific RX-queue.  That design begs to have an XDP
>>>> program per RX-queue.
>>>>
>>>> Patchset-v1 moved towards this goal.  But in this patchset-v2 you
>>>> steer away from this again, and work-around the issue with the current
>>>> limitations of 1-XDP program per netdev.  (Which result in; if a
>>>> single AF_XDP socket is used in the system, which can ONLY be for a
>>>> single RX-queue by design, then ALL other XDP_PASS traffic also have
>>>> to take the overhead of indirect BPF call).
>>>>
>>>
>>> I agree that a per-queue-program would be a good fit, but I think it
>>> still makes sense to add XDP_ATTACH and the helper, to make it easier
>>> for the XDP program authors to use AF_XDP. Would you prefer that this
>>> functionality was help back, until per-queue programs are introduced?
>>
>> Yes, for the reasons you yourself listed in next section:
>>
>>> OTOH the implementation would probably look different if there was a
>>> per-q program, because this would open up for more optimizations. One
>>> could even imagine that the socket fd was part of the program (part of
>>> relocation process) when loading the program. That could get rid of yet
>>> another if-statement that check for socket existence. :-)
>>
>> Yes, exactly.  The implementation would probably look different, when
>> we look at it from a more generic angle, with per-q programs.
>>
>>>> IMHO with this use-case, now is the time to introduce XDP programs per
>>>> RX-queue.  Yes, it will be more work, but I will offer to helpout.
>>>> This should be generalized as XDP programs per RX-queue can be used by
>>>> other use-cases too:
>>>>     In general terms: We can setup a NIC hardware filter to deliver
>>>> frame matching some criteria, then we can avoid rechecking these
>>>> criterias in on the (RX) CPU when/if we can attach an XDP prog to this
>>>> specific RX-queue directly.  This *IS* exactly what AF_XDP does, but it
>>>> is in general useful for others, like CPUMAP redirect.
>>>>
>>>
>>> Fair enough, and thank you for offering to help out. And the fact that
>>> *other than AF_XDP* can benefit from that is good. Before we dive down
>>> this hole, what are the opinions of the BPF maintainers? Maybe it's
>>> better to hack an RFC, and then take the discussion?
>>
>> As XDP grows, and more use-cases are added, then I fear that the single
>> XDP program per netdev is going to be a performance bottleneck.  As the
>> single XDP program, will have to perform a lot of common checks before
>> it knows what use-case this packet match.  With an XDP program per
>> RX-queue, we can instead leverage the hardware to pre-filter/sort
>> packets, and thus simplify the XDP programs.
>>    And this patchset already do shows a performance advantage of
>> simplifying the XDP prog and allowing to store info per RX-queue (the
>> xsk-sock) that allows you do take a more direct action (avoiding exec
>> some of the redirect-core code).
> 
> Instead of introducing the XDP_ATTACH option to the bind call, can we
> just make this association between rx queue id and xsk every single
> time we bind? Then it is up to the user via the XDP program if he/she
> wants to use this by calling xsk_redirect. No new option needed.
>

Nice! Then it would simply be a matter of adding the helper. Much better
than extending the uapi. Thank you for pointing this out!


> We could also make the setup of AF_XDP easier by just hiding the
> loading and creation of the XSKMAP and the XDP program behind
> xsk_socket__create in the patch set I am working on. It could
> take a parameter in the configuration struct stating if libbpf
> should load a predefined program (with xsk_redirect alternatively
> XSKMAP that directs all traffic on a queue to a specific AF_XDP
> socket), or if the user desires to set this up manually. The
> built-in program would be supplied inside libbpf and would be
> very small. We could start with the current XSKMAP and just setup
> everything in the same way the sample program does. If people
> think xsk_redirect is a good idea, then we could move over to
> that, as it has higher performance.
> 
> Please let me know what you think.
>

Hmm, let's start of with this, making sure AF_XDP is simple to use from
a libbpf perspective, and then (maybe) move towards the
bpf_xsk_redirect, depending on where the per-queue work moves.

Let's drop this series for now, and focus on libbpf and the per-queue 
XDP programs.


> BTW, I like the per RX queue XDP program idea. Could see a number of
> performance optimizations that could be done given that that existed.
>

Yes! It would be really cool do use the socket file descriptor in the
BPF program, so that the bpf_xsk_redirect could use the socket directly
(analogous to bpf_map in redirect map).

Another idea with per-queue programs is that the socket could be 
implicity bound to queue/dev when installing the program to an Rx queue.


Björn

> Thanks: Magnus
> 
> 
>> --
>> Best regards,
>>    Jesper Dangaard Brouer
>>    MSc.CS, Principal Kernel Engineer at Red Hat
>>    LinkedIn: http://www.linkedin.com/in/brouer

  reply	other threads:[~2018-12-17 15:30 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock Björn Töpel
2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
2018-12-17 13:39   ` Björn Töpel
2018-12-17 14:10     ` Jesper Dangaard Brouer
2018-12-17 14:55       ` Magnus Karlsson
2018-12-17 15:30         ` Björn Töpel [this message]
2018-12-18 23:04           ` Alexei Starovoitov
2018-12-19  9:34             ` Björn Töpel
2018-12-19 16:23               ` Jesper Dangaard Brouer
2018-12-20  0:48               ` Alexei Starovoitov
2018-12-18  2:36     ` Jakub Kicinski
2018-12-18  8:59       ` Björn Töpel
2018-12-18 18:18         ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cd952f99-6bad-e0c8-5bcd-f0010218238c@intel.com \
    --to=bjorn.topel@intel.com \
    --cc=andrew@lunn.ch \
    --cc=ast@kernel.org \
    --cc=bjorn.topel@gmail.com \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=magnus.karlsson@gmail.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=qi.z.zhang@intel.com \
    --cc=u9012063@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).