From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?= Subject: Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Date: Mon, 17 Dec 2018 16:30:05 +0100 Message-ID: References: <20181217102501.22019-1-bjorn.topel@gmail.com> <20181217135035.0d75aa32@redhat.com> <0e9c13f2-190a-18c0-efa6-a7fe0096f98f@intel.com> <20181217151017.17d708a6@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Cc: =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?= , "Karlsson, Magnus" , ast@kernel.org, Daniel Borkmann , Network Development , William Tu , "Zhang, Qi Z" , Jakub Kicinski , andrew@lunn.ch To: Magnus Karlsson , Jesper Dangaard Brouer Return-path: Received: from mga12.intel.com ([192.55.52.136]:24194 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727738AbeLQPaL (ORCPT ); Mon, 17 Dec 2018 10:30:11 -0500 In-Reply-To: Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 2018-12-17 15:55, Magnus Karlsson wrote: > On Mon, Dec 17, 2018 at 3:10 PM Jesper Dangaard Brouer > wrote: >> >> On Mon, 17 Dec 2018 14:39:57 +0100 >> Björn Töpel wrote: >> >>> On 2018-12-17 13:50, Jesper Dangaard Brouer wrote: >>>> On Mon, 17 Dec 2018 11:24:54 +0100 >>>> Björn Töpel wrote: >>>> >>>>> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To >>>>> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect >>>>> helper is used. The bpf_xsk_redirect helper is also introduced in this >>>>> series. >>>>> >>>>> Many XDP socket users just need a simple way of creating/binding a >>>>> socket and receiving frames right away without a complicated XDP >>>>> program. "Attached" XDP sockets removes the need for the XSKMAP, and >>>>> allows for a trivial XDP program, e.g.: >>>>> >>>>> SEC("xdp") >>>>> int xdp_prog(struct xdp_md *ctx) >>>>> { >>>>> return bpf_xsk_redirect(ctx); >>>>> } >>>>> >>>>> An attached XDP socket also has better performance than the XSKMAP >>>>> based sockets (performance numbers below). >>>> >>>> I still have a general problem with this approach. >>>> >>> >>> And I appreciate that you have comments on the design/code! :-) Thank you! >>> >>> >>>> The AF_XDP socket is build around (and gets its performance) from >>>> being tied to a specific RX-queue. That design begs to have an XDP >>>> program per RX-queue. >>>> >>>> Patchset-v1 moved towards this goal. But in this patchset-v2 you >>>> steer away from this again, and work-around the issue with the current >>>> limitations of 1-XDP program per netdev. (Which result in; if a >>>> single AF_XDP socket is used in the system, which can ONLY be for a >>>> single RX-queue by design, then ALL other XDP_PASS traffic also have >>>> to take the overhead of indirect BPF call). >>>> >>> >>> I agree that a per-queue-program would be a good fit, but I think it >>> still makes sense to add XDP_ATTACH and the helper, to make it easier >>> for the XDP program authors to use AF_XDP. Would you prefer that this >>> functionality was help back, until per-queue programs are introduced? >> >> Yes, for the reasons you yourself listed in next section: >> >>> OTOH the implementation would probably look different if there was a >>> per-q program, because this would open up for more optimizations. One >>> could even imagine that the socket fd was part of the program (part of >>> relocation process) when loading the program. That could get rid of yet >>> another if-statement that check for socket existence. :-) >> >> Yes, exactly. The implementation would probably look different, when >> we look at it from a more generic angle, with per-q programs. >> >>>> IMHO with this use-case, now is the time to introduce XDP programs per >>>> RX-queue. Yes, it will be more work, but I will offer to helpout. >>>> This should be generalized as XDP programs per RX-queue can be used by >>>> other use-cases too: >>>> In general terms: We can setup a NIC hardware filter to deliver >>>> frame matching some criteria, then we can avoid rechecking these >>>> criterias in on the (RX) CPU when/if we can attach an XDP prog to this >>>> specific RX-queue directly. This *IS* exactly what AF_XDP does, but it >>>> is in general useful for others, like CPUMAP redirect. >>>> >>> >>> Fair enough, and thank you for offering to help out. And the fact that >>> *other than AF_XDP* can benefit from that is good. Before we dive down >>> this hole, what are the opinions of the BPF maintainers? Maybe it's >>> better to hack an RFC, and then take the discussion? >> >> As XDP grows, and more use-cases are added, then I fear that the single >> XDP program per netdev is going to be a performance bottleneck. As the >> single XDP program, will have to perform a lot of common checks before >> it knows what use-case this packet match. With an XDP program per >> RX-queue, we can instead leverage the hardware to pre-filter/sort >> packets, and thus simplify the XDP programs. >> And this patchset already do shows a performance advantage of >> simplifying the XDP prog and allowing to store info per RX-queue (the >> xsk-sock) that allows you do take a more direct action (avoiding exec >> some of the redirect-core code). > > Instead of introducing the XDP_ATTACH option to the bind call, can we > just make this association between rx queue id and xsk every single > time we bind? Then it is up to the user via the XDP program if he/she > wants to use this by calling xsk_redirect. No new option needed. > Nice! Then it would simply be a matter of adding the helper. Much better than extending the uapi. Thank you for pointing this out! > We could also make the setup of AF_XDP easier by just hiding the > loading and creation of the XSKMAP and the XDP program behind > xsk_socket__create in the patch set I am working on. It could > take a parameter in the configuration struct stating if libbpf > should load a predefined program (with xsk_redirect alternatively > XSKMAP that directs all traffic on a queue to a specific AF_XDP > socket), or if the user desires to set this up manually. The > built-in program would be supplied inside libbpf and would be > very small. We could start with the current XSKMAP and just setup > everything in the same way the sample program does. If people > think xsk_redirect is a good idea, then we could move over to > that, as it has higher performance. > > Please let me know what you think. > Hmm, let's start of with this, making sure AF_XDP is simple to use from a libbpf perspective, and then (maybe) move towards the bpf_xsk_redirect, depending on where the per-queue work moves. Let's drop this series for now, and focus on libbpf and the per-queue XDP programs. > BTW, I like the per RX queue XDP program idea. Could see a number of > performance optimizations that could be done given that that existed. > Yes! It would be really cool do use the socket file descriptor in the BPF program, so that the bpf_xsk_redirect could use the socket directly (analogous to bpf_map in redirect map). Another idea with per-queue programs is that the socket could be implicity bound to queue/dev when installing the program to an Rx queue. Björn > Thanks: Magnus > > >> -- >> Best regards, >> Jesper Dangaard Brouer >> MSc.CS, Principal Kernel Engineer at Red Hat >> LinkedIn: http://www.linkedin.com/in/brouer