From: Magnus Karlsson
Date: Mon, 18 Feb 2019 11:09:36 +0100
Subject: Re: [PATCH bpf-next v4 0/2] libbpf: adding AF_XDP support
To: Daniel Borkmann
Cc: Jesper Dangaard Brouer, Jonathan Lemon, Magnus Karlsson, Björn Töpel,
 ast@kernel.org, Network Development, Jakub Kicinski, "Zhang, Qi Z",
 xiaolong.ye@intel.com, "xdp-newbies@vger.kernel.org"
X-Mailing-List: netdev@vger.kernel.org

On Mon, Feb 18, 2019 at 10:38 AM Daniel Borkmann wrote:
>
> On 02/18/2019 09:20 AM, Magnus Karlsson wrote:
> > On Fri, Feb 15, 2019 at 5:48 PM Daniel Borkmann wrote:
> >>
> >> On 02/13/2019 12:55 PM, Jesper Dangaard Brouer wrote:
> >>> On Wed, 13 Feb 2019 12:32:47 +0100
> >>> Magnus Karlsson wrote:
> >>>> On Mon, Feb 11, 2019 at 9:44 PM Jonathan Lemon wrote:
> >>>>> On 8 Feb 2019, at 5:05, Magnus Karlsson wrote:
> >>>>>
> >>>>>> This patch proposes to add AF_XDP support to libbpf. The main reason
> >>>>>> for this is to facilitate writing applications that use AF_XDP by
> >>>>>> offering higher-level APIs that hide many of the details of the AF_XDP
> >>>>>> uapi. This is in the same vein as libbpf facilitates XDP adoption by
> >>>>>> offering easy-to-use higher level interfaces of XDP
> >>>>>> functionality. Hopefully this will facilitate adoption of AF_XDP, make
> >>>>>> applications using it simpler and smaller, and finally also make it
> >>>>>> possible for applications to benefit from optimizations in the AF_XDP
> >>>>>> user space access code. Previously, people just copied and pasted the
> >>>>>> code from the sample application into their application, which is not
> >>>>>> desirable.
> >>>>>
> >>>>> I like the idea of encapsulating the boilerplate logic in a library.
> >>>>>
> >>>>> I do think there is an important missing piece though - there should be
> >>>>> some code which queries the netdev for how many queues are attached, and
> >>>>> create the appropriate number of umem/AF_XDP sockets.
> >>>>>
> >>>>> I ran into this issue when testing the current AF_XDP code - on my test
> >>>>> boxes, the mlx5 card has 55 channels (aka queues), so when the test program
> >>>>> binds only to channel 0, nothing works as expected, since not all traffic
> >>>>> is being intercepted. While obvious in hindsight, this took a while to
> >>>>> track down.
> >>>>
> >>>> Yes, agreed. You are not the first one to stumble upon this problem
> >>>> :-). Let me think a little bit on how to solve this in a good way. We
> >>>> need this to be simple and intuitive, as you say.
> >>>
> >>> I see people hitting this with AF_XDP all the time... I had some
> >>> backup-slides[2] in our FOSDEM presentation[1] that describe the issue,
> >>> give the performance reason why and propose a workaround.
> >>
> >> Magnus, I presume you're going to address this for the initial libbpf merge
> >> since the plan is to make it easier to consume for users?
> >
> > I think the first thing we need is education and documentation. Have a
> > FAQ or "common mistakes" section in the Documentation. And of course,
> > sending Jesper around the world reminding people about this ;-).
> >
> > To address this on a libbpf interface level, I think the best way is
> > to reprogram the NIC to send all traffic to the queue that you
> > provided in the xsk_socket__create call. This "set up NIC routing"
> > behavior can then be disabled with a flag, just as the XDP program
> > loading can be disabled. The standard config of xsk_socket__create
> > will then set up as many things for the user as possible just to get
> > up and running quickly. More advanced users can then disable parts of
> > it to gain more flexibility. Does this sound OK? Do not want to go the
> > route of polling multiple sockets and aggregating the traffic as this
> > will have significant negative performance implications.
>
> I think that is fine, I would probably make this one a dedicated API call
> in order to have some more flexibility than just simple flag. E.g. once
> nfp AF_XDP support lands at some point, I could imagine that this call
> resp. a drop-in replacement API call for more advanced steering could
> also take an offloaded BPF prog fd, for example, which then would program
> the steering on the NIC [0]. Seems at least there's enough complexity on
> its own to have a dedicated API for it. Thoughts?

I agree that there is probably enough complexity to warrant adding a
higher level API to deal with this problem (flow steering). But there
are likely a number of cases we have not thought of that would
complicate it even further.
This is why I suggest that this functionality go into its own patch set
that I can devote some time and thought to. IMO, the current patch set
already lowers the bar of entry significantly and has value even
without hiding or controlling the steering of traffic.

What I would like to do in this patch set is to add a FAQ section to
Documentation/networking/af_xdp.rst explaining this problem. Something
like: "Q: Why am I not seeing any traffic? A: Check these four
things.....". I could also add some text to the libbpf README referring
to this document. Opinions?

Thanks: Magnus

> Thanks,
> Daniel
>
> [0] https://patchwork.ozlabs.org/cover/910614/