Subject: Re: [PATCH bpf-next v4 14/14] xsk: documentation for XDP_SHARED_UMEM between queues and netdevs
From: Björn Töpel
To: Magnus Karlsson, ast@kernel.org, daniel@iogearbox.net,
 netdev@vger.kernel.org, jonathan.lemon@gmail.com, maximmi@mellanox.com
Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com,
 anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com,
 maciejromanfijalkowski@gmail.com, cristian.dumitrescu@intel.com
Date: Tue, 28 Jul 2020 09:18:48 +0200
Message-ID: <59554482-dc1d-7758-c090-9fada8c66151@intel.com>
In-Reply-To: <1595307848-20719-15-git-send-email-magnus.karlsson@intel.com>
References: <1595307848-20719-1-git-send-email-magnus.karlsson@intel.com>
 <1595307848-20719-15-git-send-email-magnus.karlsson@intel.com>

On 2020-07-21 07:04, Magnus Karlsson wrote:
> Add documentation for the XDP_SHARED_UMEM feature when a UMEM is
> shared between different queues and/or netdevs.
>
> Signed-off-by: Magnus Karlsson

Acked-by: Björn Töpel

> ---
>  Documentation/networking/af_xdp.rst | 68 +++++++++++++++++++++++++++++++------
>  1 file changed, 58 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
> index 5bc55a4..2ccc564 100644
> --- a/Documentation/networking/af_xdp.rst
> +++ b/Documentation/networking/af_xdp.rst
> @@ -258,14 +258,21 @@ socket into zero-copy mode or fail.
>
>  XDP_SHARED_UMEM bind flag
>  -------------------------
>
> -This flag enables you to bind multiple sockets to the same UMEM, but
> -only if they share the same queue id. In this mode, each socket has
> -their own RX and TX rings, but the UMEM (tied to the fist socket
> -created) only has a single FILL ring and a single COMPLETION
> -ring. To use this mode, create the first socket and bind it in the normal
> -way. Create a second socket and create an RX and a TX ring, or at
> -least one of them, but no FILL or COMPLETION rings as the ones from
> -the first socket will be used. In the bind call, set he
> +This flag enables you to bind multiple sockets to the same UMEM. It
> +works on the same queue id, between queue ids and between
> +netdevs/devices. In this mode, each socket has its own RX and TX
> +rings as usual, but you are going to have one or more FILL and
> +COMPLETION ring pairs. You have to create one of these pairs per
> +unique netdev and queue id tuple that you bind to.
> +
> +Starting with the case where we would like to share a UMEM between
> +sockets bound to the same netdev and queue id. The UMEM (tied to the
> +first socket created) will only have a single FILL ring and a single
> +COMPLETION ring as there is only one unique netdev,queue_id tuple that
> +we have bound to. To use this mode, create the first socket and bind
> +it in the normal way. Create a second socket and create an RX and a TX
> +ring, or at least one of them, but no FILL or COMPLETION rings as the
> +ones from the first socket will be used. In the bind call, set the
>  XDP_SHARED_UMEM option and provide the initial socket's fd in the
>  sxdp_shared_umem_fd field. You can attach an arbitrary number of extra
>  sockets this way.
> @@ -305,11 +312,41 @@ concurrently. There are no synchronization primitives in the
>  libbpf code that protects multiple users at this point in time.
>
>  Libbpf uses this mode if you create more than one socket tied to the
> -same umem. However, note that you need to supply the
> +same UMEM. However, note that you need to supply the
>  XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the
>  xsk_socket__create calls and load your own XDP program as there is no
>  built in one in libbpf that will route the traffic for you.
>
> +The second case is when you share a UMEM between sockets that are
> +bound to different queue ids and/or netdevs. In this case you have to
> +create one FILL ring and one COMPLETION ring for each unique
> +netdev,queue_id pair. Let us say you want to create two sockets bound
> +to two different queue ids on the same netdev. Create the first socket
> +and bind it in the normal way. Create a second socket and create an RX
> +and a TX ring, or at least one of them, and then one FILL and
> +COMPLETION ring for this socket. Then in the bind call, set the
> +XDP_SHARED_UMEM option and provide the initial socket's fd in the
> +sxdp_shared_umem_fd field as you registered the UMEM on that
> +socket. These two sockets will now share one and the same UMEM.
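A small illustration for anyone reading along (my sketch, not part of
the patch; the function name is made up, and UMEM registration, ring
setup and most error handling are omitted). The bind sequence described
above boils down to passing the initial socket's fd in
sxdp_shared_umem_fd:

  #include <linux/if_xdp.h>
  #include <net/if.h>
  #include <string.h>
  #include <sys/socket.h>

  /* Bind an additional AF_XDP socket to the UMEM already registered
   * on first_fd. Works for the same or a different (netdev, queue_id)
   * tuple. Returns the new socket's fd, or -1 on failure. */
  int bind_shared(int first_fd, const char *ifname, __u32 queue_id)
  {
          int fd = socket(AF_XDP, SOCK_RAW, 0);
          struct sockaddr_xdp sxdp;

          memset(&sxdp, 0, sizeof(sxdp));
          sxdp.sxdp_family = AF_XDP;
          sxdp.sxdp_ifindex = if_nametoindex(ifname);
          sxdp.sxdp_queue_id = queue_id;
          sxdp.sxdp_flags = XDP_SHARED_UMEM;
          sxdp.sxdp_shared_umem_fd = first_fd; /* the initial socket */

          if (fd < 0 || bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp)))
                  return -1;
          return fd;
  }
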
> +
> +There is no need to supply an XDP program like the one in the previous
> +case where sockets were bound to the same queue id and
> +device. Instead, use the NIC's packet steering capabilities to steer
> +the packets to the right queue. In the previous example, there is only
> +one queue shared among sockets, so the NIC cannot do this steering. It
> +can only steer between queues.
> +
> +In libbpf, you need to use the xsk_socket__create_shared() API as it
> +takes a reference to a FILL ring and a COMPLETION ring that will be
> +created for you and bound to the shared UMEM. You can use this
> +function for all the sockets you create, or you can use it for the
> +second and following ones and use xsk_socket__create() for the first
> +one. Both methods yield the same result.
> +
> +Note that a UMEM can be shared between sockets on the same queue id
> +and device, as well as between queues on the same device and between
> +devices at the same time.
> +
>  XDP_USE_NEED_WAKEUP bind flag
>  -----------------------------
>
> @@ -364,7 +401,7 @@ resources by only setting up one of them. Both the FILL ring and the
>  COMPLETION ring are mandatory as you need to have a UMEM tied to your
>  socket. But if the XDP_SHARED_UMEM flag is used, any socket after the
>  first one does not have a UMEM and should in that case not have any
> -FILL or COMPLETION rings created as the ones from the shared umem will
> +FILL or COMPLETION rings created as the ones from the shared UMEM will
>  be used. Note, that the rings are single-producer single-consumer, so
>  do not try to access them from multiple processes at the same
>  time. See the XDP_SHARED_UMEM section.
>
> @@ -567,6 +604,17 @@ A: The short answer is no, that is not supported at the moment. The
>     switch, or other distribution mechanism, in your NIC to direct
>     traffic to the correct queue id and socket.
>
> +Q: My packets are sometimes corrupted. What is wrong?
> +
> +A: Care has to be taken not to feed the same buffer in the UMEM into
> +   more than one ring at the same time. If you for example feed the
> +   same buffer into the FILL ring and the TX ring at the same time, the
> +   NIC might receive data into the buffer at the same time it is
> +   sending it. This will cause some packets to become corrupted. Same
> +   thing goes for feeding the same buffer into the FILL rings
> +   belonging to different queue ids or netdevs bound with the
> +   XDP_SHARED_UMEM flag.
> +
>  Credits
>  =======
>
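And the libbpf side of the second case, again only a sketch with
made-up names (default xsk_socket_config assumed; UMEM creation, FILL
ring population and the rest of the datapath are left out):

  #include <bpf/xsk.h>

  /* Per-(netdev, queue_id) state: the socket plus its four rings. The
   * ring structs must stay alive for as long as the socket is used. */
  struct xsk_info {
          struct xsk_socket *xsk;
          struct xsk_ring_cons rx, comp;
          struct xsk_ring_prod tx, fill;
  };

  /* Call once per queue id. Every call on a new (netdev, queue_id)
   * tuple gets its own FILL/COMPLETION pair, while the UMEM is shared. */
  static int setup_queue(struct xsk_info *info, struct xsk_umem *umem,
                         const char *ifname, __u32 queue_id)
  {
          return xsk_socket__create_shared(&info->xsk, ifname, queue_id,
                                           umem, &info->rx, &info->tx,
                                           &info->fill, &info->comp,
                                           NULL);
  }

So setup_queue(&q0, umem, "eth0", 0) followed by setup_queue(&q1, umem,
"eth0", 1) gives the two-queue setup the text describes, with the NIC's
packet steering (e.g. ethtool -N rules) deciding which packets land in
which queue.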