From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Dangaard Brouer
Subject: Re: [PATCH bpf-next 11/11] samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsock
Date: Wed, 29 Aug 2018 14:44:46 +0200
Message-ID: <20180829144446.72509a96@redhat.com>
References: <20180828124435.30578-1-bjorn.topel@gmail.com>
 <20180828124435.30578-12-bjorn.topel@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Cc: magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
 alexander.h.duyck@intel.com, alexander.duyck@gmail.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org,
 jesse.brandeburg@intel.com, anjali.singhai@intel.com,
 peter.waskiewicz.jr@intel.com,
 =?UTF-8?B?Qmo=?= =?UTF-8?B?w7ZybiBUw7ZwZWw=?= ,
 michael.lundkvist@ericsson.com, willemdebruijn.kernel@gmail.com,
 john.fastabend@gmail.com, jakub.kicinski@netronome.com,
 neerav.parikh@intel.com, mykyta.iziumtsev@linaro.org,
 francois.ozog@linaro.org, ilias.apalodimas@linaro.org,
 brian.brooks@linaro.org, u9012063@gmail.com, pavel@fastnetmon.com,
 qi.z.zhang@intel.com, brouer@redhat.com
To: =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?=
Return-path:
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:56358 "EHLO
 mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
 with ESMTP id S1727590AbeH2Qln (ORCPT );
 Wed, 29 Aug 2018 12:41:43 -0400
In-Reply-To: <20180828124435.30578-12-bjorn.topel@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Tue, 28 Aug 2018 14:44:35 +0200
Björn Töpel wrote:

> From: Björn Töpel
>
> The -c/--copy -z/--zero-copy flags enforces either copy or zero-copy
> mode.

Nice, thanks for adding this. It allows me to quickly test the
difference between normal-copy vs zero-copy modes.
(Kernel bpf-next without RETPOLINE).
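
For readers following the numbers: the nanosecond figures quoted next to
the packet rates in the results that follow appear to be the average
per-packet time, i.e. 1e9 divided by the rate in pps. A minimal sketch of
that conversion, assuming this is how the figures were derived (the helper
name ns_per_packet is hypothetical and not part of xdpsock):

```python
def ns_per_packet(pps: int) -> float:
    """Average time spent per packet, in nanoseconds, at a given packet rate."""
    if pps <= 0:
        raise ValueError("packet rate must be positive")
    return 1e9 / pps


# Packet rates taken from the RX-drop measurements in this mail.
rx_drop_rates = {
    "AF_XDP normal-copy": 13_070_318,
    "AF_XDP zero-copy": 26_132_328,
    "XDP_DROP": 34_251_464,
    "XDP_DROP + read": 30_756_664,
}

for name, pps in rx_drop_rates.items():
    print(f"{name:<20} {pps:>12,} pps  {ns_per_packet(pps):5.1f} ns")
```

Comparing in nanoseconds rather than pps makes the fixed per-packet
overhead of each mode directly visible, which is why the differences are
discussed in ns.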

AF_XDP RX-drop:
 Normal-copy mode: rx 13,070,318 pps - 76.5 ns
 Zero-copy   mode: rx 26,132,328 pps - 38.3 ns

Compare to XDP_DROP: 34,251,464 pps - 29.2 ns
 XDP_DROP + read   : 30,756,664 pps - 32.5 ns

The normal-copy mode is surprisingly fast (and it works for every
driver implementing the regular XDP_REDIRECT action). It is still
faster to do in-kernel XDP_DROP than AF_XDP zero-copy mode dropping,
which was expected given that frames travel to a remote CPU before
being returned (I don't think the remote CPU reads the payload?). The
gap in nanoseconds is actually quite small, thus I'm impressed by the
SPSC-queue implementation working across these CPUs.

AF_XDP layer2-fwd:
 Normal-copy mode: rx  3,200,885  tx  3,200,892
 Zero-copy   mode: rx 17,026,300  tx 17,026,269

Compare to XDP_TX: rx 14,529,079  tx 14,529,850 - 68.82 ns
 XDP_REDIRECT:     rx 13,235,785  tx 13,235,784 - 75.55 ns

The copy-mode is slow because it allocates SKBs internally (I do
wonder if we could speed it up by using ndo_xdp_xmit + disable-BH).
More interesting is that the zero-copy mode is faster than XDP_TX and
XDP_REDIRECT. I think the speedup comes from avoiding some DMA mapping
calls with ZC.

Side-note: XDP_TX vs. REDIRECT: 75.55 - 68.82 = 6.73 ns. The cost of
going through the xdp_do_redirect_map core is actually quite small :-)
(I have some micro optimizations that should help ~2ns).

AF_XDP TX-only:
 Normal-copy mode: tx  2,853,461 pps
 Zero-copy   mode: tx 22,255,311 pps

(There is no XDP mode that does TX to compare against)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer