From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	shayagr@amazon.com, sameehj@amazon.com,
	John Fastabend <john.fastabend@gmail.com>,
	David Ahern <dsahern@kernel.org>,
	Eelco Chaudron <echaudro@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Saeed Mahameed <saeed@kernel.org>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	Tirthendu <tirthendu.sarkar@intel.com>,
	brouer@redhat.com
Subject: Re: [PATCH v8 bpf-next 00/14] mvneta: introduce XDP multi-buffer support
Date: Thu, 29 Apr 2021 14:49:10 +0200	[thread overview]
Message-ID: <20210429144910.27aebab2@carbon> (raw)
In-Reply-To: <CAJ8uoz3=RiTLf_MY-8=hZpib8ds8HJFraVpjJs_K0QEzjfbEhA@mail.gmail.com>

On Wed, 28 Apr 2021 09:41:52 +0200
Magnus Karlsson <magnus.karlsson@gmail.com> wrote:

> On Tue, Apr 27, 2021 at 8:28 PM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >
> > [...]
> >  
> > > Took your patches for a test run with the AF_XDP sample xdpsock on an
> > > i40e card and the throughput degradation is between 2 to 6% depending
> > > on the setup and microbenchmark within xdpsock that is executed. And
> > > this is without sending any multi frame packets. Just single frame
> > > ones. Tirtha made changes to the i40e driver to support this new
> > > interface so that is being included in the measurements.
> > >
> > > What performance do you see with the mvneta card? How much are we
> > > willing to pay for this feature when it is not being used or can we in
> > > some way selectively turn it on only when needed?  
> >
> > Hi Magnus,
> >
> > Today I carried out some comparison tests between bpf-next and bpf-next +
> > xdp_multibuff series on mvneta running xdp_rxq_info sample. Results are
> > basically aligned:
> >
> > bpf-next:
> > - xdp drop ~ 665Kpps
> > - xdp_tx   ~ 291Kpps
> > - xdp_pass ~ 118Kpps
> >
> > bpf-next + xdp_multibuff:
> > - xdp drop ~ 672Kpps
> > - xdp_tx   ~ 288Kpps
> > - xdp_pass ~ 118Kpps
> >
> > I am not sure if results are affected by the low power CPU, I will run some
> > tests on ixgbe card.  
> 
> Thanks Lorenzo. I made some new runs, this time with i40e driver
> changes as a new data point. Same baseline as before but with patches
> [1] and [2] applied. Note
> that if you use net or net-next and i40e, you need patch [3] too.
> 
> The i40e multi-buffer support will be posted on the mailing list as a
> separate RFC patch so you can reproduce and review.
> 
> Note, calculations are performed on non-truncated numbers. So 2 ns
> might be 5 cycles on my 2.1 GHz machine since 2.49 ns * 2.1 GHz =
> 5.229 cycles ~ 5 cycles. xdpsock is run in zero-copy mode so it uses
> the zero-copy driver data path in contrast with xdp_rxq_info that uses
> the regular driver data path. Only ran the busy-poll 1-core case this
> time. Reported numbers are the average over 3 runs.

Yes, for i40e the xdpsock zero-copy test uses another code path; this
is something we need to keep in mind.

Also remember that we designed the central xdp_do_redirect() call to
delay creation of the xdp_frame.  This is something that AF_XDP ZC
takes advantage of.
Thus, the cost of the xdp_buff to xdp_frame conversion is not covered
in the tests below, and I expect this patchset to increase that cost...
(UPDATE: the XDP_TX tests below actually do the xdp_frame conversion)
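
To illustrate the point with a conceptual sketch (the helper names
below are invented for this email; only xdp_convert_buff_to_frame() is
a real kernel function): the redirect core hands the target the raw
xdp_buff, and only some targets pay for building an xdp_frame.

  enum redirect_target_sketch { TARGET_XSKMAP, TARGET_DEVMAP };

  static int redirect_sketch(struct xdp_buff *xdp,
  			     enum redirect_target_sketch t)
  {
  	if (t == TARGET_XSKMAP)
  		/* AF_XDP ZC: consume the xdp_buff directly,
  		 * no xdp_frame is ever built. */
  		return xsk_deliver_sketch(xdp);

  	/* DEVMAP/CPUMAP-style targets: the conversion cost is paid here */
  	return dev_enqueue_sketch(xdp_convert_buff_to_frame(xdp));
  }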


> multi-buffer patches without any driver changes:

Thank you *SO* much, Magnus, for these superb tests.  I absolutely love
how comprehensive your test results are.  Thank you for catching the
performance regression in this patchset. (I for one know how
time-consuming these kinds of tests are; I appreciate your effort, a lot!)

> xdpsock rxdrop 1-core:
> i40e: -4.5% in throughput / +3 ns / +6 cycles
> ice: -1.5% / +1 ns / +2 cycles
> 
> xdp_rxq_info -a XDP_DROP
> i40e: -2.5% / +2 ns / +3 cycles
> ice: +6% / -3 ns / -7 cycles
> 
> xdp_rxq_info -a XDP_TX
> i40e: -10% / +15 ns / +32 cycles
> ice: -9% / +14 ns / +29 cycles

This is a clear performance regression.

Looking closer at the driver, i40e_xmit_xdp_tx_ring() actually performs
an xdp_frame conversion by calling xdp_convert_buff_to_frame(xdp).
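
For reference, roughly what that conversion amounts to (heavily
simplified from memory, see net/core/xdp.c for the real thing): the
xdp_frame is written into the headroom of the same buffer, so it is a
headroom check plus a handful of stores, but they sit in the XDP_TX
hot path:

  /* Simplified sketch, not the real implementation */
  static struct xdp_frame *convert_buff_to_frame_sketch(struct xdp_buff *xdp)
  {
  	struct xdp_frame *frame;
  	int headroom = xdp->data - xdp->data_hard_start;

  	/* The xdp_frame lives in the headroom of the packet buffer itself */
  	if (headroom < sizeof(*frame))
  		return NULL;

  	frame = xdp->data_hard_start;
  	frame->data     = xdp->data;
  	frame->len      = xdp->data_end - xdp->data;
  	frame->headroom = headroom - sizeof(*frame);
  	frame->metasize = xdp->data - xdp->data_meta;
  	frame->mem      = xdp->rxq->mem; /* needed to free/recycle later */
  	return frame;
  }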

FYI: we have started an offlist email thread, and an IRC discussion
with Lorenzo, on finding the root cause.  The current lead is that, as
Alexei so wisely pointed out in earlier patches, struct bit access is
not efficient...

As I expect we will soon need bits for HW RX checksum indication, and
an indication of whether the metadata contains a BTF-described area,
I've asked Lorenzo to consider this and look into introducing a flags
member. (Then we just have to figure out how to make the flags access
efficient.)
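
To make the flags idea a bit more concrete, a rough sketch (layout
simplified, and the flag names are invented for illustration, not a
proposal for the final API):

  /* Patchset style (simplified): mb is a C bitfield, so setting it is a
   * read-modify-write on whatever word it shares with its neighbours. */
  struct xdp_buff_bitfield_sketch {
  	u32 frame_sz:31;
  	u32 mb:1;
  };

  /* Alternative: a dedicated flags member.  New indications (HW RX csum,
   * BTF-described metadata, ...) become additional bits, and several of
   * them can be set with a single OR/store. */
  #define XDP_FLAGS_MB		BIT(0)	/* names invented for this sketch */
  #define XDP_FLAGS_HW_CSUM	BIT(1)
  #define XDP_FLAGS_META_BTF	BIT(2)

  struct xdp_buff_flags_sketch {
  	u32 frame_sz;
  	u32 flags;
  };

  static inline void xdp_buff_set_mb(struct xdp_buff_flags_sketch *xdp)
  {
  	xdp->flags |= XDP_FLAGS_MB;
  }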
 

> multi-buffer patches + i40e driver changes from Tirtha:
> 
> xdpsock rxdrop 1-core:
> i40e: -3% / +2 ns / +3 cycles
> 
> xdp_rxq_info -a XDP_DROP
> i40e: -7.5% / +5 ns / +9 cycles
> 
> xdp_rxq_info -a XDP_TX
> i40e: -10% / +15 ns / +32 cycles
> 
> Would be great if someone could rerun a similar set of experiments on
> i40e or ice then
> report.
 
> [1] https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210419/024106.html
> [2] https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210426/024135.html
> [3] https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210426/024129.html

I'm very happy that you/we all are paying attention to keeping XDP
performance intact, as small 'paper cuts' like +32 cycles do affect
XDP in the long run. Happy performance testing, everybody :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



Thread overview: 58+ messages
2021-04-08 12:50 [PATCH v8 bpf-next 00/14] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2021-04-08 12:50 ` [PATCH v8 bpf-next 01/14] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
2021-04-08 18:17   ` Vladimir Oltean
2021-04-09 16:03     ` Lorenzo Bianconi
2021-04-29 13:36   ` Jesper Dangaard Brouer
2021-04-29 13:54     ` Lorenzo Bianconi
2021-04-08 12:50 ` [PATCH v8 bpf-next 02/14] xdp: add xdp_shared_info data structure Lorenzo Bianconi
2021-04-08 13:39   ` Vladimir Oltean
2021-04-08 14:26     ` Lorenzo Bianconi
2021-04-08 18:06   ` kernel test robot
2021-04-08 18:06     ` kernel test robot
2021-04-08 12:50 ` [PATCH v8 bpf-next 03/14] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2021-04-08 18:19   ` Vladimir Oltean
2021-04-09 16:24     ` Lorenzo Bianconi
2021-04-08 12:50 ` [PATCH v8 bpf-next 04/14] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2021-04-08 18:30   ` Vladimir Oltean
2021-04-09 16:28     ` Lorenzo Bianconi
2021-04-08 12:50 ` [PATCH v8 bpf-next 05/14] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2021-04-08 18:40   ` Vladimir Oltean
2021-04-09 16:36     ` Lorenzo Bianconi
2021-04-08 12:50 ` [PATCH v8 bpf-next 06/14] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2021-04-08 12:50 ` [PATCH v8 bpf-next 07/14] net: xdp: add multi-buff support to xdp_build_skb_from_fram Lorenzo Bianconi
2021-04-08 12:51 ` [PATCH v8 bpf-next 08/14] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
2021-04-08 19:15   ` Vladimir Oltean
2021-04-08 20:54     ` Vladimir Oltean
2021-04-09 18:13       ` Lorenzo Bianconi
2021-04-08 12:51 ` [PATCH v8 bpf-next 09/14] bpd: add multi-buffer support to xdp copy helpers Lorenzo Bianconi
2021-04-08 20:57   ` Vladimir Oltean
2021-04-09 18:19     ` Lorenzo Bianconi
2021-04-08 21:04   ` Vladimir Oltean
2021-04-14  8:08     ` Eelco Chaudron
2021-04-08 12:51 ` [PATCH v8 bpf-next 10/14] bpf: add new frame_length field to the XDP ctx Lorenzo Bianconi
2021-04-08 12:51 ` [PATCH v8 bpf-next 11/14] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
2021-04-08 12:51 ` [PATCH v8 bpf-next 12/14] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
2021-04-08 12:51 ` [PATCH v8 bpf-next 13/14] bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
2021-04-08 12:51 ` [PATCH v8 bpf-next 14/14] bpf: update xdp_adjust_tail selftest to include multi-buffer Lorenzo Bianconi
2021-04-09  0:56 ` [PATCH v8 bpf-next 00/14] mvneta: introduce XDP multi-buffer support John Fastabend
2021-04-09 20:16   ` Lorenzo Bianconi
2021-04-13 15:16   ` Eelco Chaudron
2021-04-16 14:27 ` Magnus Karlsson
2021-04-16 21:29   ` Lorenzo Bianconi
2021-04-16 23:00     ` Daniel Borkmann
2021-04-18 16:18   ` Jesper Dangaard Brouer
2021-04-19  6:20     ` Magnus Karlsson
2021-04-19  6:55       ` Lorenzo Bianconi
2021-04-20 13:49         ` Magnus Karlsson
2021-04-21 12:47           ` Jesper Dangaard Brouer
2021-04-21 14:12             ` Magnus Karlsson
2021-04-21 15:39               ` Jesper Dangaard Brouer
2021-04-22 10:24                 ` Magnus Karlsson
2021-04-22 14:42                   ` Jesper Dangaard Brouer
2021-04-22 15:05                     ` Crash for i40e on net-next (was: [PATCH v8 bpf-next 00/14] mvneta: introduce XDP multi-buffer support) Jesper Dangaard Brouer
2021-04-23  5:28                       ` Magnus Karlsson
2021-04-23 16:43                         ` Alexander Duyck
2021-04-25  9:45                           ` Magnus Karlsson
2021-04-27 18:28   ` [PATCH v8 bpf-next 00/14] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2021-04-28  7:41     ` Magnus Karlsson
2021-04-29 12:49       ` Jesper Dangaard Brouer [this message]
