From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
	Lorenzo Bianconi <lorenzo@kernel.org>, bpf <bpf@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	shayagr@amazon.com, sameehj@amazon.com,
	John Fastabend <john.fastabend@gmail.com>,
	David Ahern <dsahern@kernel.org>,
	Eelco Chaudron <echaudro@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Saeed Mahameed <saeed@kernel.org>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	Tirthendu <tirthendu.sarkar@intel.com>,
	brouer@redhat.com
Subject: Re: [PATCH v8 bpf-next 00/14] mvneta: introduce XDP multi-buffer support
Date: Wed, 21 Apr 2021 17:39:21 +0200
Message-ID: <20210421173921.23fef6a7@carbon>
In-Reply-To: <CAJ8uoz3ROiPn+-bh7OjFOjXjXK9xGhU5cxWoFPM9JoYeh=zw=g@mail.gmail.com>

On Wed, 21 Apr 2021 16:12:32 +0200
Magnus Karlsson <magnus.karlsson@gmail.com> wrote:

> On Wed, Apr 21, 2021 at 2:48 PM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Tue, 20 Apr 2021 15:49:44 +0200
> > Magnus Karlsson <magnus.karlsson@gmail.com> wrote:
> >  
> > > On Mon, Apr 19, 2021 at 8:56 AM Lorenzo Bianconi
> > > <lorenzo.bianconi@redhat.com> wrote:  
> > > >  
> > > > > On Sun, Apr 18, 2021 at 6:18 PM Jesper Dangaard Brouer
> > > > > <brouer@redhat.com> wrote:  
> > > > > >
> > > > > > On Fri, 16 Apr 2021 16:27:18 +0200
> > > > > > Magnus Karlsson <magnus.karlsson@gmail.com> wrote:
> > > > > >  
> > > > > > > On Thu, Apr 8, 2021 at 2:51 PM Lorenzo Bianconi <lorenzo@kernel.org> wrote:  
> > > > > > > >
> > > > > > > > This series introduces XDP multi-buffer support. The mvneta driver is
> > > > > > > > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers,
> > > > > > > > please focus on how these new types of xdp_{buff,frame} packets
> > > > > > > > traverse the different layers and on the layout design. It is on purpose
> > > > > > > > that the BPF-helpers are kept simple, as we don't want to expose the
> > > > > > > > internal layout, to allow later changes.
> > > > > > > >
> > > > > > > > For now, to keep the design simple and to maintain performance, the XDP
> > > > > > > > BPF-prog (still) only has access to the first buffer. It is left for
> > > > > > > > later (another patchset) to add payload access across multiple buffers.
> > > > > > > > This patchset should still allow for these future extensions. The goal
> > > > > > > > is to lift the MTU restriction that comes with XDP, but maintain the
> > > > > > > > same performance as before.  
> > > > > > [...]  
> > > > > > > >
> > > > > > > > [0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
> > > > > > > > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
> > > > > > > > [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)  
> > > > > > >
> > > > > > > Took your patches for a test run with the AF_XDP sample xdpsock on an
> > > > > > > i40e card, and the throughput degradation is between 2 and 6% depending
> > > > > > > on the setup and the microbenchmark within xdpsock that is executed. And
> > > > > > > this is without sending any multi-frame packets, just single-frame
> > > > > > > ones. Tirtha made changes to the i40e driver to support this new
> > > > > > > interface, so that is included in the measurements.  
> > > > > >
> > > > > > Could you please share Tirtha's i40e support patch with me?  
> > > > >
> > > > > We will post them on the list as an RFC. Tirtha also added AF_XDP
> > > > > multi-frame support on top of Lorenzo's patches so we will send that
> > > > > one out as well. Will also rerun my experiments, properly document
> > > > > them and send out just to be sure that I did not make any mistake.  
> > > >
> > > > ack, very cool, thx  
> > >
> > > I have now run a new set of experiments on a Cascade Lake server at
> > > 2.1 GHz with turbo boost disabled. Two NICs: i40e and ice. The
> > > baseline is commit 5c507329000e ("libbpf: Clarify flags in ringbuf
> > > helpers") and Lorenzo's and Eelco's path set is their v8. First some
> > > runs with xdpsock (i.e. AF_XDP) in both 2-core mode (app on one core
> > > and the driver on another) and 1-core mode using busy_poll.
> > >
> > > xdpsock rxdrop throughput change with the multi-buffer patches without
> > > any driver changes:
> > > 1-core i40e: -0.5 to 0%   2-cores i40e: -0.5%
> > > 1-core ice: -2%   2-cores ice: -1 to -0.5%
> > >
> > > xdp_rxq_info -a XDP_DROP
> > > i40e: -4%   ice: +8%
> > >
> > > xdp_rxq_info -a XDP_TX
> > > i40e: -10%   ice: +9%
> > >
> > > The XDP results with xdp_rxq_info are just weird! I reran them three
> > > times, rebuilt and rebooted in between and I always get the same
> > > results. And I also checked that I am running on the correct NUMA node
> > > and so on. But I have a hard time believing them. Nearly +10% and -10%
> > > difference. Too much in my book. Jesper, could you please run the same
> > > and see what you get?  
> >
> > We of course have to find the root cause of the +/-10%, but let me drill
> > into what the 10% represents time/cycle wise.  Using a percentage
> > difference is usually a really good idea, as it implies a comparative
> > measure (something I always request people to do, as a single
> > performance number means nothing by itself).
> >
> > For zoom-in benchmarks like these, where the amount of code executed
> > is very small, the effect of removing or adding code can affect the
> > measurement a lot.
> >
> > I can only do the tests for i40e, as I don't have ice hardware (but
> > Intel is working on fixing that ;-)).
> >
> >  xdp_rxq_info -a XDP_DROP
> >   i40e: 33,417,775 pps  
> 
> Here I only get around 21 Mpps
> 
> >  CPU is 100% used, so we can calculate nanosec used per packet:
> >   29.92 nanosec (1/33417775*10^9)
> >   2.1 GHz CPU =  approx 63 CPU-cycles
> >
> >  You lost -4% performance in this case.  This corresponds to:
> >   -1.2 nanosec (29.92*0.04) slower
> >   (This could be cost of single func call overhead = 1.3 ns)
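
(Just for reference, a small standalone sketch of the per-packet budget
math above; this is my own illustration, not from the thread.  The inputs
are the XDP_DROP number from my run and the 2.1 GHz clock used in the
calculation; everything else is plain arithmetic.)

 /* ns/pkt = 1e9 / pps, cycles/pkt = ns/pkt * GHz, and a slowdown of
  * d percent costs ns/pkt * d/100 extra nanoseconds per packet.
  */
 #include <stdio.h>

 int main(void)
 {
	double pps   = 33417775.0; /* XDP_DROP baseline from above          */
	double ghz   = 2.1;        /* reference clock used in the calc above */
	double d_pct = 4.0;        /* the reported -4% slowdown              */

	double ns_per_pkt = 1e9 / pps;

	printf("%.2f ns/pkt, ~%.0f cycles/pkt, %.1f%% = %.2f ns/pkt extra\n",
	       ns_per_pkt, ns_per_pkt * ghz, d_pct, ns_per_pkt * d_pct / 100.0);
	return 0;
 }
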
> >
> > My measurement for XDP_TX:
> >
> >  xdp_rxq_info -a XDP_TX
> >   28,278,722 pps
> >   35.36 ns (1/28278722*10^9)  
> 
> And here, much lower at around 8 Mpps. But I do see correct packets
> coming back on the cable for i40e but not for ice! There is likely a
> bug there in the XDP_TX logic for ice. Might explain the weird results
> I am getting. Will investigate.
> 
> But why do I get only a fraction of your performance? XDP_TX touches
> the packet so I would expect it to be far less than what you get, but
> more than I get. 

I clearly have a bug in the i40e driver.  As I wrote below, I don't see
any packets transmitted for XDP_TX.  Hmm, I am using Mel Gorman's tree,
which doesn't contain the i40e/ice/ixgbe bug we fixed earlier.

The call to xdp_convert_buff_to_frame() fails, but (see below) that
error is simply converted to I40E_XDP_CONSUMED.  Thus, not even the
'trace_xdp_exception' tracepoint will be able to troubleshoot this.
You/Intel should consider making XDP_TX errors detectable (this will
also happen if the TX ring doesn't have room).

 int i40e_xmit_xdp_tx_ring(struct xdp_buff *xdp, struct i40e_ring *xdp_ring)
 {
	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);

	if (unlikely(!xdpf))
		return I40E_XDP_CONSUMED;

	return i40e_xmit_xdp_ring(xdpf, xdp_ring);
 }
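
A minimal sketch of what I mean, reusing the existing
trace_xdp_exception tracepoint so the drop shows up in xdp_monitor.
This is only an illustration: whether the TX ring has usable
netdev/xdp_prog pointers at this point is an assumption on my part, and
the same idea would also apply to the ring-full error path:

 int i40e_xmit_xdp_tx_ring(struct xdp_buff *xdp, struct i40e_ring *xdp_ring)
 {
	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);

	if (unlikely(!xdpf)) {
		/* Make the silent drop visible to xdp_monitor and friends.
		 * Assumes the xdp_ring carries valid netdev/xdp_prog
		 * pointers here; adjust to however the driver plumbs them.
		 */
		trace_xdp_exception(xdp_ring->netdev, xdp_ring->xdp_prog, XDP_TX);
		return I40E_XDP_CONSUMED;
	}

	return i40e_xmit_xdp_ring(xdpf, xdp_ring);
 }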


> What CPU core do you run on? 

Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz

> It actually looks like
> your packet data gets prefetched successfully. If it had not, you
> would have gotten an access to the LLC, which is much more expensive than
> the drop you are seeing. If I run on the wrong NUMA node, I get 4
> Mpps, so it is not that.
> 
> One interesting thing is that I get better results using the zero-copy
> path in the driver. I start xdp_rxq_drop then tie an AF_XDP socket to
> the queue id the XDP program gets its traffic from. The AF_XDP program
> will get no traffic in this case, but it will force the driver to use
> the zero-copy path for its XDP processing. In this case I get this:
> 
> -0.5% for XDP_DROP and +-0% for XDP_TX for i40e.
> 
> >  You lost -10% performance in this case:
> >   -3.54 nanosec (35.36*0.10) slower
> >
> > In an XDP context, 3.54 nanosec is a lot; as you can see, it is 10% in
> > this zoom-in benchmark.  We have to look at the details.
> >
> > One detail/issue with i40e doing XDP_TX is that I cannot verify that
> > packets are actually transmitted... not via the exception tracepoint,
> > not via netstats, not via ethtool_stats.pl.  Maybe all the packets are
> > getting (silently) dropped in my tests...!?!
> >
> >  
> > > The xdpsock numbers are more in the ballpark of
> > > what I would expect.
> > >
> > > Tirtha and I found some optimizations in the i40e
> > > multi-frame/multi-buffer support that we have implemented. Will test
> > > those next, post the results and share the code.
> > >  
> > > > >
> > > > > Just note that I would really like for the multi-frame support to get
> > > > > in. I have lost count of how many people have asked for it to be
> > > > > added to XDP and AF_XDP. So please check our implementation and
> > > > > improve it so we can get the overhead down to where we want it to be.  
> > > >
> > > > sure, I will do.
> > > >
> > > > Regards,
> > > > Lorenzo
> > > >  
> > > > >
> > > > > Thanks: Magnus
> > > > >  
> > > > > > I would like to reproduce these results in my test lab, in order to
> > > > > > figure out where the throughput degradation comes from.
> > > > > >  
> > > > > > > What performance do you see with the mvneta card? How much are we
> > > > > > > willing to pay for this feature when it is not being used or can we in
> > > > > > > some way selectively turn it on only when needed?  
> > > > > >
> > > > > > Well, as Daniel says, performance-wise we require close to /zero/
> > > > > > additional overhead, especially as you state this happens when sending
> > > > > > a single frame, which is a base case that we must not slow down.
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > >   Jesper Dangaard Brouer  
> >
> > --
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer
> >
> >
> > Running XDP on dev:i40e2 (ifindex:6) action:XDP_DROP options:read
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      2       33,417,775  0
> > XDP-RX CPU      total   33,417,775
> >
> > RXQ stats       RXQ:CPU pps         issue-pps
> > rx_queue_index    2:2   33,417,775  0
> > rx_queue_index    2:sum 33,417,775
> >
> >
> > Running XDP on dev:i40e2 (ifindex:6) action:XDP_TX options:swapmac
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      2       28,278,722  0
> > XDP-RX CPU      total   28,278,722
> >
> > RXQ stats       RXQ:CPU pps         issue-pps
> > rx_queue_index    2:2   28,278,726  0
> > rx_queue_index    2:sum 28,278,726
> >
> >
> >  
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

