bpf.vger.kernel.org archive mirror
From: Magnus Karlsson <magnus.karlsson@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Network Development" <netdev@vger.kernel.org>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	bpf <bpf@vger.kernel.org>,
	jeffrey.t.kirsher@intel.com, anthony.l.nguyen@intel.com,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Maciej Fijalkowski" <maciejromanfijalkowski@gmail.com>,
	intel-wired-lan <intel-wired-lan@lists.osuosl.org>
Subject: Re: [PATCH bpf-next 1/6] i40e: introduce lazy Tx completions for AF_XDP zero-copy
Date: Fri, 6 Nov 2020 20:09:42 +0100
Message-ID: <CAJ8uoz1nyv-_X5+z-nwyDOc628uYwmUVJCLkXJpsHgFK_QV+wQ@mail.gmail.com>
In-Reply-To: <20201105074511.6935e8b7@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On Thu, Nov 5, 2020 at 4:45 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 5 Nov 2020 15:17:50 +0100 Magnus Karlsson wrote:
> > > I feel like this needs a big fat warning somewhere.
> > >
> > > It's perfectly fine to never complete TCP packets, but AF_XDP could be
> > > used to implement protocols in user space. What if someone wants to
> > > implement something like TSQ?
> >
> > I might misunderstand you, but with TSQ here (for something that
> > bypasses qdisc and any buffering and just goes straight to the driver)
> > you mean the ability to have just a few buffers outstanding and
> > continuously reuse these? If so, that is likely best achieved by
> > setting a low Tx queue size on the NIC. Note that even without this
> > patch, completions could be delayed, though this patch makes that the
> > normal case. In any case, I think this calls for some improved
> > documentation.
>
> TSQ tries to limit the amount of data the TCP stack queues into TC/sched
> and drivers. Say 1MB ~ 16 GSO frames. It will not queue more data until
> some of the transfer is reported as completed.

Thanks. Got it. There is one more use case I can think of for quick
completions of Tx buffers: when metadata is associated with the
completion, for example a Tx timestamp. Not that this capability
exists today, but hopefully it will get added at some point.

Anyway, after some more thinking, I would like to remove this patch
from the patch set and put it on the shelf for a while. The reason is
that if we can get a good busy-poll solution for AF_XDP sockets, we do
not need this patch. With busy poll, the choice of when to complete Tx
buffers would be left to the application in a nice way. If the
application would like to get buffers completed quickly (at the cost
of some performance), it would call sendto() (or friends) soon after
putting the packet on the Tx ring. If maximum throughput is desired
with no regard to when a buffer is returned, sendto() would be called
only after a large batch of packets has been put on the Tx ring. In
other words, no threshold or new knob would be needed, which is much
nicer. So let us wait for Björn's busy-poll patches and see where they
lead. Please protest if you do not agree. Otherwise I will submit a v2
without this patch and with Maciej's proposed simplification.
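
To make the trade-off concrete, here is a toy model in C. It is only
a sketch and not the AF_XDP API: kick() is a hypothetical stand-in for
sendto() on a busy-polled socket, and the model assumes that every
kick lets the driver send and complete everything currently on the Tx
ring. The names tx_model, put_on_tx_ring(), kick() and run_tx() are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not the AF_XDP API. kick() stands in for sendto() and is
 * assumed to complete every buffer currently on the Tx ring. */
struct tx_model {
	size_t on_ring;          /* queued, not yet completed */
	size_t kicks;            /* simulated sendto() calls */
	size_t max_outstanding;  /* most buffers ever awaiting completion */
};

static void put_on_tx_ring(struct tx_model *m)
{
	m->on_ring++;
	if (m->on_ring > m->max_outstanding)
		m->max_outstanding = m->on_ring;
}

static void kick(struct tx_model *m)
{
	m->kicks++;
	m->on_ring = 0; /* assumption: all queued buffers complete */
}

/* Send npkts packets, kicking after every `batch` packets. */
static struct tx_model run_tx(size_t npkts, size_t batch)
{
	struct tx_model m = {0};

	assert(batch > 0);
	for (size_t i = 0; i < npkts; i++) {
		put_on_tx_ring(&m);
		if (m.on_ring == batch)
			kick(&m);
	}
	if (m.on_ring > 0)
		kick(&m); /* flush the tail */
	return m;
}
```

In this model, run_tx(256, 1) needs 256 kicks but never has more than
one buffer outstanding, while run_tx(256, 64) needs only 4 kicks at
the price of up to 64 buffers waiting for completion. The application
picks the batch size, so no driver-side threshold is involved.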

> IIUC you're allowing up to 64 descriptors to linger without reporting
> back that the transfer is done. That means that user space implementing
> a scheme similar to TSQ may see its transfers stalled.
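
The stall described in the quoted text can be illustrated with a toy
model in C. It is a sketch under assumptions, not the i40e code: it
assumes the driver reports completions only once `threshold`
descriptors are done, and that the application caps its in-flight
buffers at `budget`, in the spirit of TSQ. The function name and
parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: returns how many of npkts were handed to the driver
 * before the sender could make no further progress. */
static size_t sent_before_stall(size_t npkts, size_t budget,
				size_t threshold)
{
	size_t sent = 0;       /* packets handed to the driver */
	size_t in_flight = 0;  /* sent but not yet reported complete */

	assert(budget > 0 && threshold > 0);
	while (sent < npkts) {
		if (in_flight < budget) {
			sent++;
			in_flight++;
			continue;
		}
		/* Budget exhausted: wait for the driver to report back. */
		if (in_flight < threshold)
			break; /* assumed: nothing is reported, so we stall */
		in_flight -= threshold;
	}
	return sent;
}
```

With a budget of 16 and a threshold of 64, the model sends 16 packets
and then stalls. With a budget at or above the threshold, or with a
threshold of 1, all packets go out.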

Thread overview: 21+ messages
2020-11-04 14:08 [PATCH bpf-next 0/6] xsk: i40e: Tx performance improvements Magnus Karlsson
2020-11-04 14:08 ` [PATCH bpf-next 1/6] i40e: introduce lazy Tx completions for AF_XDP zero-copy Magnus Karlsson
2020-11-04 23:33   ` Jakub Kicinski
2020-11-04 23:35     ` Jakub Kicinski
2020-11-05 14:17     ` Magnus Karlsson
2020-11-05 15:45       ` Jakub Kicinski
2020-11-06 19:09         ` Magnus Karlsson [this message]
2020-11-04 14:08 ` [PATCH bpf-next 2/6] samples/bpf: increment Tx stats at sending Magnus Karlsson
2020-11-09 20:47   ` [Intel-wired-lan] " John Fastabend
2020-11-10  7:12     ` Magnus Karlsson
2020-11-04 14:08 ` [PATCH bpf-next 3/6] i40e: remove unnecessary sw_ring access from xsk Tx Magnus Karlsson
2020-11-09 20:48   ` [Intel-wired-lan] " John Fastabend
2020-11-04 14:09 ` [PATCH bpf-next 4/6] xsk: introduce padding between more ring pointers Magnus Karlsson
2020-11-09 20:43   ` [Intel-wired-lan] " John Fastabend
2020-11-04 14:09 ` [PATCH bpf-next 5/6] xsk: introduce batched Tx descriptor interfaces Magnus Karlsson
2020-11-09 21:06   ` [Intel-wired-lan] " John Fastabend
2020-11-10  8:28     ` Magnus Karlsson
2020-11-04 14:09 ` [PATCH bpf-next 6/6] i40e: use batched xsk Tx interfaces to increase performance Magnus Karlsson
2020-11-04 23:01   ` Maciej Fijalkowski
2020-11-05  7:19     ` Magnus Karlsson
2020-11-09 21:10   ` [Intel-wired-lan] " John Fastabend
