netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: David Ahern <dsahern@gmail.com>, David Ahern <dsahern@kernel.org>,
	netdev@vger.kernel.org, prashantbhole.linux@gmail.com,
	jasowang@redhat.com, davem@davemloft.net, jbrouer@redhat.com,
	mst@redhat.com, toshiaki.makita1@gmail.com, daniel@iogearbox.net,
	john.fastabend@gmail.com, ast@kernel.org, kafai@fb.com,
	songliubraving@fb.com, yhs@fb.com, andriin@fb.com,
	David Ahern <dahern@digitalocean.com>
Subject: Re: [PATCH bpf-next 03/12] net: Add IFLA_XDP_EGRESS for XDP programs in the egress path
Date: Sat, 1 Feb 2020 20:15:08 -0800
Message-ID: <20200201201508.63141689@cakuba.hsd1.ca.comcast.net>
In-Reply-To: <87sgjucbuf.fsf@toke.dk>

On Sat, 01 Feb 2020 21:05:28 +0100, Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> > On Sat, 01 Feb 2020 17:24:39 +0100, Toke Høiland-Jørgensen wrote:  
> >> > I'm weary of partially implemented XDP features, EGRESS prog does us
> >> > no good when most drivers didn't yet catch up with the REDIRECTs.    
> >> 
> >> I kinda agree with this; but on the other hand, if we have to wait for
> >> all drivers to catch up, that would mean we couldn't add *anything*
> >> new that requires driver changes, which is not ideal either :/  
> >
> > If EGRESS is only for XDP frames we could try to hide the handling in
> > the core (with slight changes to XDP_TX handling in the drivers),
> > making drivers smaller and XDP feature velocity higher.  
> 
> But if it's only for XDP frames that are REDIRECTed, then one might as
> well perform whatever action the TX hook was doing before REDIRECTing
> (as you yourself argued)... :)

Right, that's why I think the design needs to start from queuing, which
can't be done today and has to be done in the context of the destination.
Solving queuing justifies the added complexity, if you will :)

> > I think loading the drivers with complexity is hurting us in so many
> > ways..  
> 
> Yeah, but having the low-level details available to the XDP program
> (such as HW queue occupancy for the egress hook) is one of the benefits
> of XDP, isn't it?

I think I glossed over the hope of having access to HW queue occupancy.
What exactly are you after?

I don't think one can get anything beyond BQL-type granularity.
Reading over PCIe is out of the question, and device write-back at high
granularity would burn through way too much bus throughput.
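
To make "BQL-type granularity" concrete, here is a minimal sketch in C of
byte-based in-flight accounting. It is an illustration under simplifying
assumptions, not the kernel's dynamic queue limits code: the struct and the
bl_* helpers are hypothetical names, and the limit is fixed rather than
adapted at runtime. The kernel's BQL is fed by netdev_tx_sent_queue() and
netdev_tx_completed_queue(); the sketch only mirrors that sent/completed
split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified byte-limit tracker (fixed limit, no tuning). */
struct byte_limit {
	uint64_t queued;    /* total bytes handed to the device so far */
	uint64_t completed; /* total bytes the device has reported as sent */
	uint64_t limit;     /* maximum bytes allowed in flight */
};

/* Bytes posted to the hardware that have not been completed yet. */
static uint64_t bl_inflight(const struct byte_limit *bl)
{
	return bl->queued - bl->completed;
}

/* Called when a packet is posted to the TX ring. */
static void bl_sent(struct byte_limit *bl, uint32_t bytes)
{
	bl->queued += bytes;
}

/* Called from TX completion handling, typically once per batch. */
static void bl_completed(struct byte_limit *bl, uint32_t bytes)
{
	assert(bytes <= bl_inflight(bl));
	bl->completed += bytes;
}

/* True when the queue should be stopped until completions arrive. */
static bool bl_should_stop(const struct byte_limit *bl)
{
	return bl_inflight(bl) >= bl->limit;
}
```

In this model the occupancy estimate only changes when bl_sent() or
bl_completed() runs, so it is never finer than one completion batch. That
is the granularity limit described above: the host sees what it posted and
what the device has acknowledged, not the live state of the hardware queue.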

> Ultimately, I think Jesper's idea of having drivers operate exclusively
> on XDP frames and have the skb handling entirely in the core is an
> intriguing way to resolve this problem. Though this is obviously a
> long-term thing, and one might reasonably doubt we'll ever get there for
> existing drivers...
> 
> >> > And we're adding this before we considered the queuing problem.
> >> >
> >> > But if I'm alone in thinking this, and I'm not convincing anyone we
> >> > can move on :)    
> >> 
> >> I do share your concern that this will end up being incompatible with
> >> whatever solution we end up with for queueing. However, I don't
> >> necessarily think it will: I view the XDP egress hook as something
> >> that in any case will run *after* packets are dequeued from whichever
> >> intermediate queueing it has been through (if any). I think such a
> >> hook is missing in any case; for instance, it's currently impossible
> >> to implement something like CoDel (which needs to know how long a
> >> packet spent in the queue) in eBPF.  
> >
> > Possibly 🤔 I don't have a good mental image of how the XDP queuing
> > would work.
> >
> > Maybe once the queuing primitives are defined they can easily be
> > hooked into the Qdisc layer. With Martin's recent work all we need is 
> > a fifo that can store skb pointers, really...
> >
> > It'd be good if the BPF queuing could replace TC Qdiscs, rather than 
> > layer underneath.  
> 
> Hmm, hooking into the existing qdisc layer is an interesting idea.
> Ultimately, I fear it won't be feasible for performance reasons; but
> it's certainly something to consider. Maybe at least as an option?

For forwarding sure, but for locally generated traffic.. 🤷‍♂️
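
Two points quoted above fit together: a fifo that stores packet pointers,
and CoDel's need to know how long a packet spent in the queue. The C sketch
below shows one way they could combine, assuming a timestamp is recorded at
enqueue and a dequeue-side hook gets the current time. All names
(ptr_fifo, fifo_enqueue, fifo_dequeue, FIFO_SLOTS) are hypothetical, the
void pointer stands in for an skb or XDP frame, and nothing here reflects
an existing kernel or BPF interface.

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_SLOTS 8 /* illustrative capacity; must be a power of two */

struct fifo_entry {
	void *pkt;           /* stand-in for an skb or XDP frame pointer */
	uint64_t enqueue_ns; /* timestamp taken when the packet was queued */
};

struct ptr_fifo {
	struct fifo_entry slot[FIFO_SLOTS];
	unsigned int head; /* index of the next entry to dequeue */
	unsigned int tail; /* index of the next free slot */
};

/* Queue a packet pointer; returns 0 on success, -1 if the fifo is full. */
static int fifo_enqueue(struct ptr_fifo *f, void *pkt, uint64_t now_ns)
{
	if (f->tail - f->head == FIFO_SLOTS)
		return -1;
	f->slot[f->tail % FIFO_SLOTS] = (struct fifo_entry){ pkt, now_ns };
	f->tail++;
	return 0;
}

/*
 * Dequeue the oldest packet pointer, or return NULL if the fifo is empty.
 * On success, *sojourn_ns is set to the time the packet spent queued.
 */
static void *fifo_dequeue(struct ptr_fifo *f, uint64_t now_ns,
			  uint64_t *sojourn_ns)
{
	struct fifo_entry e;

	if (f->head == f->tail)
		return NULL;
	e = f->slot[f->head % FIFO_SLOTS];
	f->head++;
	*sojourn_ns = now_ns - e.enqueue_ns;
	return e.pkt;
}
```

The sojourn time only exists at dequeue, which is why a hook that runs
after packets leave the queue is the natural place for a CoDel-style
decision. Whether such a fifo would sit under the Qdisc layer or replace it
is exactly the open question in this thread; the sketch takes no position.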
