All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Berg <johannes@sipsolutions.net>
To: "Toke Høiland-Jørgensen" <toke@toke.dk>,
	"Michał Kazior" <kazikcz@gmail.com>,
	lorenzo.bianconi@redhat.com
Cc: kvalo@codeaurora.org,
	linux-wireless <linux-wireless@vger.kernel.org>,
	Felix Fietkau <nbd@nbd.name>,
	brouer@redhat.com
Subject: Re: [RFC 0/5] add XDP support to mt76x2e/mt76x0e drivers
Date: Mon, 03 Dec 2018 18:57:04 +0100	[thread overview]
Message-ID: <bb855ffd6043743ef10058c9395d0c6b42a7832a.camel@sipsolutions.net> (raw)
In-Reply-To: <87k1kwq9bo.fsf@toke.dk> (sfid-20181129_145320_748252_63BF551A)

Hi Toke, all,

Sorry I wasn't around for the discussion, I've been travelling and sick.

I'm picking this as an arbitrary point in the discussion to reply to,
hope you don't feel bad about that.

>> [A-MPDU reordering]
> In principle, all of this could be done. But we need to think carefully
> about what things it really makes sense to offload to XDP. In the
> general case, we will end up re-implementing all of mac80211 in eBPF,
> which is obviously not ideal.

Agree with this, completely. Mind you, it's not even simple to do that,
because each kind of hardware behaves differently. See below.

> For crypto in particular, I wonder if there is much of a speedup to be
> had if the crypto is in software anyway. Wouldn't that dominate the
> processing time?

Yeah, I'd also think the cost of skb allocation is dwarfed by the cost
of doing the crypto, but not really sure, maybe you have a fast special
AES instruction? :)


Anyway, here are my thoughts after reading most of the thread and having
discussed it with Lorenzo a while back and today:

1) I'm going to restrict most of the discussion to mac80211, but there
   are interesting cases outside of it as well. Much of the same
   arguments apply, though it means *more* feature combinatorial
   explosion to be able to handle everything.

2) We don't yet have it (but it's been suggested for years and we might
   also get it on Intel hardware eventually), but if we have TX and/or
   RX 802.3/802.11 frame conversion offload then I think we have no
   argument - XDP applies pure and simple as is in each direction you
   have offloads for, unless you do something really stupid in HW like
   still having to do reorder buffer or PN checking based on RX frame
   metadata. I hope nobody does that, but then I suppose you could do it
   before the XDP hooks w/o allocating SKBs.

3) TX from XDP (i.e. X -> wifi) could be implemented in mac80211, but is
   subject to some rather iffy details, for example:
   a) mac80211 may need more headroom than the input has, do we have to
      copy the whole data? depending on xdp_mem_type I think we may need
      that, rather than being able to make an skb with frag pages and
      extra headroom? We don't know it was just a single page and I
      guess we can't ref that page anyway, but maybe we could meddle
      with the SKB frag data somehow to avoid that.
   b) mac80211 may need to software encrypt - copy the whole data
   c) lifetime issues - we need to call xdp_return_frame(), which I
      guess we can tie to the skb we'd create anyway [2]
   (and IMHO we need to allocate an skb anyway, due to TXQs like Toke
    mentioned and much other possible software handling)

4) XDP during RX - well, that's _mostly_ what this thread has been
   about, but really I think it's a can of worms. Consider:
   a) drivers may or may not do SW decryption
   b) drivers may or may not do PN checking - so your XDP program might
      end up having to do that or risk security/replay bugs!
      (which, btw, you can only do after reordering so see below)
   c) drivers may or may not do A-MSDU deaggregation, do you[1] really
      want to handle that in XDP? You could argue outer code should walk
      over the subframes, but then dropping becomes hard - and this
      really should only happen after a) and b)
      Oh, and IIRC there are cases that use A-MSDUs for single frames
      just to encapsulate different MAC addresses.
      Speaking of which, you have to validate the inner MAC addresses
      (but it's not a trivial comparison) or risk security bugs again.
   d) drivers may or may not do A-MPDU reordering (this was discussed
      earlier in the thread), which means even if you drop a frame you
      haven't just affected that connection but potentially stalled
      everything until the reorder buffer times out (~100ms), or we need
      a way for mac80211 to insert a fake skb into the reorder buffer at
      that spot (but then you allocate an skb again and high-speed
      traffic where you'd care most is always in an A-MPDU session)
   e) speaking of which, Intel's device has hardware assist things for
      reordering, but not fully offloaded reordering, so you might end
      up with hardware-specific ways of doing this in XDP?
   f) Some data frames are used internally in the stack, e.g. for TDLS.
   g) data frames may also affect the powersave state of a station, so
      if you have e.g. ath9k and the station sends a frame to wake up,
      but you decide you XDP_DROP it, the powersave state will get
      messed up, unless ... special handling (either mac80211 or the
      bpf program) again. Also, U-APSD.
   h) There are also simple things like QoS/non-QoS that add to the
      combinatorial explosion of things you have to handle
   i) data frames could be sent by anyone, not just STAs connected to
      your network, so I guess you'd have to filter that and mac80211
      would have to expose a station table lookup helper or something?
      Alternatively shuffle the code around so mac80211 does that first,
      but then we already have an SKB anyway since we got it from the
      driver, and you want this stuff to run in the driver.
   j) in AP mode, if data frames are sent by somebody not belonging to
      the network, we should send an event to hostapd so it can tell
      them they're not on the network
   k) some drivers implement client-side powersave in mac80211 and need
      the more-data bit even if you were to XDP_DROP a frame. I hate
      this code with a passion, but it's there for now.
   m) did somebody mention fragmentation yet? and the special security
      handling for CCMP/GCMP there?
   n) I guess checking for port authorized/not would be one of the more
      trivial things, just adds another helper/bit/state/something.
   o) not supported yet by anyone (well, mac80211 at least), but the
      standard is adding provisions to leave out the SNAP/LLC header,
      which requires more out-of-band data (i.e. data not directly in
      the frame)
   p) there's also iwlwifi with it's multiqueue RX, but let's not go
      there now

[Toke, do you still think it's not sufficiently different? ;-P]

Of course with XDP anyone can shoot themselves in the foot easily, but I
also think we shouldn't make it too easy to do things like XDP_REDIRECT
on a frame that hasn't even had its security properties validated. It'd
work, but an attacker's frame could be accepted as well. We have enough
security bugs as is, I'm not sure I'd want that extra surface...


Now, of course you can pick each one of these points and say "but that's
not relevant for my ${favourite_hw}", which is true. Each hardware
(family) will likely make *most* of fixed or not relevant. Restrict
sufficiently though and you'll end up with *exactly* ${favourite_hw}
doing it the way you want to assume it is like.

E.g. let's say you require PN checking in hardware - maybe 2 or 3
drivers (I'm thinking ath10k, maybe mt76, maybe some ralink family)?

Or let's say you want A-MPDU reordering offload to not have to worry
about that and replays and all - I'm guessing only ath10k now and that
perhaps even only in 802.3 mode?

Or let's say you have enough filtering to not have to worry about 4j),
what if you enable monitor mode at the same time? Can you tell the
difference in the driver before calling the XDP program?


As fun as it sounds (on paper) to have XDP available for wifi, I'm not
sure we really should do that. We'd have to handle many of the issues
outlined above (which is at minimum a LOT of helper code that each at
least somewhat secure and portable XDP-wifi program would have to call),
and it looks like in the future we'll be moving more towards offloaded
802.11/.3 frame conversion anyway, at which point it all becomes more or
less useless.

So I see a few ways this might go:
 A) We basically allow everything, and let the XDP program authors
    figure it out. They'll eventually learn to handle the whole mac80211
    datapath (items in 4)), and will add a TON of helper code for it.
 B) We try to foresee all the needed helper code, and essentially build
    a parallel mac80211 datapath equivalent to the one today, but
    without SKBs.
 C) We rebuild the mac80211 datapath w/o SKBs so we can put XDP behind
    it. Ugh.
 D) We restrict it to ~1 driver that does most of the important things
    anyway in the device, thus hardcoding most of the variables, but
    then we expand out to A) or B)
 E) Like D, but then that device learns to do 802.3/.11 conversion.
    Oops.
 F) Like D, but then that *driver* learns to do 802.3/.11 conversion.
   
Also oops.

Maybe F) kind of has a good direction in it? If the device does enough
that we don't have to worry about most things, then - provided we
actually implement the code to allow 802.3/.11 conversion offload in
mac80211 - the driver can probably easily do that itself rather than
punting to mac80211's fast-RX/fast-TX, which would likely be used?

Or perhaps that means we should teach mac80211 fast-RX to be skb-less,
or something like that? But I think that doesn't solve the reorder
buffer issue either.


(Let's also not forget that the original use case was something around
defective HW filters for BSSIDs - IMHO that should just be *hardcoded*
into the driver anyway because all other frames are useless to you,
except perhaps if also monitor mode is active but then you need an SKB!)


Ok, this got longer than I wanted...

johannes

[1] generally speaking, no particular "you" intended
[2] btw, i40e_clean_xdp_tx_buffer() looks kinda weird? it returns first
and then DMA-unmaps, you'd think it should be done the other way around?



  reply	other threads:[~2018-12-03 17:57 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-27 22:21 [RFC 0/5] add XDP support to mt76x2e/mt76x0e drivers Lorenzo Bianconi
2018-11-27 22:21 ` [RFC 1/5] mac80211: introduce ieee80211_xdp handler Lorenzo Bianconi
2018-11-27 22:21 ` [RFC 2/5] mac80211: introduce ieee80211_vif_to_netdev routine Lorenzo Bianconi
2018-11-27 22:21 ` [RFC 3/5] mt76: split mt76_dma_rx_reset in init_rx_reset and complete_rx_reset Lorenzo Bianconi
2018-11-27 22:21 ` [RFC 4/5] mt76: make mt76x02_vif_init return int Lorenzo Bianconi
2018-11-27 22:21 ` [RFC 5/5] mt76: add XDP support Lorenzo Bianconi
2018-11-28 10:15 ` [RFC 0/5] add XDP support to mt76x2e/mt76x0e drivers Kalle Valo
2018-11-28 10:44   ` Lorenzo Bianconi
2018-11-28 12:36     ` Toke Høiland-Jørgensen
2018-11-28 12:53       ` Michał Kazior
2018-11-28 14:19         ` Toke Høiland-Jørgensen
2018-11-28 13:11       ` Lorenzo Bianconi
2018-11-28 14:21         ` Toke Høiland-Jørgensen
2018-11-28 14:35           ` Lorenzo Bianconi
2018-11-28 14:43             ` Toke Høiland-Jørgensen
2018-11-28 15:35               ` Lorenzo Bianconi
2018-11-28 23:12                 ` Toke Høiland-Jørgensen
2018-11-29 12:59                   ` Lorenzo Bianconi
2018-11-29 13:29                     ` Toke Høiland-Jørgensen
2018-11-29 13:45                     ` Michał Kazior
2018-11-29 13:53                       ` Toke Høiland-Jørgensen
2018-12-03 17:57                         ` Johannes Berg [this message]
2018-12-03 19:36                           ` Toke Høiland-Jørgensen
2018-12-03 19:41                             ` Johannes Berg
2018-12-03 19:51                               ` Toke Høiland-Jørgensen
2018-12-03 20:00                               ` Lorenzo Bianconi
2018-11-28 15:43       ` Jesper Dangaard Brouer
2018-11-28 15:43         ` Jesper Dangaard Brouer
2018-11-29 10:30         ` Lorenzo Bianconi
2018-11-29 13:27           ` Toke Høiland-Jørgensen
2018-11-29 13:27             ` Toke Høiland-Jørgensen
2018-11-29 13:41             ` Michał Kazior
2018-11-29 13:48               ` Toke Høiland-Jørgensen
2018-11-29 13:58             ` Lorenzo Bianconi
2018-11-29 14:06               ` Toke Høiland-Jørgensen
2018-11-29 14:06                 ` Toke Høiland-Jørgensen
2018-11-29 15:45                 ` Lorenzo Bianconi
2018-11-29 16:06                   ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bb855ffd6043743ef10058c9395d0c6b42a7832a.camel@sipsolutions.net \
    --to=johannes@sipsolutions.net \
    --cc=brouer@redhat.com \
    --cc=kazikcz@gmail.com \
    --cc=kvalo@codeaurora.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=lorenzo.bianconi@redhat.com \
    --cc=nbd@nbd.name \
    --cc=toke@toke.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.