netdev.vger.kernel.org archive mirror
From: "Tobias Waldekranz" <tobias@waldekranz.com>
To: "Vladimir Oltean" <olteanv@gmail.com>
Cc: <andrew@lunn.ch>, <vivien.didelot@gmail.com>,
	<f.fainelli@gmail.com>, <netdev@vger.kernel.org>,
	"Ido Schimmel" <idosch@idosch.org>
Subject: Re: [RFC PATCH 4/4] net: dsa: tag_edsa: support reception of packets from lag devices
Date: Wed, 28 Oct 2020 23:31:58 +0100
Message-ID: <C6OVPVXHQ5OA.21IJYAHUW1SW4@wkz-x280>
In-Reply-To: <20201028181824.3dccguch7d5iij2r@skbuf>

On Wed Oct 28, 2020 at 9:18 PM CET, Vladimir Oltean wrote:
> Let's say you receive a packet on the standalone swp0, and you need to
> perform IP routing towards the bridged domain br0. Some switchdev/DSA
> ports are bridged and some aren't.
>
> The switchdev/DSA switch will attempt to do the IP routing step first,
> and it _can_ do that because it is aware of the br0 interface, so it
> will decrement the TTL and replace the L2 header.
>
> At this stage we have a modified IP packet, which corresponds with what
> should be injected into the hardware's view of the br0 interface. The
> packet is still in the switchdev/DSA hardware data path.
>
> But then, the switchdev/DSA hardware will look up the FDB in the name of
> br0, in an attempt of finding the destination port for the packet. But
> the packet should be delivered to a station connected to eth0 (e1000,
> foreign interface). So that's part of the exception path, the packet
> should be delivered to the CPU.
>
> But the packet was already modified by the hardware data path (IP
> forwarding has already taken place)! So how should the DSA/switchdev
> hardware deliver the packet to the CPU? It has 2 options:
>
> (a) unwind the entire packet modification, cancel the IP forwarding and
> deliver the unmodified packet to the CPU on behalf of swp0, the
> ingress port. Then let software IP forwarding plus software bridging
> deal with it, so that it can reach the e1000.
> (b) deliver the packet to the CPU in the middle of the hardware
> forwarding data path, where the exception/miss occurred, aka deliver
> it on behalf of br0. Modified by IP forwarding. This is where we'd
> have to manually inject skb->dev into br0 somehow.

The thing is, unlike L2 where the hardware will add new neighbors to
its FDB autonomously, every entry in the hardware FIB is under the
strict control of the CPU. So I think you can avoid much of this
headache simply by determining if a given L3 nexthop/neighbor is
"foreign" to the switch or not, and then just skip offloading for
those entries.
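
To make that concrete, here is a rough sketch of the check I have in
mind. It is a standalone model, not kernel code: the sketch_* structs
and helpers are hypothetical names made up for illustration, and
treating a still-unresolved neighbor as foreign is an assumption on
my part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not kernel API: a port either belongs to the
 * switch tree or is a foreign interface (e.g. an e1000 NIC). */
struct sketch_port {
	const char *name;
	bool on_switch;
};

/* Hypothetical nexthop: the neighbor's resolved egress port, or NULL
 * while the neighbor is still unresolved. */
struct sketch_nexthop {
	const struct sketch_port *egress;
};

/* A nexthop is foreign if the hardware cannot deliver to it itself.
 * Assumption: unresolved neighbors are treated as foreign too. */
static bool sketch_nh_is_foreign(const struct sketch_nexthop *nh)
{
	assert(nh != NULL);
	return nh->egress == NULL || !nh->egress->on_switch;
}

/* The CPU owns the hardware FIB, so it can decline to install any
 * entry whose nexthop is foreign and leave those to software. */
static bool sketch_should_offload(const struct sketch_nexthop *nh)
{
	return !sketch_nh_is_foreign(nh);
}
```

With this, a nexthop behind swp1 would be offloaded, while one behind
eth0 would stay in software, so the hardware never starts forwarding
a packet that it cannot finish delivering.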

You miss out on the hardware acceleration of replacing the L2 header
of course. But my guess would be that once you have paid the tax of
receiving the buffer via the NIC driver, allocating an skb, calling
netif_rx() etc., the routing operation will be a rounding error. At
least on smaller devices where the FIB is typically quite small.

> Maybe this sounds a bit crazy, considering that we don't have IP
> forwarding hardware with DSA today, and I am not exactly sure how other
> switchdev drivers deal with this exception path today. But nonetheless,
> it's almost impossible for DSA switches with IP forwarding abilities to
> never come up some day, so we ought to have our mind set about how the
> RX data path should look, and whether injecting directly into an upper
> is icky or a fact of life.

Not crazy at all. In fact the Amethyst (6393X), for which there is a
patchset available on netdev, is capable of doing this (the hardware
is - the posted patches do not implement it).

> Things get even more interesting when this is a cascaded DSA setup, and
> the bridging and routing are cross-chip. There, the FIB/FDB of 2 there
> isn't really any working around the problem that the packet might need
> to be delivered to the CPU somewhere in the middle of the data path, and
> it would need to be injected into the RX path of an upper interface in
> that case.
>
> What do you think?


