netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tobias Waldekranz <tobias@waldekranz.com>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: davem@davemloft.net, kuba@kernel.org, andrew@lunn.ch,
	vivien.didelot@gmail.com, f.fainelli@gmail.com,
	j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net,
	netdev@vger.kernel.org
Subject: Re: [PATCH v3 net-next 2/4] net: dsa: Link aggregation support
Date: Mon, 07 Dec 2020 22:49:39 +0100	[thread overview]
Message-ID: <87sg8h5lsc.fsf@waldekranz.com> (raw)
In-Reply-To: <20201204005653.uep7nvtg4ish5xct@skbuf>

On Fri, Dec 04, 2020 at 02:56, Vladimir Oltean <olteanv@gmail.com> wrote:
> On Fri, Dec 04, 2020 at 12:12:32AM +0100, Tobias Waldekranz wrote:
>> You make a lot of good points. I think it might be better to force the
>> user to be explicit about their choice though. Imagine something like
>> this:
>>
>> - We add NETIF_F_SWITCHDEV_OFFLOAD, which is set on switchdev ports by
>>   default. This flag is only allowed to be toggled when the port has no
>>   uppers - we do not want to deal with a port in a LAG in a bridge all
>>   of a sudden changing mode.
>>
>> - If it is set, we only allow uppers/tc-rules/etc that we can
>>   offload. If the user tries to configure something outside of that, we
>>   can suggest disabling offloading in the error we emit.
>>
>> - If it is not set, we just sit back and let the kernel do its thing.
>>
>> This would work well both for exotic LAG modes and for advanced
>> netfilter(ebtables)/tc setups I think. Example session:
>>
>> $ ip link add dev bond0 type bond mode balance-rr
>> $ ip link set dev swp0 master bond0
>> Error: swp0: balance-rr not supported when using switchdev offloading
>> $ ethtool -K swp0 switchdev off
>> $ ip link set dev swp0 master bond0
>> $ echo $?
>> 0
>
> And you want the default to be what, on or off? I believe on?
> I'd say the default should be off. The idea being that you could have
> "write once, run everywhere" types of scripts. You can only get that
> behavior with "off", otherwise you'd get random errors on some equipment
> and it wouldn't be portable. And "ethtool -K swp0 switchdev off" is a
> bit of a strange incantation to add to every script just to avoid
> errors.. But if the default switchdev mode is off, then what's the
> point in even having the knob, your poor Linus will still be confused
> and frustrated, and it won't help him any bit if he can flip the switch
> - it's too late, he already knows what the problem is by the time he
> finds the switch.

Yeah I can not argue with that. OK, I surrender, software fallback it
is.

>> > I would even go out on a limb and say hardcode the TX_TYPE_HASH in DSA
>> > for now. I would be completely surprised to see hardware that can
>> > offload anything else in the near future.
>>
>> If you tilt your head a little, I think active backup is really just a
>> trivial case of a hashed LAG wherein only a single member is ever
>> active. I.e. all buckets are always allocated to one port (effectivly
>> negating the hashing). The active member is controlled by software, so I
>> think we should be able to support that.
>
> Yup, my head is tilted and I see it now. If I understand this mode
> (never used it), then any hardware switch that can offload bridging can
> also offload the active-backup LAG.

Neither have I, but I guess you still need an actual LAG to associate
neighbors with, instead of the physical port? Maybe ocelot is different,
but on mv88e6xxx you otherwise get either (1) packet loss when the
active member changes or (2) duplicated packets when more than one
member is active.

>> mv88e6xxx could also theoretically be made to support broadcast. You can
>> enable any given bucket on multiple ports, but that seems silly.
>
> Yeah, the broadcast bonding mode looks like an oddball. It sounds to me
> almost like HSR/PRP/FRER but without the sequence numbering, which is a
> surefire way to make a mess out of everything. I have no idea how it is
> used (how duplicate elimination is achieved).

That is the way I interpret it as well. I suppose the dedup is done on
some higher layer, or it is some kind of redundant rx-only recorder or
something.

  reply	other threads:[~2020-12-07 21:50 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-02  9:13 [PATCH v3 net-next 0/4] net: dsa: Link aggregation support Tobias Waldekranz
2020-12-02  9:13 ` [PATCH v3 net-next 1/4] net: bonding: Notify ports about their initial state Tobias Waldekranz
2020-12-02 19:09   ` Jay Vosburgh
2020-12-02 21:52     ` Tobias Waldekranz
2020-12-03  0:39       ` Jay Vosburgh
2020-12-03  8:16         ` Tobias Waldekranz
2020-12-02  9:13 ` [PATCH v3 net-next 2/4] net: dsa: Link aggregation support Tobias Waldekranz
2020-12-02 10:07   ` Vladimir Oltean
2020-12-02 10:51     ` Tobias Waldekranz
2020-12-02 18:58   ` Jakub Kicinski
2020-12-02 21:29     ` Tobias Waldekranz
2020-12-02 21:32       ` Vladimir Oltean
2020-12-03 16:24   ` Vladimir Oltean
2020-12-03 20:53     ` Tobias Waldekranz
2020-12-03 21:09       ` Andrew Lunn
2020-12-03 21:35         ` Tobias Waldekranz
2020-12-04  0:35           ` Vladimir Oltean
2020-12-03 21:57       ` Vladimir Oltean
2020-12-03 23:12         ` Tobias Waldekranz
2020-12-04  0:56           ` Vladimir Oltean
2020-12-07 21:49             ` Tobias Waldekranz [this message]
2020-12-04  1:33         ` Andrew Lunn
2020-12-04  4:18           ` Florian Fainelli
2020-12-07 21:56             ` Tobias Waldekranz
2020-12-03 20:48   ` Vladimir Oltean
2020-12-04  2:20   ` Andrew Lunn
2020-12-07 21:19     ` Tobias Waldekranz
2020-12-07 23:26       ` Andrew Lunn
2020-12-09  8:57         ` Tobias Waldekranz
2020-12-09 14:27           ` Andrew Lunn
2020-12-09 15:21             ` Tobias Waldekranz
2020-12-09 23:03               ` Andrew Lunn
2020-12-04  4:04   ` Florian Fainelli
2020-12-08 11:23   ` Vladimir Oltean
2020-12-08 15:33     ` Tobias Waldekranz
2020-12-08 16:37       ` Vladimir Oltean
2020-12-09  8:37         ` Tobias Waldekranz
2020-12-09 10:53           ` Vladimir Oltean
2020-12-09 14:11             ` Tobias Waldekranz
2020-12-09 16:04               ` Vladimir Oltean
2020-12-09 22:01                 ` Tobias Waldekranz
2020-12-09 22:21                   ` Vladimir Oltean
2020-12-10 10:18                     ` Tobias Waldekranz
2020-12-09 22:59                 ` Andrew Lunn
2020-12-10  1:05                   ` Vladimir Oltean
2020-12-09 14:23             ` Andrew Lunn
2020-12-09 23:17               ` Vladimir Oltean
2020-12-08 17:26     ` Andrew Lunn
2020-12-11 20:50     ` Tobias Waldekranz
2020-12-12 14:26       ` Vladimir Oltean
2020-12-13 21:18         ` Tobias Waldekranz
2020-12-14  0:12           ` Vladimir Oltean
2020-12-14 11:42             ` Ido Schimmel
2020-12-16 15:15               ` Tobias Waldekranz
2020-12-16 18:48                 ` Ido Schimmel
2020-12-14  9:41           ` Tobias Waldekranz
2020-12-02  9:13 ` [PATCH v3 net-next 3/4] net: dsa: mv88e6xxx: " Tobias Waldekranz
2020-12-02  9:13 ` [PATCH v3 net-next 4/4] net: dsa: tag_dsa: Support reception of packets from LAG devices Tobias Waldekranz
2020-12-04  3:58   ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sg8h5lsc.fsf@waldekranz.com \
    --to=tobias@waldekranz.com \
    --cc=andrew@lunn.ch \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=j.vosburgh@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=vfalico@gmail.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).