From: Nikolay Aleksandrov <nikolay@nvidia.com>
To: Vladimir Oltean <olteanv@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>
Cc: Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Tobias Waldekranz <tobias@waldekranz.com>,
	Claudiu Manoil <claudiu.manoil@nxp.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Roopa Prabhu <roopa@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
	Ido Schimmel <idosch@idosch.org>,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	UNGLinuxDriver@microchip.com, Ivan Vecera <ivecera@redhat.com>,
	linux-omap@vger.kernel.org,
	Vladimir Oltean <vladimir.oltean@nxp.com>
Subject: Re: [PATCH v3 net-next 00/12] Better support for sandwiched LAGs with bridge and DSA
Date: Mon, 22 Mar 2021 18:27:24 +0200
Message-ID: <3230fb8e-eb75-acc1-e53c-c2525e022370@nvidia.com>
In-Reply-To: <20210320223448.2452869-1-olteanv@gmail.com>

On 21/03/2021 00:34, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> The objective of this series is to make LAG uppers on top of switchdev
> ports work regardless of which order we link interfaces to their masters
> (first make the port join the LAG, then the LAG join the bridge, or the
> other way around).
> 
> There was a design decision to be made in patches 2-4 on whether we
> should adopt the "push" model (which attempts to solve the problem
> centrally, in the bridge layer) where the driver just calls:
> 
>   switchdev_bridge_port_offloaded(brport_dev,
>                                   &atomic_notifier_block,
>                                   &blocking_notifier_block,
>                                   extack);
> 
> and the bridge just replays the entire collection of switchdev port
> attributes and objects that it has, in some predefined order and with
> some predefined error handling logic;
> 
> 
> or the "pull" model (which attempts to solve the problem by giving the
> driver the rope to hang itself), where the driver, apart from calling:
> 
>   switchdev_bridge_port_offloaded(brport_dev, extack);
> 
> has the task of "dumpster diving" (as Tobias puts it) through the bridge
> attributes and objects by itself, by calling:
> 
>   - br_vlan_replay
>   - br_fdb_replay
>   - br_mdb_replay
>   - br_vlan_enabled
>   - br_port_flag_is_set
>   - br_port_get_stp_state
>   - br_multicast_router
>   - br_get_ageing_time
> 
> (not necessarily all of them, and not necessarily in this order, and
> with driver-defined error handling).
> 
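
The difference between the two models described above can be sketched in user space. This is only an illustrative model, not kernel code: push_replay, pull_join, the (kind, value) tuples and the "abort on first error" policy are all assumptions made for the sketch, not part of the bridge or switchdev API.

```python
def push_replay(state, notify):
    """Push model: the bridge walks everything it holds, in its own order.

    Assumed policy for this sketch: the first error aborts the replay.
    """
    for kind, value in state:
        notify(kind, value)  # any exception propagates to the caller


def pull_join(state, order, handlers, ignore_errors=()):
    """Pull model: the driver picks what to fetch, in which order,
    and which failures it is willing to tolerate."""
    by_kind = {}
    for kind, value in state:
        by_kind.setdefault(kind, []).append(value)

    applied = []
    for kind in order:                      # driver-defined order
        for value in by_kind.get(kind, []):
            try:
                handlers[kind](value)
            except ValueError:
                if kind not in ignore_errors:
                    raise                   # driver treats this as fatal
                continue                    # driver chose to skip it
            applied.append((kind, value))
    return applied
```

With pull_join, a driver that does not care about one kind of object simply leaves it out of order, which is the flexibility the "pull" model trades for duplicated logic across drivers.
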
> Even though I'm not in love myself with the "pull" model, I chose it
> because there is a fundamental trick with replaying switchdev events
> like this:
> 
> ip link add br0 type bridge
> ip link add bond0 type bond
> ip link set bond0 master br0
> ip link set swp0 master bond0 <- this will replay the objects once for
>                                  the bond0 bridge port, and the swp0
>                                  switchdev port will process them
> ip link set swp1 master bond0 <- this will replay the objects again for
>                                  the bond0 bridge port, and the swp1
>                                  switchdev port will see them, but swp0
>                                  will see them for the second time now
> 
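
The double replay in the command sequence above can be reproduced with a small user-space model. It is a sketch under stated assumptions: BridgePort, join and simulate are hypothetical names, and the two example objects are made up; the only behaviour taken from the description is that each join replays the objects for the whole bond0 bridge port.

```python
from collections import Counter


class BridgePort:
    """Stand-in for a bridge port backed by a LAG such as bond0."""

    def __init__(self, name, objects):
        self.name = name
        self.objects = list(objects)  # objects the bridge holds for this port
        self.lowers = []              # switchdev ports enslaved to the LAG

    def join(self, port_name, seen):
        """Enslave a port, then replay all objects for the bridge port.

        The replay targets the bridge port, so every current lower is
        notified, including ports that already processed the objects.
        """
        self.lowers.append(port_name)
        for obj in self.objects:
            for lower in self.lowers:
                seen[(lower, obj)] += 1


def simulate():
    """Return how many times each (port, object) pair was notified."""
    seen = Counter()
    bond0 = BridgePort("bond0", ["vlan 1", "fdb 00:01:02:03:04:05"])
    bond0.join("swp0", seen)  # swp0 sees every object once
    bond0.join("swp1", seen)  # swp1 sees them once, swp0 a second time
    return seen
```

In this model simulate() counts each object twice for swp0 and once for swp1; whether the second sighting is an error is left to whoever consumes the counts, which mirrors the point that the driver, not the bridge, should decide.
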
> Basically I believe that it is implementation defined whether the driver
> wants to error out on switchdev objects seen twice on a port, and the
> bridge should not enforce a certain model for that. For example, for FDB
> entries added to a bonding interface, the underlying switchdev driver
> might have an abstraction for just that: an FDB entry pointing towards a
> logical (as opposed to physical) port. So when the second port joins the
> bridge, it doesn't really need to replay FDB entries, since there is
> already at least one hardware port which has been receiving those
> events, and the FDB entries don't need to be added a second time to the
> same logical port.
> In the other corner, we have the drivers that handle switchdev port
> attributes on a LAG as individual switchdev port attributes on physical
> ports (example: VLAN filtering). In fact, the switchdev_handle_port_attr_set
> helper facilitates this: it is a fan-out from a single orig_dev towards
> multiple lowers that pass the check_cb().
> But that's the point: switchdev_handle_port_attr_set is just a helper
> which the driver _opts_ to use. The bridge can't enforce the "push"
> model, because that would assume that all drivers handle port attributes
> in the same way, which is probably false.
> 
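
The fan-out from a single orig_dev towards the lowers that pass check_cb() can be modelled as follows. This is a simplified user-space analogue and does not reproduce the signature of switchdev_handle_port_attr_set: fan_out_attr, the lowers mapping and the attribute tuple are hypothetical names chosen for the sketch.

```python
def fan_out_attr(dev, attr, lowers, check_cb, set_cb):
    """Deliver attr to dev if the driver claims it, else to its lowers.

    dev      -- name of the device the attribute was set on (orig_dev)
    lowers   -- mapping from an upper device to its lower devices
    check_cb -- driver callback: does this device belong to me?
    set_cb   -- driver callback: apply the attribute to one port
    Returns the list of devices the attribute was applied to.
    """
    if check_cb(dev):
        set_cb(dev, attr)
        return [dev]

    handled = []
    for lower in lowers.get(dev, []):
        handled.extend(fan_out_attr(lower, attr, lowers, check_cb, set_cb))
    return handled
```

A driver that keeps one logical port per LAG would not call such a helper at all and would handle the attribute once for bond0, which is why the fan-out has to remain opt-in.
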
> For this reason, I preferred to go with the "pull" model for this patch
> set. Just to see how bad it is for other switchdev drivers to copy-paste
> this logic, I added the pull support to ocelot too, and I think it's
> pretty manageable.
> 
> Vladimir Oltean (12):
>   net: dsa: call dsa_port_bridge_join when joining a LAG that is already
>     in a bridge
>   net: dsa: pass extack to dsa_port_{bridge,lag}_join
>   net: dsa: inherit the actual bridge port flags at join time
>   net: dsa: sync up with bridge port's STP state when joining
>   net: dsa: sync up VLAN filtering state when joining the bridge
>   net: dsa: sync multicast router state when joining the bridge
>   net: dsa: sync ageing time when joining the bridge
>   net: dsa: replay port and host-joined mdb entries when joining the
>     bridge
>   net: dsa: replay port and local fdb entries when joining the bridge
>   net: dsa: replay VLANs installed on port when joining the bridge
>   net: ocelot: call ocelot_netdevice_bridge_join when joining a bridged
>     LAG
>   net: ocelot: replay switchdev events when joining bridge
> 
>  drivers/net/dsa/ocelot/felix.c         |   4 +-
>  drivers/net/ethernet/mscc/ocelot.c     |  18 +--
>  drivers/net/ethernet/mscc/ocelot_net.c | 208 +++++++++++++++++++++----
>  include/linux/if_bridge.h              |  40 +++++
>  include/net/switchdev.h                |   1 +
>  include/soc/mscc/ocelot.h              |   6 +-
>  net/bridge/br_fdb.c                    |  52 +++++++
>  net/bridge/br_mdb.c                    |  84 ++++++++++
>  net/bridge/br_stp.c                    |  27 ++++
>  net/bridge/br_vlan.c                   |  71 +++++++++
>  net/dsa/dsa_priv.h                     |   9 +-
>  net/dsa/port.c                         | 203 ++++++++++++++++++------
>  net/dsa/slave.c                        |  11 +-
>  13 files changed, 631 insertions(+), 103 deletions(-)
> 

Hi Vladimir,
Please pull all of the new bridge code into separate patches with the proper
bridge subsystems tagged in the subject.
I'll review the bridge changes in a minute.

Thanks,
 Nik
