linux-omap.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tobias Waldekranz <tobias@waldekranz.com>
To: Vladimir Oltean <olteanv@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>
Cc: Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Roopa Prabhu <roopa@nvidia.com>,
	Nikolay Aleksandrov <nikolay@nvidia.com>,
	Jiri Pirko <jiri@resnulli.us>, Ido Schimmel <idosch@idosch.org>,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	UNGLinuxDriver@microchip.com, Vadym Kochan <vkochan@marvell.com>,
	Taras Chornyi <tchornyi@marvell.com>,
	Grygorii Strashko <grygorii.strashko@ti.com>,
	Vignesh Raghavendra <vigneshr@ti.com>,
	Ioana Ciornei <ioana.ciornei@nxp.com>,
	Ivan Vecera <ivecera@redhat.com>,
	linux-omap@vger.kernel.org,
	Vladimir Oltean <vladimir.oltean@nxp.com>
Subject: Re: [RFC PATCH v2 net-next 09/16] net: dsa: replay port and local fdb entries when joining the bridge
Date: Mon, 22 Mar 2021 16:44:41 +0100	[thread overview]
Message-ID: <87wntzmbva.fsf@waldekranz.com> (raw)
In-Reply-To: <20210318231829.3892920-10-olteanv@gmail.com>

On Fri, Mar 19, 2021 at 01:18, Vladimir Oltean <olteanv@gmail.com> wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> When a DSA port joins a LAG that already had an FDB entry pointing to it:
>
> ip link set bond0 master br0
> bridge fdb add dev bond0 00:01:02:03:04:05 master static
> ip link set swp0 master bond0
>
> the DSA port will have no idea that this FDB entry is there, because it
> missed the switchdev event emitted at its creation.
>
> Ido Schimmel pointed this out during a discussion about challenges with
> switchdev offloading of stacked interfaces between the physical port and
> the bridge, and recommended to just catch that condition and deny the
> CHANGEUPPER event:
> https://lore.kernel.org/netdev/20210210105949.GB287766@shredder.lan/
>
> But in fact, we might need to deal with the hard thing anyway, which is
> to replay all FDB addresses relevant to this port, because it isn't just
> static FDB entries, but also local addresses (ones that are not
> forwarded but terminated by the bridge). There, we can't just say 'oh
> yeah, there was an upper already so I'm not joining that'.
>
> So, similar to the logic for replaying MDB entries, add a function that
> must be called by individual switchdev drivers and replays local FDB
> entries as well as ones pointing towards a bridge port. This time, we
> use the atomic switchdev notifier block, since that's what FDB entries
> expect for some reason.
>
> Reported-by: Ido Schimmel <idosch@idosch.org>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  include/linux/if_bridge.h |  9 +++++++
>  include/net/switchdev.h   |  1 +
>  net/bridge/br_fdb.c       | 52 +++++++++++++++++++++++++++++++++++++++
>  net/dsa/dsa_priv.h        |  1 +
>  net/dsa/port.c            |  4 +++
>  net/dsa/slave.c           |  2 +-
>  6 files changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index 4c25dafb013d..89596134e88f 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -147,6 +147,8 @@ void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
>  u8 br_port_get_stp_state(const struct net_device *dev);
>  clock_t br_get_ageing_time(struct net_device *br_dev);
> +int br_fdb_replay(struct net_device *br_dev, struct net_device *dev,
> +		  struct notifier_block *nb);
>  #else
>  static inline struct net_device *
>  br_fdb_find_port(const struct net_device *br_dev,
> @@ -175,6 +177,13 @@ static inline clock_t br_get_ageing_time(struct net_device *br_dev)
>  {
>  	return 0;
>  }
> +
> +static inline int br_fdb_replay(struct net_device *br_dev,
> +				struct net_device *dev,
> +				struct notifier_block *nb)
> +{
> +	return -EINVAL;
> +}
>  #endif
>  
>  #endif
> diff --git a/include/net/switchdev.h b/include/net/switchdev.h
> index b7fc7d0f54e2..7688ec572757 100644
> --- a/include/net/switchdev.h
> +++ b/include/net/switchdev.h
> @@ -205,6 +205,7 @@ struct switchdev_notifier_info {
>  
>  struct switchdev_notifier_fdb_info {
>  	struct switchdev_notifier_info info; /* must be first */
> +	struct list_head list;
>  	const unsigned char *addr;
>  	u16 vid;
>  	u8 added_by_user:1,
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index b7490237f3fc..49125cc196ac 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -726,6 +726,58 @@ static inline size_t fdb_nlmsg_size(void)
>  		+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
>  }
>  
> +static int br_fdb_replay_one(struct notifier_block *nb,
> +			     struct net_bridge_fdb_entry *fdb,
> +			     struct net_device *dev)
> +{
> +	struct switchdev_notifier_fdb_info item;
> +	int err;
> +
> +	item.addr = fdb->key.addr.addr;
> +	item.vid = fdb->key.vlan_id;
> +	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
> +	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
> +	item.info.dev = dev;
> +
> +	err = nb->notifier_call(nb, SWITCHDEV_FDB_ADD_TO_DEVICE, &item);
> +	return notifier_to_errno(err);
> +}
> +
> +int br_fdb_replay(struct net_device *br_dev, struct net_device *dev,
> +		  struct notifier_block *nb)
> +{
> +	struct net_bridge_fdb_entry *fdb;
> +	struct net_bridge *br;
> +	int err = 0;
> +
> +	if (!netif_is_bridge_master(br_dev))
> +		return -EINVAL;
> +
> +	if (!netif_is_bridge_port(dev))
> +		return -EINVAL;
> +
> +	br = netdev_priv(br_dev);
> +
> +	rcu_read_lock();
> +
> +	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
> +		struct net_device *dst_dev;
> +
> +		dst_dev = fdb->dst ? fdb->dst->dev : br->dev;
> +		if (dst_dev != br_dev && dst_dev != dev)
> +			continue;
> +

I do not know if it is a problem or not, more of an observation: This is
not guaranteed to be an exact replay of the events that the bridge port
(i.e. bond0 or whatever) has received since, in fdb_insert, we exit
early when adding local entries if that address is already in the
database.

Do we have to guard against this somehow? Or maybe we should consider
the current behavior a bug and make sure to always send the event in the
first place?

> +		err = br_fdb_replay_one(nb, fdb, dst_dev);
> +		if (err)
> +			break;
> +	}
> +
> +	rcu_read_unlock();
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(br_fdb_replay);
> +
>  static void fdb_notify(struct net_bridge *br,
>  		       const struct net_bridge_fdb_entry *fdb, int type,
>  		       bool swdev_notify)
> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
> index b14c43cb88bb..92282de54230 100644
> --- a/net/dsa/dsa_priv.h
> +++ b/net/dsa/dsa_priv.h
> @@ -262,6 +262,7 @@ static inline bool dsa_tree_offloads_bridge_port(struct dsa_switch_tree *dst,
>  
>  /* slave.c */
>  extern const struct dsa_device_ops notag_netdev_ops;
> +extern struct notifier_block dsa_slave_switchdev_notifier;
>  extern struct notifier_block dsa_slave_switchdev_blocking_notifier;
>  
>  void dsa_slave_mii_bus_init(struct dsa_switch *ds);
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index 6670612f96c6..9850051071f2 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -205,6 +205,10 @@ static int dsa_port_switchdev_sync(struct dsa_port *dp,
>  	if (err && err != -EOPNOTSUPP)
>  		return err;
>  
> +	err = br_fdb_replay(br, brport_dev, &dsa_slave_switchdev_notifier);
> +	if (err && err != -EOPNOTSUPP)
> +		return err;
> +
>  	return 0;
>  }
>  
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index b974d8f84a2e..c51e52418a62 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -2392,7 +2392,7 @@ static struct notifier_block dsa_slave_nb __read_mostly = {
>  	.notifier_call  = dsa_slave_netdevice_event,
>  };
>  
> -static struct notifier_block dsa_slave_switchdev_notifier = {
> +struct notifier_block dsa_slave_switchdev_notifier = {
>  	.notifier_call = dsa_slave_switchdev_event,
>  };
>  
> -- 
> 2.25.1

  reply	other threads:[~2021-03-22 15:45 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-18 23:18 [RFC PATCH v2 net-next 00/16] Better support for sandwiched LAGs with bridge and DSA Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 01/16] net: dsa: call dsa_port_bridge_join when joining a LAG that is already in a bridge Vladimir Oltean
2021-03-19 22:04   ` Florian Fainelli
2021-03-22 10:24   ` Tobias Waldekranz
2021-03-18 23:18 ` [RFC PATCH v2 net-next 02/16] net: dsa: pass extack to dsa_port_{bridge,lag}_join Vladimir Oltean
2021-03-19 22:05   ` Florian Fainelli
2021-03-22 10:25   ` Tobias Waldekranz
2021-03-18 23:18 ` [RFC PATCH v2 net-next 03/16] net: dsa: inherit the actual bridge port flags at join time Vladimir Oltean
2021-03-19 22:08   ` Florian Fainelli
2021-03-20 10:05     ` Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 04/16] net: dsa: sync up with bridge port's STP state when joining Vladimir Oltean
2021-03-19 22:11   ` Florian Fainelli
2021-03-22 10:29   ` Tobias Waldekranz
2021-03-18 23:18 ` [RFC PATCH v2 net-next 05/16] net: dsa: sync up VLAN filtering state when joining the bridge Vladimir Oltean
2021-03-19 22:11   ` Florian Fainelli
2021-03-22 10:30   ` Tobias Waldekranz
2021-03-18 23:18 ` [RFC PATCH v2 net-next 06/16] net: dsa: sync multicast router " Vladimir Oltean
2021-03-19 22:12   ` Florian Fainelli
2021-03-22 11:17   ` Tobias Waldekranz
2021-03-22 11:43     ` Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 07/16] net: dsa: sync ageing time " Vladimir Oltean
2021-03-19 22:13   ` Florian Fainelli
2021-03-20 10:09     ` Vladimir Oltean
2021-03-22 11:20   ` Tobias Waldekranz
2021-03-18 23:18 ` [RFC PATCH v2 net-next 08/16] net: dsa: replay port and host-joined mdb entries " Vladimir Oltean
2021-03-19 22:20   ` Florian Fainelli
2021-03-20  9:53     ` Vladimir Oltean
2021-03-22 15:56       ` Florian Fainelli
2021-03-18 23:18 ` [RFC PATCH v2 net-next 09/16] net: dsa: replay port and local fdb " Vladimir Oltean
2021-03-22 15:44   ` Tobias Waldekranz [this message]
2021-03-22 16:19     ` Vladimir Oltean
2021-03-22 17:07       ` Tobias Waldekranz
2021-03-22 17:13         ` Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 10/16] net: dsa: replay VLANs installed on port " Vladimir Oltean
2021-03-19 22:24   ` Florian Fainelli
2021-03-18 23:18 ` [RFC PATCH v2 net-next 11/16] net: ocelot: support multiple bridges Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 12/16] net: ocelot: call ocelot_netdevice_bridge_join when joining a bridged LAG Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 13/16] net: ocelot: replay switchdev events when joining bridge Vladimir Oltean
2021-03-18 23:18 ` [RFC PATCH v2 net-next 14/16] net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge Vladimir Oltean
2021-03-19  8:52   ` DENG Qingfang
2021-03-19  9:06     ` Vladimir Oltean
2021-03-19  9:29       ` DENG Qingfang
2021-03-19 10:49         ` Vladimir Oltean
2021-03-22  8:04           ` DENG Qingfang
2021-03-22 22:23             ` Vladimir Oltean
2021-03-22 16:06   ` Florian Fainelli
2021-03-18 23:18 ` [RFC PATCH v2 net-next 15/16] net: dsa: return -EOPNOTSUPP when driver does not implement .port_lag_join Vladimir Oltean
2021-03-22 15:51   ` Florian Fainelli
2021-03-22 15:58   ` Tobias Waldekranz
2021-03-18 23:18 ` [RFC PATCH v2 net-next 16/16] net: bridge: switchdev: let drivers inform which bridge ports are offloaded Vladimir Oltean
2021-03-22 16:30   ` Tobias Waldekranz
2021-03-22 17:19     ` Vladimir Oltean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wntzmbva.fsf@waldekranz.com \
    --to=tobias@waldekranz.com \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=alexandre.belloni@bootlin.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=grygorii.strashko@ti.com \
    --cc=idosch@idosch.org \
    --cc=ioana.ciornei@nxp.com \
    --cc=ivecera@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-omap@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@nvidia.com \
    --cc=olteanv@gmail.com \
    --cc=roopa@nvidia.com \
    --cc=tchornyi@marvell.com \
    --cc=vigneshr@ti.com \
    --cc=vivien.didelot@gmail.com \
    --cc=vkochan@marvell.com \
    --cc=vladimir.oltean@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).