netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Oltean <vladimir.oltean@nxp.com>
To: netdev@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Ido Schimmel <idosch@nvidia.com>,
	"Hans J. Schultz" <netdev@kapio-technology.com>,
	Roopa Prabhu <roopa@nvidia.com>,
	Nikolay Aleksandrov <razor@blackwall.org>,
	Ivan Vecera <ivecera@redhat.com>, Jiri Pirko <jiri@resnulli.us>,
	Arkadi Sharshevsky <arkadis@mellanox.com>,
	Ido Schimmel <idosch@mellanox.com>,
	bridge@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: [PATCH net] net: bridge: switchdev: don't notify FDB entries with "master dynamic"
Date: Mon, 10 Apr 2023 23:49:51 +0300	[thread overview]
Message-ID: <20230410204951.1359485-1-vladimir.oltean@nxp.com> (raw)

There is a structural problem in switchdev, where the flag bits in
struct switchdev_notifier_fdb_info (added_by_user, is_local etc) only
represent a simplified / denatured view of what's in struct
net_bridge_fdb_entry :: flags (BR_FDB_ADDED_BY_USER, BR_FDB_LOCAL etc).
Each time we want to pass more information about struct
net_bridge_fdb_entry :: flags to struct switchdev_notifier_fdb_info
(here, BR_FDB_STATIC), we find that FDB entries were already notified to
switchdev with no regard to this flag, and thus, switchdev drivers had
no indication whether the notified entries were static or not.

For example, this command:

ip link add br0 type bridge && ip link set swp0 master br0
bridge fdb add dev swp0 00:01:02:03:04:05 master dynamic

causes a struct net_bridge_fdb_entry to be passed to
br_switchdev_fdb_notify() which has a single flag set:
BR_FDB_ADDED_BY_USER.

This is further passed to the switchdev notifier chain, where interested
drivers have no choice but to assume this is a static FDB entry.
So currently, all drivers offload it to hardware as such.

bridge fdb get 00:01:02:03:04:05 dev swp0 master
00:01:02:03:04:05 dev swp0 offload master br0

The software FDB entry expires after the $ageing_time and the bridge
notifies its deletion as well, so it eventually disappears from hardware
too.

This is a problem, because it is actually desirable to start offloading
"master dynamic" FDB entries correctly, and this is how the current
incorrect behavior was discovered.

To see why the current behavior of "here's a static FDB entry when you
asked for a dynamic one" is incorrect, it is possible to imagine a
scenario like below, where this decision could lead to packet loss:

Step 1: management prepares FDB entries like this:

bridge fdb add dev swp0 ${MAC_A} master dynamic
bridge fdb add dev swp2 ${MAC_B} master dynamic

        br0
      /  |  \
     /   |   \
  swp0  swp1  swp2
   |           |
   A           B

Step 2: station A migrates to swp1 (assume that swp0's link doesn't flap
during that time so that the port isn't flushed, for example station A
was behind an intermediary switch):

        br0
      /  |  \
     /   |   \
  swp0  swp1  swp2
   |     |     |
         A     B

Whenever A wants to ping B, its packets will be autonomously forwarded
by the switch (because ${MAC_B} is known). So the software will never
see packets from ${MAC_A} as source address, and will never know it
needs to invalidate the dynamic FDB entry towards swp0. As for the
hardware FDB entry, that's static, it doesn't move when the station
roams.

So when B wants to reply to A's pings, the switch will forward those
replies to swp0 until the software bridge ages out its dynamic entry,
and that can cause connectivity loss for up to 5 minutes after roaming.

With a correctly offloaded dynamic FDB entry, the switch would update
its entry for ${MAC_A} to be towards swp1 as soon as it sees packets
from it (no need for CPU intervention).

Looking at tools/testing/selftests/net/forwarding/, there is no valid
use of the "bridge fdb add ... master dynamic" command there, so I am
fairly confident that no one used to rely on this behavior.

With the change in place, these FDB entries are no longer offloaded:

bridge fdb get 00:01:02:03:04:05 dev swp0 master
00:01:02:03:04:05 dev swp0 master br0

and this also constitutes a better way (assuming a backport to stable
kernels) for user space to determine whether the switchdev driver did
actually act upon the dynamic FDB entry or not.

Fixes: 6b26b51b1d13 ("net: bridge: Add support for notifying devices about FDB add/del")
Link: https://lore.kernel.org/netdev/20230327115206.jk5q5l753aoelwus@skbuf/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_switchdev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index de18e9c1d7a7..0ec3d5e5e77d 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -148,6 +148,10 @@ br_switchdev_fdb_notify(struct net_bridge *br,
 	if (test_bit(BR_FDB_LOCKED, &fdb->flags))
 		return;
 
+	if (test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags) &&
+	    !test_bit(BR_FDB_STATIC, &fdb->flags))
+		return;
+
 	br_switchdev_fdb_populate(br, &item, fdb, NULL);
 
 	switch (type) {
-- 
2.34.1


             reply	other threads:[~2023-04-10 20:50 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-10 20:49 Vladimir Oltean [this message]
2023-04-11 15:30 ` [PATCH net] net: bridge: switchdev: don't notify FDB entries with "master dynamic" Jesse Brandeburg
2023-04-12 14:15 ` Ido Schimmel
2023-04-12 14:27   ` Vladimir Oltean
2023-04-12 16:00     ` Ido Schimmel
2023-04-12 16:24       ` Vladimir Oltean
2023-04-12 16:49         ` Ido Schimmel
2023-04-12 17:04           ` Vladimir Oltean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230410204951.1359485-1-vladimir.oltean@nxp.com \
    --to=vladimir.oltean@nxp.com \
    --cc=arkadis@mellanox.com \
    --cc=bridge@lists.linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=idosch@mellanox.com \
    --cc=idosch@nvidia.com \
    --cc=ivecera@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@kapio-technology.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=razor@blackwall.org \
    --cc=roopa@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).