* [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

Hello, this is me bombarding the list with switchdev FDB changes again.

This series attempts to address one design limitation in the interaction
between the bridge and switchdev: error codes returned from the
SWITCHDEV_FDB_ADD_TO_DEVICE and SWITCHDEV_FDB_DEL_TO_DEVICE handlers are
completely ignored.

There are multiple aspects to that. First of all, drivers have a portion
that handles those switchdev events in atomic context, and a portion
that handles them in a private deferred work context. Errors reported
from both calling contexts are ignored by the bridge, and it is
desirable to actually propagate both to user space.
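
For reference, this is roughly the pattern that most drivers follow today
(a minimal sketch; the foo_* names are made up for illustration and do not
belong to any particular driver):

  struct foo_switchdev_event_work {
      struct work_struct work;
      struct switchdev_notifier_fdb_info fdb_info;
  };

  /* Sleepable context: programs the hardware, but its error is lost,
   * because the bridge stopped listening as soon as the atomic notifier
   * below returned.
   */
  static void foo_fdb_event_work(struct work_struct *work)
  {
      struct foo_switchdev_event_work *w =
          container_of(work, struct foo_switchdev_event_work, work);
      int err;

      rtnl_lock();
      err = foo_port_fdb_program(&w->fdb_info); /* err is discarded */
      rtnl_unlock();
      kfree(w);
  }

  /* Atomic context: only copies the FDB info and queues deferred work. */
  static int foo_switchdev_event(struct notifier_block *nb,
                                 unsigned long event, void *ptr)
  {
      struct foo_switchdev_event_work *w = kzalloc(sizeof(*w), GFP_ATOMIC);

      if (!w)
          return NOTIFY_BAD;

      INIT_WORK(&w->work, foo_fdb_event_work);
      /* ... copy the switchdev_notifier_fdb_info from "ptr" into "w" ... */
      queue_work(system_long_wq, &w->work);

      return NOTIFY_DONE;
  }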

Secondly, it is in fact okay that some switchdev errors are ignored.
The call graph for fdb_notify() is not simple; it looks something like
this (not complete):

IFLA_BRPORT_FLUSH                                                              RTM_NEWNEIGH
   |                                                                               |
   | {br,nbp}_vlan_delete                 br_fdb_change_mac_address                v
   |   |  |                                                  |     fast      __br_fdb_add
   |   |  |  del_nbp, br_dev_delete       br_fdb_changeaddr  |     path         /  |  \
   |   |  |      |                                        |  |    learning     /   |   \
   \   |   -------------------- br_fdb_find_delete_local  |  |       |        /    |    \     switchdev event
    \  |         |                                     |  |  |       |       /     |     \     listener
     -------------------------- br_fdb_delete_by_port  |  |  |       |      /      |      \       |
                                                 |  |  |  |  |       |     /       |       \      |
                                                 |  |  |  |  |       |    /        |        \     |
                                                 |  |  |  |  |    br_fdb_update    |        br_fdb_external_learn_add
           (RTM_DELNEIGH)  br_fdb_delete         |  |  |  |  |       |             |              |
                                     |           |  |  |  |  |       |             |              |    gc_work        netdevice
                                     |           |  |  |  |  |       |      fdb_add_entry         |     timer          event
                                     |           | fdb_delete_local  |             |              |        |          listener
                         __br_fdb_delete         |  |                |             |              /  br_fdb_cleanup      |
                                     |           |  |                |             |             /         |             |     br_stp_change_bridge_id
                                     |           |  |                \             |            /          | br_fdb_changeaddr      |
                                     |           |  |                 \            |           /           |     |                  |
                     fdb_delete_by_addr_and_port |  | fdb_insert       \           |          /       ----/      | br_fdb_change_mac_address
                                              |  |  |  |                \          |         /       /           |  |
                   br_fdb_external_learn_del  |  |  |  | br_fdb_cleanup  \         |        /       /            |  | br_fdb_insert
                                          |   |  |  |  |  |               \        |       /   ----/             |  | |
                                          |   |  |  |  |  |                \       |      /   /                 fdb_insert
                          br_fdb_flush    |   |  |  |  |  |                 \      |     /   /            --------/
                                 \----    |   |  |  |  |  |                  \     |    /   /      ------/
                                      \----------- fdb_delete --------------- fdb_notify ---------/

There is not a lot that fast path learning can do when switchdev returns
an error.

So this patch set mainly wants to deal with the 2 code paths that are
triggered by these regular commands:

bridge fdb add dev swp0 00:01:02:03:04:05 master static # __br_fdb_add
bridge fdb del dev swp0 00:01:02:03:04:05 master static # __br_fdb_delete

In some other, semi-related discussions, Ido Schimmel pointed out that
it would be nice if user space got some feedback from the actual driver,
and made some proposals about how that could be done.
https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/
One of the proposals was to call fdb_notify() from sleepable context,
but Nikolay disliked the idea of introducing deferred work in the bridge
driver (seems like nobody wants to deal with it).

And since all proposals for dealing with the deferred work inside
switchdev were also shot down for valid reasons, the baseline we are left
with is the code that we have today, with the deferred work being private
to the driver, and somehow an err and an extack must be propagated from
there.

So the approach taken here is to reorganize the code a bit and add some
hooks in:
(a) some callers of fdb_notify(), to initialize a completion structure
(b) some drivers that catch SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE, to mark
    that completion structure as done
(c) some bridge logic that I believe is fairly safe (I'm open to being
    proven wrong), which temporarily drops &br->hash_lock in order to
    sleep until the completion is done (a minimal sketch follows below).
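
To make the intended flow more concrete, here is that sketch. The
completion/error field names and the exact call sites are illustrative
only; the real hooks are introduced in patches 11 and 13:

  /* (a) bridge, in the callers of interest: arm a completion before
   * notifying switchdev
   */
  init_completion(&fdb->switchdev_done);
  br_fdb_notify(br, fdb, RTM_NEWNEIGH, true);

  /* (c) bridge, currently under &br->hash_lock: drop it, sleep until the
   * driver responds, then re-take it and pick up the verdict
   */
  spin_unlock_bh(&br->hash_lock);
  wait_for_completion(&fdb->switchdev_done);
  spin_lock_bh(&br->hash_lock);
  err = fdb->switchdev_err;

  /* (b) the driver, at the end of its private deferred work item, reports
   * its error and signals the completion - in the series this goes through
   * a switchdev helper rather than by touching bridge internals:
   *
   *     fdb->switchdev_err = err;
   *     complete(&fdb->switchdev_done);
   */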

There are some further optimizations that can be made. For example, we
can avoid dropping the hash_lock if there is no switchdev response
pending, and we can move some of that completion logic into
br_switchdev.c such that it is compiled out on a CONFIG_NET_SWITCHDEV=n
build. I haven't done those here, since they aren't exactly trivial; I am
mainly looking for high-level feedback first.

The structure of the patch series is:
- patches 1-6 are me toying around with some code organization while I
  was trying to understand the various call paths better. I like not
  having forward declarations, but if they exist for a reason, I can
  drop these patches.
- patches 7-10 and 12 are some preparation work that can also be ignored.
- patches 11 and 13 are where the meat of the series is.
- patches 14 and 15 are DSA boilerplate so I could test what I'm doing.

Vladimir Oltean (15):
  net: bridge: remove fdb_notify forward declaration
  net: bridge: remove fdb_insert forward declaration
  net: bridge: rename fdb_insert to fdb_add_local
  net: bridge: rename br_fdb_insert to br_fdb_add_local
  net: bridge: move br_fdb_replay inside br_switchdev.c
  net: bridge: create a common function for populating switchdev FDB
    entries
  net: switchdev: keep the MAC address by value in struct
    switchdev_notifier_fdb_info
  net: bridge: take the hash_lock inside fdb_add_entry
  net: bridge: rename fdb_notify to br_fdb_notify
  net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device
  net: bridge: make fdb_add_entry() wait for switchdev feedback
  net: rtnetlink: pass extack to .ndo_fdb_del
  net: bridge: wait for errors from switchdev when deleting FDB entries
  net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
  net: dsa: propagate extack to .port_fdb_{add,del}

 drivers/net/dsa/b53/b53_common.c              |   6 +-
 drivers/net/dsa/b53/b53_priv.h                |   6 +-
 drivers/net/dsa/hirschmann/hellcreek.c        |   6 +-
 drivers/net/dsa/lan9303-core.c                |   7 +-
 drivers/net/dsa/lantiq_gswip.c                |   6 +-
 drivers/net/dsa/microchip/ksz9477.c           |   6 +-
 drivers/net/dsa/mt7530.c                      |   6 +-
 drivers/net/dsa/mv88e6xxx/chip.c              |   6 +-
 drivers/net/dsa/ocelot/felix.c                |   6 +-
 drivers/net/dsa/qca8k.c                       |   6 +-
 drivers/net/dsa/sja1105/sja1105_main.c        |  12 +-
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   |  14 +-
 drivers/net/ethernet/intel/ice/ice_main.c     |   4 +-
 .../marvell/prestera/prestera_switchdev.c     |  18 +-
 .../mellanox/mlx5/core/en/rep/bridge.c        |  10 -
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |   2 +-
 .../ethernet/mellanox/mlxsw/spectrum_router.c |   4 +-
 .../mellanox/mlxsw/spectrum_switchdev.c       |  11 +-
 .../microchip/sparx5/sparx5_mactable.c        |   2 +-
 .../microchip/sparx5/sparx5_switchdev.c       |  12 +-
 drivers/net/ethernet/mscc/ocelot_net.c        |   3 +-
 .../net/ethernet/qlogic/qlcnic/qlcnic_main.c  |   5 +-
 drivers/net/ethernet/rocker/rocker_main.c     |  13 +-
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |   2 +-
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c |  14 +-
 drivers/net/ethernet/ti/cpsw_switchdev.c      |  14 +-
 drivers/net/macvlan.c                         |   3 +-
 drivers/net/vxlan.c                           |   3 +-
 drivers/s390/net/qeth_l2_main.c               |  11 +-
 include/linux/netdevice.h                     |   6 +-
 include/net/dsa.h                             |   6 +-
 include/net/switchdev.h                       |  67 +-
 net/bridge/br_fdb.c                           | 657 ++++++++++--------
 net/bridge/br_if.c                            |   2 +-
 net/bridge/br_private.h                       |  25 +-
 net/bridge/br_switchdev.c                     | 103 ++-
 net/bridge/br_vlan.c                          |   5 +-
 net/core/rtnetlink.c                          |   4 +-
 net/dsa/dsa_priv.h                            |  15 +-
 net/dsa/port.c                                |  13 +-
 net/dsa/slave.c                               |  87 +--
 net/dsa/switch.c                              |  22 +-
 net/switchdev/switchdev.c                     | 156 +----
 43 files changed, 678 insertions(+), 708 deletions(-)

-- 
2.25.1



* [RFC PATCH net-next 01/15] net: bridge: remove fdb_notify forward declaration
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

fdb_notify() has a forward declaration because its first caller,
fdb_delete(), is declared before 3 functions that fdb_notify() needs:
fdb_to_nud(), fdb_fill_info() and fdb_nlmsg_size().

This patch moves fdb_notify() and those 3 functions above fdb_delete()
and deletes the forward declaration.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c | 246 ++++++++++++++++++++++----------------------
 1 file changed, 122 insertions(+), 124 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index a6a68e18c70a..bfb28a24ea81 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -34,8 +34,6 @@ static const struct rhashtable_params br_fdb_rht_params = {
 static struct kmem_cache *br_fdb_cache __read_mostly;
 static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		      const unsigned char *addr, u16 vid);
-static void fdb_notify(struct net_bridge *br,
-		       const struct net_bridge_fdb_entry *, int, bool);
 
 int __init br_fdb_init(void)
 {
@@ -87,6 +85,128 @@ static void fdb_rcu_free(struct rcu_head *head)
 	kmem_cache_free(br_fdb_cache, ent);
 }
 
+static int fdb_to_nud(const struct net_bridge *br,
+		      const struct net_bridge_fdb_entry *fdb)
+{
+	if (test_bit(BR_FDB_LOCAL, &fdb->flags))
+		return NUD_PERMANENT;
+	else if (test_bit(BR_FDB_STATIC, &fdb->flags))
+		return NUD_NOARP;
+	else if (has_expired(br, fdb))
+		return NUD_STALE;
+	else
+		return NUD_REACHABLE;
+}
+
+static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
+			 const struct net_bridge_fdb_entry *fdb,
+			 u32 portid, u32 seq, int type, unsigned int flags)
+{
+	const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
+	unsigned long now = jiffies;
+	struct nda_cacheinfo ci;
+	struct nlmsghdr *nlh;
+	struct ndmsg *ndm;
+
+	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags);
+	if (nlh == NULL)
+		return -EMSGSIZE;
+
+	ndm = nlmsg_data(nlh);
+	ndm->ndm_family	 = AF_BRIDGE;
+	ndm->ndm_pad1    = 0;
+	ndm->ndm_pad2    = 0;
+	ndm->ndm_flags	 = 0;
+	ndm->ndm_type	 = 0;
+	ndm->ndm_ifindex = dst ? dst->dev->ifindex : br->dev->ifindex;
+	ndm->ndm_state   = fdb_to_nud(br, fdb);
+
+	if (test_bit(BR_FDB_OFFLOADED, &fdb->flags))
+		ndm->ndm_flags |= NTF_OFFLOADED;
+	if (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags))
+		ndm->ndm_flags |= NTF_EXT_LEARNED;
+	if (test_bit(BR_FDB_STICKY, &fdb->flags))
+		ndm->ndm_flags |= NTF_STICKY;
+
+	if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex))
+		goto nla_put_failure;
+	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
+	ci.ndm_confirmed = 0;
+	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
+	ci.ndm_refcnt	 = 0;
+	if (nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci))
+		goto nla_put_failure;
+
+	if (fdb->key.vlan_id && nla_put(skb, NDA_VLAN, sizeof(u16),
+					&fdb->key.vlan_id))
+		goto nla_put_failure;
+
+	if (test_bit(BR_FDB_NOTIFY, &fdb->flags)) {
+		struct nlattr *nest = nla_nest_start(skb, NDA_FDB_EXT_ATTRS);
+		u8 notify_bits = FDB_NOTIFY_BIT;
+
+		if (!nest)
+			goto nla_put_failure;
+		if (test_bit(BR_FDB_NOTIFY_INACTIVE, &fdb->flags))
+			notify_bits |= FDB_NOTIFY_INACTIVE_BIT;
+
+		if (nla_put_u8(skb, NFEA_ACTIVITY_NOTIFY, notify_bits)) {
+			nla_nest_cancel(skb, nest);
+			goto nla_put_failure;
+		}
+
+		nla_nest_end(skb, nest);
+	}
+
+	nlmsg_end(skb, nlh);
+	return 0;
+
+nla_put_failure:
+	nlmsg_cancel(skb, nlh);
+	return -EMSGSIZE;
+}
+
+static inline size_t fdb_nlmsg_size(void)
+{
+	return NLMSG_ALIGN(sizeof(struct ndmsg))
+		+ nla_total_size(ETH_ALEN) /* NDA_LLADDR */
+		+ nla_total_size(sizeof(u32)) /* NDA_MASTER */
+		+ nla_total_size(sizeof(u16)) /* NDA_VLAN */
+		+ nla_total_size(sizeof(struct nda_cacheinfo))
+		+ nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
+		+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
+}
+
+static void fdb_notify(struct net_bridge *br,
+		       const struct net_bridge_fdb_entry *fdb, int type,
+		       bool swdev_notify)
+{
+	struct net *net = dev_net(br->dev);
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+
+	if (swdev_notify)
+		br_switchdev_fdb_notify(br, fdb, type);
+
+	skb = nlmsg_new(fdb_nlmsg_size(), GFP_ATOMIC);
+	if (skb == NULL)
+		goto errout;
+
+	err = fdb_fill_info(skb, br, fdb, 0, 0, type, 0);
+	if (err < 0) {
+		/* -EMSGSIZE implies BUG in fdb_nlmsg_size() */
+		WARN_ON(err == -EMSGSIZE);
+		kfree_skb(skb);
+		goto errout;
+	}
+	rtnl_notify(skb, net, 0, RTNLGRP_NEIGH, NULL, GFP_ATOMIC);
+	return;
+errout:
+	rtnl_set_sk_err(net, RTNLGRP_NEIGH, err);
+}
+
 static struct net_bridge_fdb_entry *fdb_find_rcu(struct rhashtable *tbl,
 						 const unsigned char *addr,
 						 __u16 vid)
@@ -638,100 +758,6 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 	}
 }
 
-static int fdb_to_nud(const struct net_bridge *br,
-		      const struct net_bridge_fdb_entry *fdb)
-{
-	if (test_bit(BR_FDB_LOCAL, &fdb->flags))
-		return NUD_PERMANENT;
-	else if (test_bit(BR_FDB_STATIC, &fdb->flags))
-		return NUD_NOARP;
-	else if (has_expired(br, fdb))
-		return NUD_STALE;
-	else
-		return NUD_REACHABLE;
-}
-
-static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
-			 const struct net_bridge_fdb_entry *fdb,
-			 u32 portid, u32 seq, int type, unsigned int flags)
-{
-	const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
-	unsigned long now = jiffies;
-	struct nda_cacheinfo ci;
-	struct nlmsghdr *nlh;
-	struct ndmsg *ndm;
-
-	nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags);
-	if (nlh == NULL)
-		return -EMSGSIZE;
-
-	ndm = nlmsg_data(nlh);
-	ndm->ndm_family	 = AF_BRIDGE;
-	ndm->ndm_pad1    = 0;
-	ndm->ndm_pad2    = 0;
-	ndm->ndm_flags	 = 0;
-	ndm->ndm_type	 = 0;
-	ndm->ndm_ifindex = dst ? dst->dev->ifindex : br->dev->ifindex;
-	ndm->ndm_state   = fdb_to_nud(br, fdb);
-
-	if (test_bit(BR_FDB_OFFLOADED, &fdb->flags))
-		ndm->ndm_flags |= NTF_OFFLOADED;
-	if (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags))
-		ndm->ndm_flags |= NTF_EXT_LEARNED;
-	if (test_bit(BR_FDB_STICKY, &fdb->flags))
-		ndm->ndm_flags |= NTF_STICKY;
-
-	if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr))
-		goto nla_put_failure;
-	if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex))
-		goto nla_put_failure;
-	ci.ndm_used	 = jiffies_to_clock_t(now - fdb->used);
-	ci.ndm_confirmed = 0;
-	ci.ndm_updated	 = jiffies_to_clock_t(now - fdb->updated);
-	ci.ndm_refcnt	 = 0;
-	if (nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci))
-		goto nla_put_failure;
-
-	if (fdb->key.vlan_id && nla_put(skb, NDA_VLAN, sizeof(u16),
-					&fdb->key.vlan_id))
-		goto nla_put_failure;
-
-	if (test_bit(BR_FDB_NOTIFY, &fdb->flags)) {
-		struct nlattr *nest = nla_nest_start(skb, NDA_FDB_EXT_ATTRS);
-		u8 notify_bits = FDB_NOTIFY_BIT;
-
-		if (!nest)
-			goto nla_put_failure;
-		if (test_bit(BR_FDB_NOTIFY_INACTIVE, &fdb->flags))
-			notify_bits |= FDB_NOTIFY_INACTIVE_BIT;
-
-		if (nla_put_u8(skb, NFEA_ACTIVITY_NOTIFY, notify_bits)) {
-			nla_nest_cancel(skb, nest);
-			goto nla_put_failure;
-		}
-
-		nla_nest_end(skb, nest);
-	}
-
-	nlmsg_end(skb, nlh);
-	return 0;
-
-nla_put_failure:
-	nlmsg_cancel(skb, nlh);
-	return -EMSGSIZE;
-}
-
-static inline size_t fdb_nlmsg_size(void)
-{
-	return NLMSG_ALIGN(sizeof(struct ndmsg))
-		+ nla_total_size(ETH_ALEN) /* NDA_LLADDR */
-		+ nla_total_size(sizeof(u32)) /* NDA_MASTER */
-		+ nla_total_size(sizeof(u16)) /* NDA_VLAN */
-		+ nla_total_size(sizeof(struct nda_cacheinfo))
-		+ nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
-		+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
-}
-
 static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
 			     const struct net_bridge_fdb_entry *fdb,
 			     unsigned long action, const void *ctx)
@@ -786,34 +812,6 @@ int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
 	return err;
 }
 
-static void fdb_notify(struct net_bridge *br,
-		       const struct net_bridge_fdb_entry *fdb, int type,
-		       bool swdev_notify)
-{
-	struct net *net = dev_net(br->dev);
-	struct sk_buff *skb;
-	int err = -ENOBUFS;
-
-	if (swdev_notify)
-		br_switchdev_fdb_notify(br, fdb, type);
-
-	skb = nlmsg_new(fdb_nlmsg_size(), GFP_ATOMIC);
-	if (skb == NULL)
-		goto errout;
-
-	err = fdb_fill_info(skb, br, fdb, 0, 0, type, 0);
-	if (err < 0) {
-		/* -EMSGSIZE implies BUG in fdb_nlmsg_size() */
-		WARN_ON(err == -EMSGSIZE);
-		kfree_skb(skb);
-		goto errout;
-	}
-	rtnl_notify(skb, net, 0, RTNLGRP_NEIGH, NULL, GFP_ATOMIC);
-	return;
-errout:
-	rtnl_set_sk_err(net, RTNLGRP_NEIGH, err);
-}
-
 /* Dump information about entries, in response to GETNEIGH */
 int br_fdb_dump(struct sk_buff *skb,
 		struct netlink_callback *cb,
-- 
2.25.1



* [RFC PATCH net-next 02/15] net: bridge: remove fdb_insert forward declaration
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

fdb_insert() has a forward declaration because its first caller,
br_fdb_changeaddr(), is declared before fdb_create(), a function which
fdb_insert() needs.

This patch moves the 2 functions above br_fdb_changeaddr() and deletes
the forward declaration for fdb_insert().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c | 116 ++++++++++++++++++++++----------------------
 1 file changed, 57 insertions(+), 59 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index bfb28a24ea81..4fe2e958573e 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -32,8 +32,6 @@ static const struct rhashtable_params br_fdb_rht_params = {
 };
 
 static struct kmem_cache *br_fdb_cache __read_mostly;
-static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
-		      const unsigned char *addr, u16 vid);
 
 int __init br_fdb_init(void)
 {
@@ -377,6 +375,63 @@ void br_fdb_find_delete_local(struct net_bridge *br,
 	spin_unlock_bh(&br->hash_lock);
 }
 
+static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
+					       struct net_bridge_port *source,
+					       const unsigned char *addr,
+					       __u16 vid,
+					       unsigned long flags)
+{
+	struct net_bridge_fdb_entry *fdb;
+
+	fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC);
+	if (fdb) {
+		memcpy(fdb->key.addr.addr, addr, ETH_ALEN);
+		WRITE_ONCE(fdb->dst, source);
+		fdb->key.vlan_id = vid;
+		fdb->flags = flags;
+		fdb->updated = fdb->used = jiffies;
+		if (rhashtable_lookup_insert_fast(&br->fdb_hash_tbl,
+						  &fdb->rhnode,
+						  br_fdb_rht_params)) {
+			kmem_cache_free(br_fdb_cache, fdb);
+			fdb = NULL;
+		} else {
+			hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list);
+		}
+	}
+	return fdb;
+}
+
+static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
+		      const unsigned char *addr, u16 vid)
+{
+	struct net_bridge_fdb_entry *fdb;
+
+	if (!is_valid_ether_addr(addr))
+		return -EINVAL;
+
+	fdb = br_fdb_find(br, addr, vid);
+	if (fdb) {
+		/* it is okay to have multiple ports with same
+		 * address, just use the first one.
+		 */
+		if (test_bit(BR_FDB_LOCAL, &fdb->flags))
+			return 0;
+		br_warn(br, "adding interface %s with same address as a received packet (addr:%pM, vlan:%u)\n",
+		       source ? source->dev->name : br->dev->name, addr, vid);
+		fdb_delete(br, fdb, true);
+	}
+
+	fdb = fdb_create(br, source, addr, vid,
+			 BIT(BR_FDB_LOCAL) | BIT(BR_FDB_STATIC));
+	if (!fdb)
+		return -ENOMEM;
+
+	fdb_add_hw_addr(br, addr);
+	fdb_notify(br, fdb, RTM_NEWNEIGH, true);
+	return 0;
+}
+
 void br_fdb_changeaddr(struct net_bridge_port *p, const unsigned char *newaddr)
 {
 	struct net_bridge_vlan_group *vg;
@@ -623,63 +678,6 @@ int br_fdb_fillbuf(struct net_bridge *br, void *buf,
 	return num;
 }
 
-static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
-					       struct net_bridge_port *source,
-					       const unsigned char *addr,
-					       __u16 vid,
-					       unsigned long flags)
-{
-	struct net_bridge_fdb_entry *fdb;
-
-	fdb = kmem_cache_alloc(br_fdb_cache, GFP_ATOMIC);
-	if (fdb) {
-		memcpy(fdb->key.addr.addr, addr, ETH_ALEN);
-		WRITE_ONCE(fdb->dst, source);
-		fdb->key.vlan_id = vid;
-		fdb->flags = flags;
-		fdb->updated = fdb->used = jiffies;
-		if (rhashtable_lookup_insert_fast(&br->fdb_hash_tbl,
-						  &fdb->rhnode,
-						  br_fdb_rht_params)) {
-			kmem_cache_free(br_fdb_cache, fdb);
-			fdb = NULL;
-		} else {
-			hlist_add_head_rcu(&fdb->fdb_node, &br->fdb_list);
-		}
-	}
-	return fdb;
-}
-
-static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
-		  const unsigned char *addr, u16 vid)
-{
-	struct net_bridge_fdb_entry *fdb;
-
-	if (!is_valid_ether_addr(addr))
-		return -EINVAL;
-
-	fdb = br_fdb_find(br, addr, vid);
-	if (fdb) {
-		/* it is okay to have multiple ports with same
-		 * address, just use the first one.
-		 */
-		if (test_bit(BR_FDB_LOCAL, &fdb->flags))
-			return 0;
-		br_warn(br, "adding interface %s with same address as a received packet (addr:%pM, vlan:%u)\n",
-		       source ? source->dev->name : br->dev->name, addr, vid);
-		fdb_delete(br, fdb, true);
-	}
-
-	fdb = fdb_create(br, source, addr, vid,
-			 BIT(BR_FDB_LOCAL) | BIT(BR_FDB_STATIC));
-	if (!fdb)
-		return -ENOMEM;
-
-	fdb_add_hw_addr(br, addr);
-	fdb_notify(br, fdb, RTM_NEWNEIGH, true);
-	return 0;
-}
-
 /* Add entry for local address of interface */
 int br_fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		  const unsigned char *addr, u16 vid)
-- 
2.25.1



* [RFC PATCH net-next 03/15] net: bridge: rename fdb_insert to fdb_add_local
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

fdb_insert() is not a descriptive name for this function, and it is also
easy to confuse with __br_fdb_add(), fdb_add_entry() and br_fdb_update().
Even more confusingly, it is not related in any way to those functions:
none of them calls the other.

Since fdb_insert() basically deals with the creation of a BR_FDB_LOCAL
entry and is called only from functions where that is the intention:

- br_fdb_changeaddr
- br_fdb_change_mac_address
- br_fdb_insert

then rename it to fdb_add_local(), because its removal counterpart is
called fdb_delete_local().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 4fe2e958573e..0d6fb25c2ab2 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -402,8 +402,8 @@ static struct net_bridge_fdb_entry *fdb_create(struct net_bridge *br,
 	return fdb;
 }
 
-static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
-		      const unsigned char *addr, u16 vid)
+static int fdb_add_local(struct net_bridge *br, struct net_bridge_port *source,
+			 const unsigned char *addr, u16 vid)
 {
 	struct net_bridge_fdb_entry *fdb;
 
@@ -458,7 +458,7 @@ void br_fdb_changeaddr(struct net_bridge_port *p, const unsigned char *newaddr)
 
 insert:
 	/* insert new address,  may fail if invalid address or dup. */
-	fdb_insert(br, p, newaddr, 0);
+	fdb_add_local(br, p, newaddr, 0);
 
 	if (!vg || !vg->num_vlans)
 		goto done;
@@ -468,7 +468,7 @@ void br_fdb_changeaddr(struct net_bridge_port *p, const unsigned char *newaddr)
 	 * from under us.
 	 */
 	list_for_each_entry(v, &vg->vlan_list, vlist)
-		fdb_insert(br, p, newaddr, v->vid);
+		fdb_add_local(br, p, newaddr, v->vid);
 
 done:
 	spin_unlock_bh(&br->hash_lock);
@@ -488,7 +488,7 @@ void br_fdb_change_mac_address(struct net_bridge *br, const u8 *newaddr)
 	    !f->dst && !test_bit(BR_FDB_ADDED_BY_USER, &f->flags))
 		fdb_delete_local(br, NULL, f);
 
-	fdb_insert(br, NULL, newaddr, 0);
+	fdb_add_local(br, NULL, newaddr, 0);
 	vg = br_vlan_group(br);
 	if (!vg || !vg->num_vlans)
 		goto out;
@@ -503,7 +503,7 @@ void br_fdb_change_mac_address(struct net_bridge *br, const u8 *newaddr)
 		if (f && test_bit(BR_FDB_LOCAL, &f->flags) &&
 		    !f->dst && !test_bit(BR_FDB_ADDED_BY_USER, &f->flags))
 			fdb_delete_local(br, NULL, f);
-		fdb_insert(br, NULL, newaddr, v->vid);
+		fdb_add_local(br, NULL, newaddr, v->vid);
 	}
 out:
 	spin_unlock_bh(&br->hash_lock);
@@ -685,7 +685,7 @@ int br_fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 	int ret;
 
 	spin_lock_bh(&br->hash_lock);
-	ret = fdb_insert(br, source, addr, vid);
+	ret = fdb_add_local(br, source, addr, vid);
 	spin_unlock_bh(&br->hash_lock);
 	return ret;
 }
-- 
2.25.1



* [RFC PATCH net-next 04/15] net: bridge: rename br_fdb_insert to br_fdb_add_local
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

br_fdb_insert() is a wrapper over fdb_insert() that also takes the
bridge hash_lock.

With fdb_insert() being renamed to fdb_add_local(), rename
br_fdb_insert() to br_fdb_add_local().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c     | 4 ++--
 net/bridge/br_if.c      | 2 +-
 net/bridge/br_private.h | 4 ++--
 net/bridge/br_vlan.c    | 5 ++---
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 0d6fb25c2ab2..d955deea1b4d 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -679,8 +679,8 @@ int br_fdb_fillbuf(struct net_bridge *br, void *buf,
 }
 
 /* Add entry for local address of interface */
-int br_fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
-		  const unsigned char *addr, u16 vid)
+int br_fdb_add_local(struct net_bridge *br, struct net_bridge_port *source,
+		     const unsigned char *addr, u16 vid)
 {
 	int ret;
 
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index c11bba3e7ec0..c1183fef1f21 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -670,7 +670,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
 	else
 		netdev_set_rx_headroom(dev, br_hr);
 
-	if (br_fdb_insert(br, p, dev->dev_addr, 0))
+	if (br_fdb_add_local(br, p, dev->dev_addr, 0))
 		netdev_err(dev, "failed insert local address bridge forwarding table\n");
 
 	if (br->dev->addr_assign_type != NET_ADDR_SET) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 37ca76406f1e..705606fc2237 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -767,8 +767,8 @@ struct net_bridge_fdb_entry *br_fdb_find_rcu(struct net_bridge *br,
 int br_fdb_test_addr(struct net_device *dev, unsigned char *addr);
 int br_fdb_fillbuf(struct net_bridge *br, void *buf, unsigned long count,
 		   unsigned long off);
-int br_fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
-		  const unsigned char *addr, u16 vid);
+int br_fdb_add_local(struct net_bridge *br, struct net_bridge_port *source,
+		     const unsigned char *addr, u16 vid);
 void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 		   const unsigned char *addr, u16 vid, unsigned long flags);
 
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 19f65ab91a02..57bd6ee72a07 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -293,7 +293,7 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags,
 
 	/* Add the dev mac and count the vlan only if it's usable */
 	if (br_vlan_should_use(v)) {
-		err = br_fdb_insert(br, p, dev->dev_addr, v->vid);
+		err = br_fdb_add_local(br, p, dev->dev_addr, v->vid);
 		if (err) {
 			br_err(br, "failed insert local address into bridge forwarding table\n");
 			goto out_filt;
@@ -683,8 +683,7 @@ static int br_vlan_add_existing(struct net_bridge *br,
 			goto err_flags;
 		}
 		/* It was only kept for port vlans, now make it real */
-		err = br_fdb_insert(br, NULL, br->dev->dev_addr,
-				    vlan->vid);
+		err = br_fdb_add_local(br, NULL, br->dev->dev_addr, vlan->vid);
 		if (err) {
 			br_err(br, "failed to insert local address into bridge forwarding table\n");
 			goto err_fdb_insert;
-- 
2.25.1



* [RFC PATCH net-next 05/15] net: bridge: move br_fdb_replay inside br_switchdev.c
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

br_fdb_replay() is only called from switchdev code paths, so it makes
sense for it to be compiled out when switchdev is not enabled in the
first place.

As opposed to br_mdb_replay() and br_vlan_replay(), which may be compiled
out depending on bridge support for multicast and VLANs, FDB support is
always on. So moving br_mdb_replay() and br_vlan_replay() inside
br_switchdev.c would mean adding some #ifdefs to br_switchdev.c, and we
therefore keep those where they are.

The reason for the move is that future changes will introduce some code
reuse between br_switchdev_fdb_notify() and br_fdb_replay().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c       | 54 ---------------------------------------
 net/bridge/br_private.h   |  2 --
 net/bridge/br_switchdev.c | 54 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index d955deea1b4d..9c57040d8341 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -756,60 +756,6 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 	}
 }
 
-static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
-			     const struct net_bridge_fdb_entry *fdb,
-			     unsigned long action, const void *ctx)
-{
-	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
-	struct switchdev_notifier_fdb_info item;
-	int err;
-
-	item.addr = fdb->key.addr.addr;
-	item.vid = fdb->key.vlan_id;
-	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
-	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
-	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
-	item.info.dev = (!p || item.is_local) ? br->dev : p->dev;
-	item.info.ctx = ctx;
-
-	err = nb->notifier_call(nb, action, &item);
-	return notifier_to_errno(err);
-}
-
-int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
-		  struct notifier_block *nb)
-{
-	struct net_bridge_fdb_entry *fdb;
-	struct net_bridge *br;
-	unsigned long action;
-	int err = 0;
-
-	if (!nb)
-		return 0;
-
-	if (!netif_is_bridge_master(br_dev))
-		return -EINVAL;
-
-	br = netdev_priv(br_dev);
-
-	if (adding)
-		action = SWITCHDEV_FDB_ADD_TO_DEVICE;
-	else
-		action = SWITCHDEV_FDB_DEL_TO_DEVICE;
-
-	rcu_read_lock();
-
-	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
-		err = br_fdb_replay_one(br, nb, fdb, action, ctx);
-		if (err)
-			break;
-	}
-
-	rcu_read_unlock();
-
-	return err;
-}
-
 /* Dump information about entries, in response to GETNEIGH */
 int br_fdb_dump(struct sk_buff *skb,
 		struct netlink_callback *cb,
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 705606fc2237..3c9327628060 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -792,8 +792,6 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      bool swdev_notify);
 void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
 			  const unsigned char *addr, u16 vid, bool offloaded);
-int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
-		  struct notifier_block *nb);
 
 /* br_forward.c */
 enum br_pkt_type {
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 6bf518d78f02..8a45b1cfe06f 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -270,6 +270,60 @@ static void nbp_switchdev_del(struct net_bridge_port *p)
 	}
 }
 
+static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
+			     const struct net_bridge_fdb_entry *fdb,
+			     unsigned long action, const void *ctx)
+{
+	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
+	struct switchdev_notifier_fdb_info item;
+	int err;
+
+	item.addr = fdb->key.addr.addr;
+	item.vid = fdb->key.vlan_id;
+	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
+	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+	item.info.dev = (!p || item.is_local) ? br->dev : p->dev;
+	item.info.ctx = ctx;
+
+	err = nb->notifier_call(nb, action, &item);
+	return notifier_to_errno(err);
+}
+
+static int br_fdb_replay(const struct net_device *br_dev, const void *ctx,
+			 bool adding, struct notifier_block *nb)
+{
+	struct net_bridge_fdb_entry *fdb;
+	struct net_bridge *br;
+	unsigned long action;
+	int err = 0;
+
+	if (!nb)
+		return 0;
+
+	if (!netif_is_bridge_master(br_dev))
+		return -EINVAL;
+
+	br = netdev_priv(br_dev);
+
+	if (adding)
+		action = SWITCHDEV_FDB_ADD_TO_DEVICE;
+	else
+		action = SWITCHDEV_FDB_DEL_TO_DEVICE;
+
+	rcu_read_lock();
+
+	hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
+		err = br_fdb_replay_one(br, nb, fdb, action, ctx);
+		if (err)
+			break;
+	}
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
 				   struct notifier_block *atomic_nb,
 				   struct notifier_block *blocking_nb,
-- 
2.25.1



* [RFC PATCH net-next 06/15] net: bridge: create a common function for populating switchdev FDB entries
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

There are two places where a switchdev FDB entry is constructed: one is
br_switchdev_fdb_notify() and the other is br_fdb_replay(). One uses a
struct initializer, and the other declares the structure uninitialized
and populates the members one by one.

One problem when introducing new members to struct
switchdev_notifier_fdb_info is the risk that one of these functions runs
with an uninitialized value.

So centralize the logic of populating such a structure into a dedicated
function. Since that function becomes the single place where these
structures are created, using an uninitialized variable and populating
the members one by one should be fine, because it is supposed to assign
values to all of the members.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_switchdev.c | 41 +++++++++++++++++++++------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 8a45b1cfe06f..2fbe881cdfe2 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -122,28 +122,38 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 	return 0;
 }
 
+static void br_switchdev_fdb_populate(struct net_bridge *br,
+				      struct switchdev_notifier_fdb_info *item,
+				      const struct net_bridge_fdb_entry *fdb,
+				      const void *ctx)
+{
+	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
+
+	item->addr = fdb->key.addr.addr;
+	item->vid = fdb->key.vlan_id;
+	item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
+	item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
+	item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
+	item->info.dev = (!p || item->is_local) ? br->dev : p->dev;
+	item->info.ctx = ctx;
+}
+
 void
 br_switchdev_fdb_notify(struct net_bridge *br,
 			const struct net_bridge_fdb_entry *fdb, int type)
 {
-	const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
-	struct switchdev_notifier_fdb_info info = {
-		.addr = fdb->key.addr.addr,
-		.vid = fdb->key.vlan_id,
-		.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags),
-		.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags),
-		.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags),
-	};
-	struct net_device *dev = (!dst || info.is_local) ? br->dev : dst->dev;
+	struct switchdev_notifier_fdb_info item;
+
+	br_switchdev_fdb_populate(br, &item, fdb, NULL);
 
 	switch (type) {
 	case RTM_DELNEIGH:
 		call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE,
-					 dev, &info.info, NULL);
+					 item.info.dev, &item.info, NULL);
 		break;
 	case RTM_NEWNEIGH:
 		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE,
-					 dev, &info.info, NULL);
+					 item.info.dev, &item.info, NULL);
 		break;
 	}
 }
@@ -274,17 +284,10 @@ static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
 			     const struct net_bridge_fdb_entry *fdb,
 			     unsigned long action, const void *ctx)
 {
-	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
 	struct switchdev_notifier_fdb_info item;
 	int err;
 
-	item.addr = fdb->key.addr.addr;
-	item.vid = fdb->key.vlan_id;
-	item.added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
-	item.offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
-	item.is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
-	item.info.dev = (!p || item.is_local) ? br->dev : p->dev;
-	item.info.ctx = ctx;
+	br_switchdev_fdb_populate(br, &item, fdb, ctx);
 
 	err = nb->notifier_call(nb, action, &item);
 	return notifier_to_errno(err);
-- 
2.25.1



* [RFC PATCH net-next 07/15] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

Currently, shallow copies of struct switchdev_notifier_fdb_info cannot
be carried around, since the "addr" member points to bridge memory.
This complicates driver implementations because they need to explicitly
allocate a separate piece of memory for the address.

Replace the pointer with a 6-byte array and simplify driver
handling. This makes the structure safely copyable, which in turn eases
some of the future changes (having a similar API between
switchdev_fdb_mark_pending and switchdev_fdb_mark_done).
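
Concretely, the "addr" member of struct switchdev_notifier_fdb_info goes
from

        const unsigned char *addr;

to something like

        unsigned char addr[ETH_ALEN];

so that callers fill it in with ether_addr_copy() instead of pointing it
at memory owned by the bridge.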

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 .../ethernet/freescale/dpaa2/dpaa2-switch.c    | 14 +-------------
 .../marvell/prestera/prestera_switchdev.c      | 18 +++---------------
 .../mellanox/mlx5/core/en/rep/bridge.c         | 10 ----------
 .../ethernet/mellanox/mlx5/core/esw/bridge.c   |  2 +-
 .../ethernet/mellanox/mlxsw/spectrum_router.c  |  4 ++--
 .../mellanox/mlxsw/spectrum_switchdev.c        | 11 ++---------
 .../microchip/sparx5/sparx5_mactable.c         |  2 +-
 .../microchip/sparx5/sparx5_switchdev.c        | 12 +-----------
 drivers/net/ethernet/rocker/rocker_main.c      | 13 ++-----------
 drivers/net/ethernet/rocker/rocker_ofdpa.c     |  2 +-
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c  | 14 ++------------
 drivers/net/ethernet/ti/cpsw_switchdev.c       | 14 ++------------
 drivers/s390/net/qeth_l2_main.c                | 11 ++++-------
 include/net/switchdev.h                        |  2 +-
 net/bridge/br_switchdev.c                      |  2 +-
 net/dsa/slave.c                                |  2 +-
 16 files changed, 25 insertions(+), 108 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
index d039457928b0..6190feb44219 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
@@ -2246,7 +2246,6 @@ static void dpaa2_switch_event_work(struct work_struct *work)
 	}
 
 	rtnl_unlock();
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
 }
@@ -2278,15 +2277,8 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
-
 		/* Take a reference on the device to avoid being freed. */
 		dev_hold(dev);
 		break;
@@ -2298,10 +2290,6 @@ static int dpaa2_switch_port_event(struct notifier_block *nb,
 	queue_work(ethsw->workqueue, &switchdev_work->work);
 
 	return NOTIFY_DONE;
-
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static int dpaa2_switch_port_obj_event(unsigned long event,
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
index 3ce6ccd0f539..236b07c42df0 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_switchdev.c
@@ -760,7 +760,7 @@ prestera_fdb_offload_notify(struct prestera_port *port,
 {
 	struct switchdev_notifier_fdb_info send_info = {};
 
-	send_info.addr = info->addr;
+	ether_addr_copy(send_info.addr, info->addr);
 	send_info.vid = info->vid;
 	send_info.offloaded = true;
 
@@ -836,7 +836,6 @@ static void prestera_fdb_event_work(struct work_struct *work)
 out_unlock:
 	rtnl_unlock();
 
-	kfree(swdev_work->fdb_info.addr);
 	kfree(swdev_work);
 	dev_put(dev);
 }
@@ -883,15 +882,8 @@ static int prestera_switchdev_event(struct notifier_block *unused,
 					info);
 
 		INIT_WORK(&swdev_work->work, prestera_fdb_event_work);
-		memcpy(&swdev_work->fdb_info, ptr,
+		memcpy(&swdev_work->fdb_info, fdb_info,
 		       sizeof(swdev_work->fdb_info));
-
-		swdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!swdev_work->fdb_info.addr)
-			goto out_bad;
-
-		ether_addr_copy((u8 *)swdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(dev);
 		break;
 
@@ -902,10 +894,6 @@ static int prestera_switchdev_event(struct notifier_block *unused,
 
 	queue_work(swdev_wq, &swdev_work->work);
 	return NOTIFY_DONE;
-
-out_bad:
-	kfree(swdev_work);
-	return NOTIFY_BAD;
 }
 
 static int
@@ -1156,7 +1144,7 @@ static void prestera_fdb_event(struct prestera_switch *sw,
 	if (!dev)
 		return;
 
-	info.addr = evt->fdb_evt.data.mac;
+	ether_addr_copy(info.addr, evt->fdb_evt.data.mac);
 	info.vid = evt->fdb_evt.vid;
 	info.offloaded = true;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index c6d2f8c78db7..d9735d817073 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -306,7 +306,6 @@ static void
 mlx5_esw_bridge_cleanup_switchdev_fdb_work(struct mlx5_bridge_switchdev_fdb_work *fdb_work)
 {
 	dev_put(fdb_work->dev);
-	kfree(fdb_work->fdb_info.addr);
 	kfree(fdb_work);
 }
 
@@ -345,7 +344,6 @@ mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
 					struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_bridge_switchdev_fdb_work *work;
-	u8 *addr;
 
 	work = kzalloc(sizeof(*work), GFP_ATOMIC);
 	if (!work)
@@ -354,14 +352,6 @@ mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
 	INIT_WORK(&work->work, mlx5_esw_bridge_switchdev_fdb_event_work);
 	memcpy(&work->fdb_info, fdb_info, sizeof(work->fdb_info));
 
-	addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-	if (!addr) {
-		kfree(work);
-		return ERR_PTR(-ENOMEM);
-	}
-	ether_addr_copy(addr, fdb_info->addr);
-	work->fdb_info.addr = addr;
-
 	dev_hold(dev);
 	work->dev = dev;
 	work->br_offloads = br_offloads;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 588622ba38c1..30584aad3021 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -77,7 +77,7 @@ mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *
 {
 	struct switchdev_notifier_fdb_info send_info = {};
 
-	send_info.addr = addr;
+	ether_addr_copy(send_info.addr, addr);
 	send_info.vid = vid;
 	send_info.offloaded = true;
 	call_switchdev_notifiers(val, dev, &send_info.info, NULL);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 1e141b5944cd..54bd2b30eb8c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -9216,7 +9216,7 @@ static void mlxsw_sp_rif_fid_fdb_del(struct mlxsw_sp_rif *rif, const char *mac)
 	if (!dev)
 		return;
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = 0;
 	call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, dev, &info.info,
 				 NULL);
@@ -9267,7 +9267,7 @@ static void mlxsw_sp_rif_vlan_fdb_del(struct mlxsw_sp_rif *rif, const char *mac)
 	if (!dev)
 		return;
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = vid;
 	call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, dev, &info.info,
 				 NULL);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 81c7e8a7fcf5..2b66a6d8d8a0 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -2519,7 +2519,7 @@ mlxsw_sp_fdb_call_notifiers(enum switchdev_notifier_type type,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = vid;
 	info.offloaded = offloaded;
 	call_switchdev_notifiers(type, dev, &info.info, NULL);
@@ -3010,7 +3010,6 @@ static void mlxsw_sp_switchdev_bridge_fdb_event_work(struct work_struct *work)
 
 out:
 	rtnl_unlock();
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
 }
@@ -3253,13 +3252,8 @@ static int mlxsw_sp_switchdev_event(struct notifier_block *unused,
 					info);
 		INIT_WORK(&switchdev_work->work,
 			  mlxsw_sp_switchdev_bridge_fdb_event_work);
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		/* Take a reference on the device. This can be either
 		 * upper device containig mlxsw_sp_port or just a
 		 * mlxsw_sp_port
@@ -3286,7 +3280,6 @@ static int mlxsw_sp_switchdev_event(struct notifier_block *unused,
 	return NOTIFY_DONE;
 
 err_vxlan_work_prepare:
-err_addr_alloc:
 	kfree(switchdev_work);
 	return NOTIFY_BAD;
 }
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c
index 9a8e4f201eb1..3dcb6a887ea5 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_mactable.c
@@ -279,7 +279,7 @@ static void sparx5_fdb_call_notifiers(enum switchdev_notifier_type type,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = mac;
+	ether_addr_copy(info.addr, mac);
 	info.vid = vid;
 	info.offloaded = offloaded;
 	call_switchdev_notifiers(type, dev, &info.info, NULL);
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
index 649ca609884a..5c5eb557a19c 100644
--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
@@ -254,7 +254,6 @@ static void sparx5_switchdev_bridge_fdb_event_work(struct work_struct *work)
 
 out:
 	rtnl_unlock();
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(dev);
 }
@@ -294,14 +293,8 @@ static int sparx5_switchdev_event(struct notifier_block *unused,
 					info);
 		INIT_WORK(&switchdev_work->work,
 			  sparx5_switchdev_bridge_fdb_event_work);
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(dev);
 
 		sparx5_schedule_work(&switchdev_work->work);
@@ -309,9 +302,6 @@ static int sparx5_switchdev_event(struct notifier_block *unused,
 	}
 
 	return NOTIFY_DONE;
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static void sparx5_sync_port_dev_addr(struct sparx5 *sparx5,
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index ba4062881eed..e5ab9f577808 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2720,7 +2720,7 @@ rocker_fdb_offload_notify(struct rocker_port *rocker_port,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = recv_info->addr;
+	ether_addr_copy(info.addr, recv_info->addr);
 	info.vid = recv_info->vid;
 	info.offloaded = true;
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
@@ -2759,7 +2759,6 @@ static void rocker_switchdev_event_work(struct work_struct *work)
 	}
 	rtnl_unlock();
 
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(rocker_port->dev);
 }
@@ -2791,16 +2790,8 @@ static int rocker_switchdev_event(struct notifier_block *unused,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (unlikely(!switchdev_work->fdb_info.addr)) {
-			kfree(switchdev_work);
-			return NOTIFY_BAD;
-		}
-
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		/* Take a reference on the rocker device */
 		dev_hold(dev);
 		break;
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index 3e1ca7a8d029..abdb4b49add1 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -1824,7 +1824,7 @@ static void ofdpa_port_fdb_learn_work(struct work_struct *work)
 	bool learned = (lw->flags & OFDPA_OP_FLAG_LEARNED);
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = lw->addr;
+	ether_addr_copy(info.addr, lw->addr);
 	info.vid = lw->vid;
 
 	rtnl_lock();
diff --git a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
index 599708a3e81d..860214e1a8ca 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-switchdev.c
@@ -360,7 +360,7 @@ static void am65_cpsw_fdb_offload_notify(struct net_device *ndev,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = rcv->addr;
+	ether_addr_copy(info.addr, rcv->addr);
 	info.vid = rcv->vid;
 	info.offloaded = true;
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
@@ -414,7 +414,6 @@ static void am65_cpsw_switchdev_event_work(struct work_struct *work)
 	}
 	rtnl_unlock();
 
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(port->ndev);
 }
@@ -450,13 +449,8 @@ static int am65_cpsw_switchdev_event(struct notifier_block *unused,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(ndev);
 		break;
 	default:
@@ -467,10 +461,6 @@ static int am65_cpsw_switchdev_event(struct notifier_block *unused,
 	queue_work(system_long_wq, &switchdev_work->work);
 
 	return NOTIFY_DONE;
-
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static struct notifier_block cpsw_switchdev_notifier = {
diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c
index a7d97d429e06..786bb848ddeb 100644
--- a/drivers/net/ethernet/ti/cpsw_switchdev.c
+++ b/drivers/net/ethernet/ti/cpsw_switchdev.c
@@ -370,7 +370,7 @@ static void cpsw_fdb_offload_notify(struct net_device *ndev,
 {
 	struct switchdev_notifier_fdb_info info = {};
 
-	info.addr = rcv->addr;
+	ether_addr_copy(info.addr, rcv->addr);
 	info.vid = rcv->vid;
 	info.offloaded = true;
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
@@ -424,7 +424,6 @@ static void cpsw_switchdev_event_work(struct work_struct *work)
 	}
 	rtnl_unlock();
 
-	kfree(switchdev_work->fdb_info.addr);
 	kfree(switchdev_work);
 	dev_put(priv->ndev);
 }
@@ -460,13 +459,8 @@ static int cpsw_switchdev_event(struct notifier_block *unused,
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		memcpy(&switchdev_work->fdb_info, ptr,
+		memcpy(&switchdev_work->fdb_info, fdb_info,
 		       sizeof(switchdev_work->fdb_info));
-		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-		if (!switchdev_work->fdb_info.addr)
-			goto err_addr_alloc;
-		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
-				fdb_info->addr);
 		dev_hold(ndev);
 		break;
 	default:
@@ -477,10 +471,6 @@ static int cpsw_switchdev_event(struct notifier_block *unused,
 	queue_work(system_long_wq, &switchdev_work->work);
 
 	return NOTIFY_DONE;
-
-err_addr_alloc:
-	kfree(switchdev_work);
-	return NOTIFY_BAD;
 }
 
 static struct notifier_block cpsw_switchdev_notifier = {
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 0347fc184786..deb8e3889f7e 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -283,7 +283,6 @@ static void qeth_l2_dev2br_fdb_flush(struct qeth_card *card)
 
 	QETH_CARD_TEXT(card, 2, "fdbflush");
 
-	info.addr = NULL;
 	/* flush all VLANs: */
 	info.vid = 0;
 	info.added_by_user = false;
@@ -637,20 +636,18 @@ static void qeth_l2_dev2br_fdb_notify(struct qeth_card *card, u8 code,
 				      struct mac_addr_lnid *addr_lnid)
 {
 	struct switchdev_notifier_fdb_info info = {};
-	u8 ntfy_mac[ETH_ALEN];
 
-	ether_addr_copy(ntfy_mac, addr_lnid->mac);
 	/* Ignore VLAN only changes */
 	if (!(code & IPA_ADDR_CHANGE_CODE_MACADDR))
 		return;
 	/* Ignore mcast entries */
-	if (is_multicast_ether_addr(ntfy_mac))
+	if (is_multicast_ether_addr(addr_lnid->mac))
 		return;
 	/* Ignore my own addresses */
 	if (qeth_is_my_net_if_token(card, token))
 		return;
 
-	info.addr = ntfy_mac;
+	ether_addr_copy(info.addr, addr_lnid->mac);
 	/* don't report VLAN IDs */
 	info.vid = 0;
 	info.added_by_user = false;
@@ -661,13 +658,13 @@ static void qeth_l2_dev2br_fdb_notify(struct qeth_card *card, u8 code,
 					 card->dev, &info.info, NULL);
 		QETH_CARD_TEXT(card, 4, "andelmac");
 		QETH_CARD_TEXT_(card, 4,
-				"mc%012llx", ether_addr_to_u64(ntfy_mac));
+				"mc%012llx", ether_addr_to_u64(info.addr));
 	} else {
 		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE,
 					 card->dev, &info.info, NULL);
 		QETH_CARD_TEXT(card, 4, "anaddmac");
 		QETH_CARD_TEXT_(card, 4,
-				"mc%012llx", ether_addr_to_u64(ntfy_mac));
+				"mc%012llx", ether_addr_to_u64(info.addr));
 	}
 }
 
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 60d806b6a5ae..6764fb7692e2 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -218,7 +218,7 @@ struct switchdev_notifier_info {
 
 struct switchdev_notifier_fdb_info {
 	struct switchdev_notifier_info info; /* must be first */
-	const unsigned char *addr;
+	unsigned char addr[ETH_ALEN];
 	u16 vid;
 	u8 added_by_user:1,
 	   is_local:1,
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 2fbe881cdfe2..f58fb06ae641 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -129,7 +129,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br,
 {
 	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
 
-	item->addr = fdb->key.addr.addr;
+	ether_addr_copy(item->addr, fdb->key.addr.addr);
 	item->vid = fdb->key.vlan_id;
 	item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 	item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index adcfb2cb4e61..54d53e18a211 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2395,7 +2395,7 @@ dsa_fdb_offload_notify(struct dsa_switchdev_event_work *switchdev_work)
 	if (!dsa_is_user_port(ds, switchdev_work->port))
 		return;
 
-	info.addr = switchdev_work->addr;
+	ether_addr_copy(info.addr, switchdev_work->addr);
 	info.vid = switchdev_work->vid;
 	info.offloaded = true;
 	dp = dsa_to_port(ds, switchdev_work->port);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 08/15] net: bridge: take the hash_lock inside fdb_add_entry
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (6 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 07/15] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 09/15] net: bridge: rename fdb_notify to br_fdb_notify Vladimir Oltean
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

In preparation for a future change where fdb_add_entry() will do some of
its work without &br->hash_lock held, make the noisy change of taking
that lock inside the function as a separate patch.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 9c57040d8341..421b8960945a 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -893,19 +893,27 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 			return -EINVAL;
 	}
 
+	spin_lock_bh(&br->hash_lock);
+
 	fdb = br_fdb_find(br, addr, vid);
 	if (fdb == NULL) {
-		if (!(flags & NLM_F_CREATE))
+		if (!(flags & NLM_F_CREATE)) {
+			spin_unlock_bh(&br->hash_lock);
 			return -ENOENT;
+		}
 
 		fdb = fdb_create(br, source, addr, vid, 0);
-		if (!fdb)
+		if (!fdb) {
+			spin_unlock_bh(&br->hash_lock);
 			return -ENOMEM;
+		}
 
 		modified = true;
 	} else {
-		if (flags & NLM_F_EXCL)
+		if (flags & NLM_F_EXCL) {
+			spin_unlock_bh(&br->hash_lock);
 			return -EEXIST;
+		}
 
 		if (READ_ONCE(fdb->dst) != source) {
 			WRITE_ONCE(fdb->dst, source);
@@ -948,6 +956,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 		fdb_notify(br, fdb, RTM_NEWNEIGH, true);
 	}
 
+	spin_unlock_bh(&br->hash_lock);
+
 	return 0;
 }
 
@@ -980,9 +990,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 		}
 		err = br_fdb_external_learn_add(br, p, addr, vid, true);
 	} else {
-		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
-		spin_unlock_bh(&br->hash_lock);
 	}
 
 	return err;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 09/15] net: bridge: rename fdb_notify to br_fdb_notify
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (7 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 08/15] net: bridge: take the hash_lock inside fdb_add_entry Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 10/15] net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

This change creates a wrapper over fdb_notify() called br_fdb_notify(),
with the same behavior.

A future change will introduce another variant, br_fdb_notify_async(),
that actually cares about the deferred return code of
br_switchdev_fdb_notify().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 421b8960945a..2095bdf24e42 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -178,16 +178,12 @@ static inline size_t fdb_nlmsg_size(void)
 }
 
 static void fdb_notify(struct net_bridge *br,
-		       const struct net_bridge_fdb_entry *fdb, int type,
-		       bool swdev_notify)
+		       const struct net_bridge_fdb_entry *fdb, int type)
 {
 	struct net *net = dev_net(br->dev);
 	struct sk_buff *skb;
 	int err = -ENOBUFS;
 
-	if (swdev_notify)
-		br_switchdev_fdb_notify(br, fdb, type);
-
 	skb = nlmsg_new(fdb_nlmsg_size(), GFP_ATOMIC);
 	if (skb == NULL)
 		goto errout;
@@ -205,6 +201,16 @@ static void fdb_notify(struct net_bridge *br,
 	rtnl_set_sk_err(net, RTNLGRP_NEIGH, err);
 }
 
+static void br_fdb_notify(struct net_bridge *br,
+			  const struct net_bridge_fdb_entry *fdb, int type,
+			  bool swdev_notify)
+{
+	if (swdev_notify)
+		br_switchdev_fdb_notify(br, fdb, type);
+
+	fdb_notify(br, fdb, type);
+}
+
 static struct net_bridge_fdb_entry *fdb_find_rcu(struct rhashtable *tbl,
 						 const unsigned char *addr,
 						 __u16 vid)
@@ -322,7 +328,7 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
 	hlist_del_init_rcu(&f->fdb_node);
 	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
 			       br_fdb_rht_params);
-	fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
+	br_fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
 	call_rcu(&f->rcu, fdb_rcu_free);
 }
 
@@ -428,7 +434,7 @@ static int fdb_add_local(struct net_bridge *br, struct net_bridge_port *source,
 		return -ENOMEM;
 
 	fdb_add_hw_addr(br, addr);
-	fdb_notify(br, fdb, RTM_NEWNEIGH, true);
+	br_fdb_notify(br, fdb, RTM_NEWNEIGH, true);
 	return 0;
 }
 
@@ -534,7 +540,7 @@ void br_fdb_cleanup(struct work_struct *work)
 							 this_timer - now);
 				else if (!test_and_set_bit(BR_FDB_NOTIFY_INACTIVE,
 							   &f->flags))
-					fdb_notify(br, f, RTM_NEWNEIGH, false);
+					br_fdb_notify(br, f, RTM_NEWNEIGH, false);
 			}
 			continue;
 		}
@@ -739,7 +745,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 				set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 			if (unlikely(fdb_modified)) {
 				trace_br_fdb_update(br, source, addr, vid, flags);
-				fdb_notify(br, fdb, RTM_NEWNEIGH, true);
+				br_fdb_notify(br, fdb, RTM_NEWNEIGH, true);
 			}
 		}
 	} else {
@@ -747,7 +753,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 		fdb = fdb_create(br, source, addr, vid, flags);
 		if (fdb) {
 			trace_br_fdb_update(br, source, addr, vid, flags);
-			fdb_notify(br, fdb, RTM_NEWNEIGH, true);
+			br_fdb_notify(br, fdb, RTM_NEWNEIGH, true);
 		}
 		/* else  we lose race and someone else inserts
 		 * it first, don't bother updating
@@ -953,7 +959,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 	if (modified) {
 		if (refresh)
 			fdb->updated = jiffies;
-		fdb_notify(br, fdb, RTM_NEWNEIGH, true);
+		br_fdb_notify(br, fdb, RTM_NEWNEIGH, true);
 	}
 
 	spin_unlock_bh(&br->hash_lock);
@@ -1240,7 +1246,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			err = -ENOMEM;
 			goto err_unlock;
 		}
-		fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
+		br_fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
 	} else {
 		fdb->updated = jiffies;
 
@@ -1265,7 +1271,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 			set_bit(BR_FDB_LOCAL, &fdb->flags);
 
 		if (modified)
-			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
+			br_fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
 	}
 
 err_unlock:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 10/15] net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (8 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 09/15] net: bridge: rename fdb_notify to br_fdb_notify Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 11/15] net: bridge: make fdb_add_entry() wait for switchdev feedback Vladimir Oltean
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

To reduce code churn, the same patch makes multiple changes, since they
all touch the same lines:

1. The implementations for these two are identical, just with different
   function pointers. Reduce the duplication and rename the function
   pointers from "add_cb"/"del_cb" to "mod_cb". Pass the event as an
   argument.

2. Drop the "const" attribute from "orig_dev". If the driver needs to
   check whether orig_dev belongs to itself and then
   call_switchdev_notifiers(orig_dev, SWITCHDEV_FDB_OFFLOADED), it
   can't, because call_switchdev_notifiers takes a non-const struct
   net_device *.
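
As an illustration of what the merged helper looks like from a driver's
perspective, here is a rough sketch (hypothetical foo_* names, not code
from this series). The single "mod_cb" switches on the event it is passed:

    static int foo_port_fdb_event(struct net_device *dev,
                                  struct net_device *orig_dev,
                                  unsigned long event, const void *ctx,
                                  const struct switchdev_notifier_fdb_info *fdb_info)
    {
            /* One callback now serves both the add and the del direction */
            switch (event) {
            case SWITCHDEV_FDB_ADD_TO_DEVICE:
                    return foo_port_fdb_add(dev, fdb_info->addr, fdb_info->vid);
            case SWITCHDEV_FDB_DEL_TO_DEVICE:
                    return foo_port_fdb_del(dev, fdb_info->addr, fdb_info->vid);
            default:
                    return -EOPNOTSUPP;
            }
    }

and in the driver's switchdev notifier handler, the two events collapse
into a single call:

    case SWITCHDEV_FDB_ADD_TO_DEVICE:
    case SWITCHDEV_FDB_DEL_TO_DEVICE:
            err = switchdev_handle_fdb_event_to_device(dev, event, ptr,
                                                       foo_dev_check,
                                                       foo_foreign_dev_check,
                                                       foo_port_fdb_event,
                                                       NULL);
            return notifier_from_errno(err);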

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/switchdev.h   |  48 +++---------
 net/dsa/slave.c           |  41 ++--------
 net/switchdev/switchdev.c | 156 ++++++--------------------------------
 3 files changed, 43 insertions(+), 202 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 6764fb7692e2..559f63abc15b 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -299,28 +299,16 @@ void switchdev_port_fwd_mark_set(struct net_device *dev,
 				 struct net_device *group_dev,
 				 bool joining);
 
-int switchdev_handle_fdb_add_to_device(struct net_device *dev,
+int switchdev_handle_fdb_event_to_device(struct net_device *dev, unsigned long event,
 		const struct switchdev_notifier_fdb_info *fdb_info,
 		bool (*check_cb)(const struct net_device *dev),
 		bool (*foreign_dev_check_cb)(const struct net_device *dev,
 					     const struct net_device *foreign_dev),
-		int (*add_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
+		int (*mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+			      unsigned long event, const void *ctx,
 			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_add_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
-				  const struct switchdev_notifier_fdb_info *fdb_info));
-
-int switchdev_handle_fdb_del_to_device(struct net_device *dev,
-		const struct switchdev_notifier_fdb_info *fdb_info,
-		bool (*check_cb)(const struct net_device *dev),
-		bool (*foreign_dev_check_cb)(const struct net_device *dev,
-					     const struct net_device *foreign_dev),
-		int (*del_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
-			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_del_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
+		int (*lag_mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+				  unsigned long event, const void *ctx,
 				  const struct switchdev_notifier_fdb_info *fdb_info));
 
 int switchdev_handle_port_obj_add(struct net_device *dev,
@@ -426,32 +414,16 @@ call_switchdev_blocking_notifiers(unsigned long val,
 }
 
 static inline int
-switchdev_handle_fdb_add_to_device(struct net_device *dev,
-		const struct switchdev_notifier_fdb_info *fdb_info,
-		bool (*check_cb)(const struct net_device *dev),
-		bool (*foreign_dev_check_cb)(const struct net_device *dev,
-					     const struct net_device *foreign_dev),
-		int (*add_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
-			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_add_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
-				  const struct switchdev_notifier_fdb_info *fdb_info))
-{
-	return 0;
-}
-
-static inline int
-switchdev_handle_fdb_del_to_device(struct net_device *dev,
+switchdev_handle_fdb_event_to_device(struct net_device *dev, unsigned long event,
 		const struct switchdev_notifier_fdb_info *fdb_info,
 		bool (*check_cb)(const struct net_device *dev),
 		bool (*foreign_dev_check_cb)(const struct net_device *dev,
 					     const struct net_device *foreign_dev),
-		int (*del_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
+		int (*mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+			      unsigned long event, const void *ctx,
 			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_del_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
+		int (*lag_mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+				  unsigned long event, const void *ctx,
 				  const struct switchdev_notifier_fdb_info *fdb_info))
 {
 	return 0;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 54d53e18a211..af573d16dff5 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2469,10 +2469,9 @@ static bool dsa_foreign_dev_check(const struct net_device *dev,
 }
 
 static int dsa_slave_fdb_event(struct net_device *dev,
-			       const struct net_device *orig_dev,
-			       const void *ctx,
-			       const struct switchdev_notifier_fdb_info *fdb_info,
-			       unsigned long event)
+			       struct net_device *orig_dev,
+			       unsigned long event, const void *ctx,
+			       const struct switchdev_notifier_fdb_info *fdb_info)
 {
 	struct dsa_switchdev_event_work *switchdev_work;
 	struct dsa_port *dp = dsa_slave_to_port(dev);
@@ -2528,24 +2527,6 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 	return 0;
 }
 
-static int
-dsa_slave_fdb_add_to_device(struct net_device *dev,
-			    const struct net_device *orig_dev, const void *ctx,
-			    const struct switchdev_notifier_fdb_info *fdb_info)
-{
-	return dsa_slave_fdb_event(dev, orig_dev, ctx, fdb_info,
-				   SWITCHDEV_FDB_ADD_TO_DEVICE);
-}
-
-static int
-dsa_slave_fdb_del_to_device(struct net_device *dev,
-			    const struct net_device *orig_dev, const void *ctx,
-			    const struct switchdev_notifier_fdb_info *fdb_info)
-{
-	return dsa_slave_fdb_event(dev, orig_dev, ctx, fdb_info,
-				   SWITCHDEV_FDB_DEL_TO_DEVICE);
-}
-
 /* Called under rcu_read_lock() */
 static int dsa_slave_switchdev_event(struct notifier_block *unused,
 				     unsigned long event, void *ptr)
@@ -2560,18 +2541,12 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 						     dsa_slave_port_attr_set);
 		return notifier_from_errno(err);
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
-		err = switchdev_handle_fdb_add_to_device(dev, ptr,
-							 dsa_slave_dev_check,
-							 dsa_foreign_dev_check,
-							 dsa_slave_fdb_add_to_device,
-							 NULL);
-		return notifier_from_errno(err);
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
-		err = switchdev_handle_fdb_del_to_device(dev, ptr,
-							 dsa_slave_dev_check,
-							 dsa_foreign_dev_check,
-							 dsa_slave_fdb_del_to_device,
-							 NULL);
+		err = switchdev_handle_fdb_event_to_device(dev, event, ptr,
+							   dsa_slave_dev_check,
+							   dsa_foreign_dev_check,
+							   dsa_slave_fdb_event,
+							   NULL);
 		return notifier_from_errno(err);
 	default:
 		return NOTIFY_DONE;
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 0b2c18efc079..83460470e883 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -428,17 +428,17 @@ switchdev_lower_dev_find(struct net_device *dev,
 	return switchdev_priv.lower_dev;
 }
 
-static int __switchdev_handle_fdb_add_to_device(struct net_device *dev,
-		const struct net_device *orig_dev,
+static int __switchdev_handle_fdb_event_to_device(struct net_device *dev,
+		struct net_device *orig_dev, unsigned long event,
 		const struct switchdev_notifier_fdb_info *fdb_info,
 		bool (*check_cb)(const struct net_device *dev),
 		bool (*foreign_dev_check_cb)(const struct net_device *dev,
 					     const struct net_device *foreign_dev),
-		int (*add_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
+		int (*mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+			      unsigned long event, const void *ctx,
 			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_add_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
+		int (*lag_mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+				  unsigned long event, const void *ctx,
 				  const struct switchdev_notifier_fdb_info *fdb_info))
 {
 	const struct switchdev_notifier_info *info = &fdb_info->info;
@@ -447,17 +447,17 @@ static int __switchdev_handle_fdb_add_to_device(struct net_device *dev,
 	int err = -EOPNOTSUPP;
 
 	if (check_cb(dev))
-		return add_cb(dev, orig_dev, info->ctx, fdb_info);
+		return mod_cb(dev, orig_dev, event, info->ctx, fdb_info);
 
 	if (netif_is_lag_master(dev)) {
 		if (!switchdev_lower_dev_find(dev, check_cb, foreign_dev_check_cb))
 			goto maybe_bridged_with_us;
 
 		/* This is a LAG interface that we offload */
-		if (!lag_add_cb)
+		if (!lag_mod_cb)
 			return -EOPNOTSUPP;
 
-		return lag_add_cb(dev, orig_dev, info->ctx, fdb_info);
+		return lag_mod_cb(dev, orig_dev, event, info->ctx, fdb_info);
 	}
 
 	/* Recurse through lower interfaces in case the FDB entry is pointing
@@ -481,10 +481,10 @@ static int __switchdev_handle_fdb_add_to_device(struct net_device *dev,
 						      foreign_dev_check_cb))
 				continue;
 
-			err = __switchdev_handle_fdb_add_to_device(lower_dev, orig_dev,
-								   fdb_info, check_cb,
-								   foreign_dev_check_cb,
-								   add_cb, lag_add_cb);
+			err = __switchdev_handle_fdb_event_to_device(lower_dev, orig_dev,
+								     event, fdb_info, check_cb,
+								     foreign_dev_check_cb,
+								     mod_cb, lag_mod_cb);
 			if (err && err != -EOPNOTSUPP)
 				return err;
 		}
@@ -503,140 +503,34 @@ static int __switchdev_handle_fdb_add_to_device(struct net_device *dev,
 	if (!switchdev_lower_dev_find(br, check_cb, foreign_dev_check_cb))
 		return 0;
 
-	return __switchdev_handle_fdb_add_to_device(br, orig_dev, fdb_info,
-						    check_cb, foreign_dev_check_cb,
-						    add_cb, lag_add_cb);
+	return __switchdev_handle_fdb_event_to_device(br, orig_dev, event, fdb_info,
+						      check_cb, foreign_dev_check_cb,
+						      mod_cb, lag_mod_cb);
 }
 
-int switchdev_handle_fdb_add_to_device(struct net_device *dev,
+int switchdev_handle_fdb_event_to_device(struct net_device *dev, unsigned long event,
 		const struct switchdev_notifier_fdb_info *fdb_info,
 		bool (*check_cb)(const struct net_device *dev),
 		bool (*foreign_dev_check_cb)(const struct net_device *dev,
 					     const struct net_device *foreign_dev),
-		int (*add_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
+		int (*mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+			      unsigned long event, const void *ctx,
 			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_add_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
+		int (*lag_mod_cb)(struct net_device *dev, struct net_device *orig_dev,
+				  unsigned long event, const void *ctx,
 				  const struct switchdev_notifier_fdb_info *fdb_info))
 {
 	int err;
 
-	err = __switchdev_handle_fdb_add_to_device(dev, dev, fdb_info,
-						   check_cb,
-						   foreign_dev_check_cb,
-						   add_cb, lag_add_cb);
+	err = __switchdev_handle_fdb_event_to_device(dev, dev, event, fdb_info,
+						     check_cb, foreign_dev_check_cb,
+						     mod_cb, lag_mod_cb);
 	if (err == -EOPNOTSUPP)
 		err = 0;
 
 	return err;
 }
-EXPORT_SYMBOL_GPL(switchdev_handle_fdb_add_to_device);
-
-static int __switchdev_handle_fdb_del_to_device(struct net_device *dev,
-		const struct net_device *orig_dev,
-		const struct switchdev_notifier_fdb_info *fdb_info,
-		bool (*check_cb)(const struct net_device *dev),
-		bool (*foreign_dev_check_cb)(const struct net_device *dev,
-					     const struct net_device *foreign_dev),
-		int (*del_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
-			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_del_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
-				  const struct switchdev_notifier_fdb_info *fdb_info))
-{
-	const struct switchdev_notifier_info *info = &fdb_info->info;
-	struct net_device *br, *lower_dev;
-	struct list_head *iter;
-	int err = -EOPNOTSUPP;
-
-	if (check_cb(dev))
-		return del_cb(dev, orig_dev, info->ctx, fdb_info);
-
-	if (netif_is_lag_master(dev)) {
-		if (!switchdev_lower_dev_find(dev, check_cb, foreign_dev_check_cb))
-			goto maybe_bridged_with_us;
-
-		/* This is a LAG interface that we offload */
-		if (!lag_del_cb)
-			return -EOPNOTSUPP;
-
-		return lag_del_cb(dev, orig_dev, info->ctx, fdb_info);
-	}
-
-	/* Recurse through lower interfaces in case the FDB entry is pointing
-	 * towards a bridge device.
-	 */
-	if (netif_is_bridge_master(dev)) {
-		if (!switchdev_lower_dev_find(dev, check_cb, foreign_dev_check_cb))
-			return 0;
-
-		/* This is a bridge interface that we offload */
-		netdev_for_each_lower_dev(dev, lower_dev, iter) {
-			/* Do not propagate FDB entries across bridges */
-			if (netif_is_bridge_master(lower_dev))
-				continue;
-
-			/* Bridge ports might be either us, or LAG interfaces
-			 * that we offload.
-			 */
-			if (!check_cb(lower_dev) &&
-			    !switchdev_lower_dev_find(lower_dev, check_cb,
-						      foreign_dev_check_cb))
-				continue;
-
-			err = __switchdev_handle_fdb_del_to_device(lower_dev, orig_dev,
-								   fdb_info, check_cb,
-								   foreign_dev_check_cb,
-								   del_cb, lag_del_cb);
-			if (err && err != -EOPNOTSUPP)
-				return err;
-		}
-
-		return 0;
-	}
-
-maybe_bridged_with_us:
-	/* Event is neither on a bridge nor a LAG. Check whether it is on an
-	 * interface that is in a bridge with us.
-	 */
-	br = netdev_master_upper_dev_get_rcu(dev);
-	if (!br || !netif_is_bridge_master(br))
-		return 0;
-
-	if (!switchdev_lower_dev_find(br, check_cb, foreign_dev_check_cb))
-		return 0;
-
-	return __switchdev_handle_fdb_del_to_device(br, orig_dev, fdb_info,
-						    check_cb, foreign_dev_check_cb,
-						    del_cb, lag_del_cb);
-}
-
-int switchdev_handle_fdb_del_to_device(struct net_device *dev,
-		const struct switchdev_notifier_fdb_info *fdb_info,
-		bool (*check_cb)(const struct net_device *dev),
-		bool (*foreign_dev_check_cb)(const struct net_device *dev,
-					     const struct net_device *foreign_dev),
-		int (*del_cb)(struct net_device *dev,
-			      const struct net_device *orig_dev, const void *ctx,
-			      const struct switchdev_notifier_fdb_info *fdb_info),
-		int (*lag_del_cb)(struct net_device *dev,
-				  const struct net_device *orig_dev, const void *ctx,
-				  const struct switchdev_notifier_fdb_info *fdb_info))
-{
-	int err;
-
-	err = __switchdev_handle_fdb_del_to_device(dev, dev, fdb_info,
-						   check_cb,
-						   foreign_dev_check_cb,
-						   del_cb, lag_del_cb);
-	if (err == -EOPNOTSUPP)
-		err = 0;
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(switchdev_handle_fdb_del_to_device);
+EXPORT_SYMBOL_GPL(switchdev_handle_fdb_event_to_device);
 
 static int __switchdev_handle_port_obj_add(struct net_device *dev,
 			struct switchdev_notifier_port_obj_info *port_obj_info,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 11/15] net: bridge: make fdb_add_entry() wait for switchdev feedback
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (9 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 10/15] net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 12/15] net: rtnetlink: pass extack to .ndo_fdb_del Vladimir Oltean
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

At the moment, switchdev gets notified of FDB entries via the
br_switchdev_fdb_notify() function, which is called from fdb_notify().
In turn, fdb_notify is called from a wide variety of places, mixing data
path learning, STP timers, sysfs handlers, netlink IFLA_BRPORT_FLUSH
handlers, RTM_NEWNEIGH/RTM_DELNEIGH handlers, FDB garbage collection
timers, and others.

The common denominator is that FDB entries are created and added to the
bridge hash table under the br->hash_lock. And so is the switchdev
notification.

Because SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events are notified on the
switchdev atomic notifier chain and all drivers need sleepable context
to offload those FDB entries, the current situation is that all drivers
register a private workqueue on which they schedule work items from the
atomic switchdev notifier chain. The practical implication is that they
can't return error codes and extack messages from their private
workqueue, and even if they could (or if they would return an error
directly from the atomic notifier), the bridge would still ignore those.

We're not structurally changing anything, because there are reasons why
things are the way they are (the theoretical possibility of better
performance, basically). Instead, this patch just adds a mechanism, based
on a completion structure, by which the bridge (and therefore user space)
can wait for the driver's deferred work item to finish and retrieve its
error code.

It works as follows:

- br_switchdev_fdb_notify() behaves as before, but we introduce a new
  br_switchdev_fdb_notify_async() which carries some extra bridge-provided
  context. The functions which don't care what switchdev has to say keep
  calling br_switchdev_fdb_notify(); the others (for now just
  fdb_add_entry(), the function that processes RTM_NEWNEIGH) call
  br_switchdev_fdb_notify_async().

- every function that calls br_switchdev_fdb_notify_async() must declare
  a struct br_switchdev_fdb_wait_ctx on the stack. This is the storage for
  the completion structure. Then br_switchdev_fdb_notify_async() will
  create a special struct switchdev_notifier_fdb_info that contains some
  function pointers that wake up the bridge process waiting for this
  completion.

- every function that calls br_switchdev_fdb_notify_async() under
  br->hash_lock must release this lock before it can sleep waiting for
  the completion; then it has to take the lock again and look up the
  FDB entry once more.

- in the case of fdb_add_entry(), we have nothing to do if switchdev was
  happy; otherwise we need to take the hash_lock again and delete the
  FDB entry we've just created. We may not find the FDB entry we were
  trying to delete, due to factors such as an ultra-short ageing time.
  In those cases we do nothing. The rollback logic for fdb_add_entry()
  is copied from fdb_delete_by_addr_and_port(), except that here we don't
  notify switchdev, since we know it doesn't contain that entry and
  doesn't want it. I've renamed the existing fdb_delete_by_addr_and_port()
  function to fdb_delete_by_addr_and_port_switchdev().

The API exposed to switchdev drivers is comprised of two functions:

* switchdev_fdb_mark_pending() must be called from atomic context, to
  tell the bridge to wait
* switchdev_fdb_mark_done() can be called from any context; it tells
  the bridge to stop waiting

When nobody has called switchdev_fdb_mark_pending(), the bridge doesn't
wait. There is no race condition here, because the atomic notifiers have
finished by the time br_switchdev_fdb_notify_async() finishes.
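
To make the intended driver-side usage concrete, here is a rough sketch of
how an atomic notifier handler and its deferred work item would cooperate
with this mechanism. The foo_* names and the work struct layout are made
up for illustration (modeled on the existing drivers); this is not the
actual conversion done later in the series:

    /* Atomic context: the switchdev notifier handler */
    case SWITCHDEV_FDB_ADD_TO_DEVICE:
    case SWITCHDEV_FDB_DEL_TO_DEVICE:
            /* The copy carries the bridge's pending_fn/done_fn/cookie along */
            memcpy(&switchdev_work->fdb_info, fdb_info,
                   sizeof(switchdev_work->fdb_info));
            /* Tell the bridge that a deferred answer will follow */
            switchdev_fdb_mark_pending(fdb_info);
            dev_hold(dev);
            queue_work(foo_owq, &switchdev_work->work);
            break;

    /* Sleepable context: the deferred work item */
    static void foo_switchdev_event_work(struct work_struct *work)
    {
            struct foo_switchdev_event_work *switchdev_work =
                    container_of(work, struct foo_switchdev_event_work, work);
            int err;

            rtnl_lock();
            err = foo_port_fdb_program(switchdev_work);
            rtnl_unlock();

            /* Wake up the waiting bridge and report the outcome */
            switchdev_fdb_mark_done(&switchdev_work->fdb_info, err);

            dev_put(switchdev_work->dev);
            kfree(switchdev_work);
    }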

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 include/net/switchdev.h   |  17 +++++
 net/bridge/br_fdb.c       | 138 +++++++++++++++++++++++++++++++++++---
 net/bridge/br_private.h   |  16 +++++
 net/bridge/br_switchdev.c |  34 ++++++++--
 4 files changed, 190 insertions(+), 15 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 559f63abc15b..67f7b22e5332 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -218,6 +218,9 @@ struct switchdev_notifier_info {
 
 struct switchdev_notifier_fdb_info {
 	struct switchdev_notifier_info info; /* must be first */
+	void (*pending_fn)(unsigned long cookie);
+	void (*done_fn)(unsigned long cookie, int err);
+	unsigned long cookie;
 	unsigned char addr[ETH_ALEN];
 	u16 vid;
 	u8 added_by_user:1,
@@ -260,6 +263,20 @@ switchdev_fdb_is_dynamically_learned(const struct switchdev_notifier_fdb_info *f
 	return !fdb_info->added_by_user && !fdb_info->is_local;
 }
 
+static inline void
+switchdev_fdb_mark_pending(const struct switchdev_notifier_fdb_info *fdb_info)
+{
+	if (fdb_info->pending_fn)
+		fdb_info->pending_fn(fdb_info->cookie);
+}
+
+static inline void
+switchdev_fdb_mark_done(const struct switchdev_notifier_fdb_info *fdb_info, int err)
+{
+	if (fdb_info->done_fn)
+		fdb_info->done_fn(fdb_info->cookie, err);
+}
+
 #ifdef CONFIG_NET_SWITCHDEV
 
 int switchdev_bridge_port_offload(struct net_device *brport_dev,
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 2095bdf24e42..e8afe64dadcc 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -201,6 +201,83 @@ static void fdb_notify(struct net_bridge *br,
 	rtnl_set_sk_err(net, RTNLGRP_NEIGH, err);
 }
 
+struct br_switchdev_fdb_wait_ctx {
+	/* Serializes switchdev driver calls to switchdev_fdb_mark_done */
+	struct mutex lock;
+	struct completion done;
+	bool pending;
+	int pending_count;
+	int err;
+};
+
+static void
+br_switchdev_fdb_wait_ctx_init(struct br_switchdev_fdb_wait_ctx *wait_ctx)
+{
+	wait_ctx->pending = false;
+	wait_ctx->pending_count = 0;
+	wait_ctx->err = 0;
+	mutex_init(&wait_ctx->lock);
+	init_completion(&wait_ctx->done);
+}
+
+static void br_switchdev_fdb_pending(unsigned long cookie)
+{
+	struct br_switchdev_fdb_wait_ctx *wait_ctx;
+
+	wait_ctx = (struct br_switchdev_fdb_wait_ctx *)cookie;
+
+	wait_ctx->pending = true;
+	/* Drivers serialize on the switchdev atomic notifier chain when they
+	 * call switchdev_fdb_mark_pending, so no locking is necessary.
+	 */
+	wait_ctx->pending_count++;
+}
+
+static void br_switchdev_fdb_done(unsigned long cookie, int err)
+{
+	struct br_switchdev_fdb_wait_ctx *wait_ctx;
+	bool done;
+
+	wait_ctx = (struct br_switchdev_fdb_wait_ctx *)cookie;
+
+	/* Potentially multiple drivers might call switchdev_fdb_mark_done,
+	 * each from its own deferred context. So we need to serialize here.
+	 */
+	mutex_lock(&wait_ctx->lock);
+
+	/* Do not overwrite errors with success stories. This preserves the
+	 * last non-zero error code, which may or may not coincide with the
+	 * last extack.
+	 */
+	if (err)
+		wait_ctx->err = err;
+
+	wait_ctx->pending_count--;
+	done = wait_ctx->pending_count == 0;
+
+	mutex_unlock(&wait_ctx->lock);
+
+	if (done)
+		complete(&wait_ctx->done);
+}
+
+static int br_switchdev_fdb_wait(struct br_switchdev_fdb_wait_ctx *wait_ctx)
+{
+	/* If the pending flag isn't set, there's nothing to wait for
+	 * (switchdev not compiled, no driver interested, driver with the
+	 * legacy silent behavior, etc).
+	 * We need a dedicated pending flag as opposed to looking at the
+	 * pending_count, because we'd need a lock to look at the
+	 * pending_count (it's decremented concurrently with us), whereas we
+	 * need no locking to look at the pending flag: it was set (if it was)
+	 * during the atomic notifier call.
+	 */
+	if (wait_ctx->pending)
+		wait_for_completion(&wait_ctx->done);
+
+	return wait_ctx->err;
+}
+
 static void br_fdb_notify(struct net_bridge *br,
 			  const struct net_bridge_fdb_entry *fdb, int type,
 			  bool swdev_notify)
@@ -211,6 +288,18 @@ static void br_fdb_notify(struct net_bridge *br,
 	fdb_notify(br, fdb, type);
 }
 
+static void br_fdb_notify_async(struct net_bridge *br,
+				const struct net_bridge_fdb_entry *fdb,
+				int type, struct netlink_ext_ack *extack,
+				struct br_switchdev_fdb_wait_ctx *wait_ctx)
+{
+	br_switchdev_fdb_notify_async(br, fdb, type, br_switchdev_fdb_pending,
+				      br_switchdev_fdb_done,
+				      (unsigned long)wait_ctx, extack);
+
+	fdb_notify(br, fdb, type);
+}
+
 static struct net_bridge_fdb_entry *fdb_find_rcu(struct rhashtable *tbl,
 						 const unsigned char *addr,
 						 __u16 vid)
@@ -841,6 +930,26 @@ int br_fdb_get(struct sk_buff *skb,
 	return err;
 }
 
+/* Delete an FDB entry and don't notify switchdev */
+static void fdb_delete_by_addr_and_port(struct net_bridge *br,
+					const struct net_bridge_port *p,
+					const u8 *addr, u16 vlan)
+{
+	struct net_bridge_fdb_entry *fdb;
+
+	spin_lock_bh(&br->hash_lock);
+
+	fdb = br_fdb_find(br, addr, vlan);
+	if (!fdb || READ_ONCE(fdb->dst) != p) {
+		spin_unlock_bh(&br->hash_lock);
+		return;
+	}
+
+	fdb_delete(br, fdb, false);
+
+	spin_unlock_bh(&br->hash_lock);
+}
+
 /* returns true if the fdb is modified */
 static bool fdb_handle_notify(struct net_bridge_fdb_entry *fdb, u8 notify)
 {
@@ -868,14 +977,16 @@ static bool fdb_handle_notify(struct net_bridge_fdb_entry *fdb, u8 notify)
 /* Update (create or replace) forwarding database entry */
 static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 			 const u8 *addr, struct ndmsg *ndm, u16 flags, u16 vid,
-			 struct nlattr *nfea_tb[])
+			 struct nlattr *nfea_tb[], struct netlink_ext_ack *extack)
 {
 	bool is_sticky = !!(ndm->ndm_flags & NTF_STICKY);
 	bool refresh = !nfea_tb[NFEA_DONT_REFRESH];
+	struct br_switchdev_fdb_wait_ctx wait_ctx;
 	struct net_bridge_fdb_entry *fdb;
 	u16 state = ndm->ndm_state;
 	bool modified = false;
 	u8 notify = 0;
+	int err;
 
 	/* If the port cannot learn allow only local and static entries */
 	if (source && !(state & NUD_PERMANENT) && !(state & NUD_NOARP) &&
@@ -899,6 +1010,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 			return -EINVAL;
 	}
 
+	br_switchdev_fdb_wait_ctx_init(&wait_ctx);
+
 	spin_lock_bh(&br->hash_lock);
 
 	fdb = br_fdb_find(br, addr, vid);
@@ -959,12 +1072,16 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 	if (modified) {
 		if (refresh)
 			fdb->updated = jiffies;
-		br_fdb_notify(br, fdb, RTM_NEWNEIGH, true);
+		br_fdb_notify_async(br, fdb, RTM_NEWNEIGH, extack, &wait_ctx);
 	}
 
 	spin_unlock_bh(&br->hash_lock);
 
-	return 0;
+	err = br_switchdev_fdb_wait(&wait_ctx);
+	if (err)
+		fdb_delete_by_addr_and_port(br, source, addr, vid);
+
+	return err;
 }
 
 static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
@@ -996,7 +1113,8 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 		}
 		err = br_fdb_external_learn_add(br, p, addr, vid, true);
 	} else {
-		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
+		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb,
+				    extack);
 	}
 
 	return err;
@@ -1090,9 +1208,13 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 	return err;
 }
 
-static int fdb_delete_by_addr_and_port(struct net_bridge *br,
-				       const struct net_bridge_port *p,
-				       const u8 *addr, u16 vlan)
+/* Delete an FDB entry and notify switchdev.
+ * Caller must hold &br->hash_lock.
+ */
+static int
+fdb_delete_by_addr_and_port_switchdev(struct net_bridge *br,
+				      const struct net_bridge_port *p,
+				      const u8 *addr, u16 vlan)
 {
 	struct net_bridge_fdb_entry *fdb;
 
@@ -1112,7 +1234,7 @@ static int __br_fdb_delete(struct net_bridge *br,
 	int err;
 
 	spin_lock_bh(&br->hash_lock);
-	err = fdb_delete_by_addr_and_port(br, p, addr, vid);
+	err = fdb_delete_by_addr_and_port_switchdev(br, p, addr, vid);
 	spin_unlock_bh(&br->hash_lock);
 
 	return err;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 3c9327628060..f5f7501dad7d 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1987,6 +1987,12 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 			       struct netlink_ext_ack *extack);
 void br_switchdev_fdb_notify(struct net_bridge *br,
 			     const struct net_bridge_fdb_entry *fdb, int type);
+void br_switchdev_fdb_notify_async(struct net_bridge *br,
+				   const struct net_bridge_fdb_entry *fdb, int type,
+				   void (*pending_fn)(unsigned long cookie),
+				   void (*done_fn)(unsigned long cookie, int err),
+				   unsigned long cookie,
+				   struct netlink_ext_ack *extack);
 int br_switchdev_port_vlan_add(struct net_device *dev, u16 vid, u16 flags,
 			       struct netlink_ext_ack *extack);
 int br_switchdev_port_vlan_del(struct net_device *dev, u16 vid);
@@ -2073,6 +2079,16 @@ br_switchdev_fdb_notify(struct net_bridge *br,
 {
 }
 
+static inline void
+br_switchdev_fdb_notify_async(struct net_bridge *br,
+			      const struct net_bridge_fdb_entry *fdb, int type,
+			      void (*pending_fn)(unsigned long cookie),
+			      void (*done_fn)(unsigned long cookie, int err),
+			      unsigned long cookie,
+			      struct netlink_ext_ack *extack)
+{
+}
+
 static inline void br_switchdev_frame_unmark(struct sk_buff *skb)
 {
 }
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index f58fb06ae641..6e3040f6f636 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -125,7 +125,10 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
 static void br_switchdev_fdb_populate(struct net_bridge *br,
 				      struct switchdev_notifier_fdb_info *item,
 				      const struct net_bridge_fdb_entry *fdb,
-				      const void *ctx)
+				      const void *ctx,
+				      void (*pending_fn)(unsigned long cookie),
+				      void (*done_fn)(unsigned long cookie, int err),
+				      unsigned long cookie)
 {
 	const struct net_bridge_port *p = READ_ONCE(fdb->dst);
 
@@ -136,28 +139,44 @@ static void br_switchdev_fdb_populate(struct net_bridge *br,
 	item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
 	item->info.dev = (!p || item->is_local) ? br->dev : p->dev;
 	item->info.ctx = ctx;
+	item->pending_fn = pending_fn;
+	item->done_fn = done_fn;
+	item->cookie = cookie;
 }
 
 void
-br_switchdev_fdb_notify(struct net_bridge *br,
-			const struct net_bridge_fdb_entry *fdb, int type)
+br_switchdev_fdb_notify_async(struct net_bridge *br,
+			      const struct net_bridge_fdb_entry *fdb, int type,
+			      void (*pending_fn)(unsigned long cookie),
+			      void (*done_fn)(unsigned long cookie, int err),
+			      unsigned long cookie,
+			      struct netlink_ext_ack *extack)
 {
 	struct switchdev_notifier_fdb_info item;
 
-	br_switchdev_fdb_populate(br, &item, fdb, NULL);
+	br_switchdev_fdb_populate(br, &item, fdb, NULL, pending_fn,
+				  done_fn, cookie);
 
 	switch (type) {
 	case RTM_DELNEIGH:
 		call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_DEVICE,
-					 item.info.dev, &item.info, NULL);
+					 item.info.dev, &item.info, extack);
 		break;
 	case RTM_NEWNEIGH:
 		call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_DEVICE,
-					 item.info.dev, &item.info, NULL);
+					 item.info.dev, &item.info, extack);
 		break;
 	}
 }
 
+void
+br_switchdev_fdb_notify(struct net_bridge *br,
+			const struct net_bridge_fdb_entry *fdb, int type)
+{
+	br_switchdev_fdb_notify_async(br, fdb, type, NULL, NULL,
+				      (unsigned long)NULL, NULL);
+}
+
 int br_switchdev_port_vlan_add(struct net_device *dev, u16 vid, u16 flags,
 			       struct netlink_ext_ack *extack)
 {
@@ -287,7 +306,8 @@ static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
 	struct switchdev_notifier_fdb_info item;
 	int err;
 
-	br_switchdev_fdb_populate(br, &item, fdb, ctx);
+	br_switchdev_fdb_populate(br, &item, fdb, ctx, NULL,
+				  NULL, (unsigned long)NULL);
 
 	err = nb->notifier_call(nb, action, &item);
 	return notifier_to_errno(err);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 12/15] net: rtnetlink: pass extack to .ndo_fdb_del
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (10 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 11/15] net: bridge: make fdb_add_entry() wait for switchdev feedback Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 13/15] net: bridge: wait for errors from switchdev when deleting FDB entries Vladimir Oltean
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

The .ndo_fdb_del method can already return an int error code; let's also
pass it the netlink extack, so that drivers can report detailed error
messages to user space.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c        | 4 ++--
 drivers/net/ethernet/mscc/ocelot_net.c           | 3 ++-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 5 +++--
 drivers/net/macvlan.c                            | 3 ++-
 drivers/net/vxlan.c                              | 3 ++-
 include/linux/netdevice.h                        | 6 ++++--
 net/bridge/br_fdb.c                              | 3 ++-
 net/bridge/br_private.h                          | 3 ++-
 net/core/rtnetlink.c                             | 4 ++--
 9 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 9ba22778011d..7e0741d62aae 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5434,12 +5434,12 @@ ice_fdb_add(struct ndmsg *ndm, struct nlattr __always_unused *tb[],
 static int
 ice_fdb_del(struct ndmsg *ndm, __always_unused struct nlattr *tb[],
 	    struct net_device *dev, const unsigned char *addr,
-	    __always_unused u16 vid)
+	    __always_unused u16 vid, struct netlink_ext_ack *extack)
 {
 	int err;
 
 	if (ndm->ndm_state & NUD_PERMANENT) {
-		netdev_err(dev, "FDB only supports static addresses\n");
+		NL_SET_ERR_MSG_MOD(extack, "FDB only supports static addresses");
 		return -EINVAL;
 	}
 
diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c
index eaeba60b1bba..8f53c9858cc1 100644
--- a/drivers/net/ethernet/mscc/ocelot_net.c
+++ b/drivers/net/ethernet/mscc/ocelot_net.c
@@ -664,7 +664,8 @@ static int ocelot_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 
 static int ocelot_port_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 			       struct net_device *dev,
-			       const unsigned char *addr, u16 vid)
+			       const unsigned char *addr, u16 vid,
+			       struct netlink_ext_ack *extack)
 {
 	struct ocelot_port_private *priv = netdev_priv(dev);
 	struct ocelot *ocelot = priv->port.ocelot;
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index ed84f0f97623..f276cc2c4351 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -367,8 +367,9 @@ static int qlcnic_set_mac(struct net_device *netdev, void *p)
 }
 
 static int qlcnic_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
-			struct net_device *netdev,
-			const unsigned char *addr, u16 vid)
+			  struct net_device *netdev,
+			  const unsigned char *addr, u16 vid,
+			  struct netlink_ext_ack *extack)
 {
 	struct qlcnic_adapter *adapter = netdev_priv(netdev);
 	int err = -EOPNOTSUPP;
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index d2f830ec2969..49377ef174c0 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1014,7 +1014,8 @@ static int macvlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 
 static int macvlan_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 			   struct net_device *dev,
-			   const unsigned char *addr, u16 vid)
+			   const unsigned char *addr, u16 vid,
+			   struct netlink_ext_ack *extack)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
 	int err = -EINVAL;
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 141635a35c28..45e872d4e052 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1342,7 +1342,8 @@ static int __vxlan_fdb_delete(struct vxlan_dev *vxlan,
 /* Delete entry (via netlink) */
 static int vxlan_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 			    struct net_device *dev,
-			    const unsigned char *addr, u16 vid)
+			    const unsigned char *addr, u16 vid,
+			    struct netlink_ext_ack *extack)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	union vxlan_addr ip;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3ec42495a43a..79284bdd4b6f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1245,7 +1245,8 @@ struct netdev_net_notifier {
  *	Adds an FDB entry to dev for addr.
  * int (*ndo_fdb_del)(struct ndmsg *ndm, struct nlattr *tb[],
  *		      struct net_device *dev,
- *		      const unsigned char *addr, u16 vid)
+ *		      const unsigned char *addr, u16 vid,
+ *		      struct netlink_ext_ack *extack);
  *	Deletes the FDB entry from dev coresponding to addr.
  * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb,
  *		       struct net_device *dev, struct net_device *filter_dev,
@@ -1501,7 +1502,8 @@ struct net_device_ops {
 					       struct nlattr *tb[],
 					       struct net_device *dev,
 					       const unsigned char *addr,
-					       u16 vid);
+					       u16 vid,
+					       struct netlink_ext_ack *extack);
 	int			(*ndo_fdb_dump)(struct sk_buff *skb,
 						struct netlink_callback *cb,
 						struct net_device *dev,
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e8afe64dadcc..ce49e5f914b1 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1243,7 +1243,8 @@ static int __br_fdb_delete(struct net_bridge *br,
 /* Remove neighbor entry with RTM_DELNEIGH */
 int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 		  struct net_device *dev,
-		  const unsigned char *addr, u16 vid)
+		  const unsigned char *addr, u16 vid,
+		  struct netlink_ext_ack *extack)
 {
 	struct net_bridge_vlan_group *vg;
 	struct net_bridge_port *p = NULL;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index f5f7501dad7d..6c663ccc346c 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -773,7 +773,8 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 		   const unsigned char *addr, u16 vid, unsigned long flags);
 
 int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
-		  struct net_device *dev, const unsigned char *addr, u16 vid);
+		  struct net_device *dev, const unsigned char *addr, u16 vid,
+		  struct netlink_ext_ack *extack);
 int br_fdb_add(struct ndmsg *nlh, struct nlattr *tb[], struct net_device *dev,
 	       const unsigned char *addr, u16 vid, u16 nlh_flags,
 	       struct netlink_ext_ack *extack);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 2af8aeeadadf..eed5eefe2bcd 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4152,7 +4152,7 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 		const struct net_device_ops *ops = br_dev->netdev_ops;
 
 		if (ops->ndo_fdb_del)
-			err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
+			err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid, extack);
 
 		if (err)
 			goto out;
@@ -4164,7 +4164,7 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (ndm->ndm_flags & NTF_SELF) {
 		if (dev->netdev_ops->ndo_fdb_del)
 			err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr,
-							   vid);
+							   vid, extack);
 		else
 			err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 13/15] net: bridge: wait for errors from switchdev when deleting FDB entries
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (11 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 12/15] net: rtnetlink: pass extack to .ndo_fdb_del Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 14/15] net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE Vladimir Oltean
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

Similar to the logic added for RTM_NEWNEIGH, we can make the bridge
RTM_DELNEIGH handler let switchdev veto an FDB entry deletion.

The strategy is:

- get hold of the FDB entry under &br->hash_lock
- notify switchdev of its deletion via the atomic notifier chain
- release the &br->hash_lock, wait for the response
- switchdev vetoes => error out
- switchdev agrees => attempt to get hold of the FDB entry again under
  &br->hash_lock, and this time delete it without notifying switchdev _or_
  rtnetlink (we've notified both already).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_fdb.c | 82 ++++++++++++++++++++++++---------------------
 1 file changed, 44 insertions(+), 38 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index ce49e5f914b1..5f9bef6e4472 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -407,7 +407,7 @@ static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
 }
 
 static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
-		       bool swdev_notify)
+		       bool notify_rtnl, bool swdev_notify)
 {
 	trace_fdb_delete(br, f);
 
@@ -417,7 +417,8 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f,
 	hlist_del_init_rcu(&f->fdb_node);
 	rhashtable_remove_fast(&br->fdb_hash_tbl, &f->rhnode,
 			       br_fdb_rht_params);
-	br_fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
+	if (notify_rtnl)
+		br_fdb_notify(br, f, RTM_DELNEIGH, swdev_notify);
 	call_rcu(&f->rcu, fdb_rcu_free);
 }
 
@@ -453,7 +454,7 @@ static void fdb_delete_local(struct net_bridge *br,
 		return;
 	}
 
-	fdb_delete(br, f, true);
+	fdb_delete(br, f, true, true);
 }
 
 void br_fdb_find_delete_local(struct net_bridge *br,
@@ -514,7 +515,7 @@ static int fdb_add_local(struct net_bridge *br, struct net_bridge_port *source,
 			return 0;
 		br_warn(br, "adding interface %s with same address as a received packet (addr:%pM, vlan:%u)\n",
 		       source ? source->dev->name : br->dev->name, addr, vid);
-		fdb_delete(br, fdb, true);
+		fdb_delete(br, fdb, true, true);
 	}
 
 	fdb = fdb_create(br, source, addr, vid,
@@ -639,7 +640,7 @@ void br_fdb_cleanup(struct work_struct *work)
 		} else {
 			spin_lock_bh(&br->hash_lock);
 			if (!hlist_unhashed(&f->fdb_node))
-				fdb_delete(br, f, true);
+				fdb_delete(br, f, true, true);
 			spin_unlock_bh(&br->hash_lock);
 		}
 	}
@@ -659,7 +660,7 @@ void br_fdb_flush(struct net_bridge *br)
 	spin_lock_bh(&br->hash_lock);
 	hlist_for_each_entry_safe(f, tmp, &br->fdb_list, fdb_node) {
 		if (!test_bit(BR_FDB_STATIC, &f->flags))
-			fdb_delete(br, f, true);
+			fdb_delete(br, f, true, true);
 	}
 	spin_unlock_bh(&br->hash_lock);
 }
@@ -691,7 +692,7 @@ void br_fdb_delete_by_port(struct net_bridge *br,
 		if (test_bit(BR_FDB_LOCAL, &f->flags))
 			fdb_delete_local(br, p, f);
 		else
-			fdb_delete(br, f, true);
+			fdb_delete(br, f, true, true);
 	}
 	spin_unlock_bh(&br->hash_lock);
 }
@@ -931,9 +932,10 @@ int br_fdb_get(struct sk_buff *skb,
 }
 
 /* Delete an FDB entry and don't notify switchdev */
-static void fdb_delete_by_addr_and_port(struct net_bridge *br,
-					const struct net_bridge_port *p,
-					const u8 *addr, u16 vlan)
+static int fdb_delete_by_addr_and_port(struct net_bridge *br,
+				       const struct net_bridge_port *p,
+				       const u8 *addr, u16 vlan,
+				       bool notify_rtnl)
 {
 	struct net_bridge_fdb_entry *fdb;
 
@@ -942,12 +944,14 @@ static void fdb_delete_by_addr_and_port(struct net_bridge *br,
 	fdb = br_fdb_find(br, addr, vlan);
 	if (!fdb || READ_ONCE(fdb->dst) != p) {
 		spin_unlock_bh(&br->hash_lock);
-		return;
+		return -ENOENT;
 	}
 
-	fdb_delete(br, fdb, false);
+	fdb_delete(br, fdb, notify_rtnl, false);
 
 	spin_unlock_bh(&br->hash_lock);
+
+	return 0;
 }
 
 /* returns true if the fdb is modified */
@@ -1079,7 +1083,7 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 
 	err = br_switchdev_fdb_wait(&wait_ctx);
 	if (err)
-		fdb_delete_by_addr_and_port(br, source, addr, vid);
+		fdb_delete_by_addr_and_port(br, source, addr, vid, true);
 
 	return err;
 }
@@ -1208,36 +1212,38 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 	return err;
 }
 
-/* Delete an FDB entry and notify switchdev.
- * Caller must hold &br->hash_lock.
- */
-static int
-fdb_delete_by_addr_and_port_switchdev(struct net_bridge *br,
-				      const struct net_bridge_port *p,
-				      const u8 *addr, u16 vlan)
+/* Delete an FDB entry and notify switchdev. */
+static int __br_fdb_delete(struct net_bridge *br,
+			   const struct net_bridge_port *p,
+			   const u8 *addr, u16 vlan,
+			   struct netlink_ext_ack *extack)
 {
+	struct br_switchdev_fdb_wait_ctx wait_ctx;
 	struct net_bridge_fdb_entry *fdb;
+	int err;
 
-	fdb = br_fdb_find(br, addr, vlan);
-	if (!fdb || READ_ONCE(fdb->dst) != p)
-		return -ENOENT;
+	br_switchdev_fdb_wait_ctx_init(&wait_ctx);
 
-	fdb_delete(br, fdb, true);
+	spin_lock_bh(&br->hash_lock);
 
-	return 0;
-}
+	fdb = br_fdb_find(br, addr, vlan);
+	if (!fdb || READ_ONCE(fdb->dst) != p) {
+		spin_unlock_bh(&br->hash_lock);
+		return -ENOENT;
+	}
 
-static int __br_fdb_delete(struct net_bridge *br,
-			   const struct net_bridge_port *p,
-			   const unsigned char *addr, u16 vid)
-{
-	int err;
+	br_fdb_notify_async(br, fdb, RTM_DELNEIGH, extack, &wait_ctx);
 
-	spin_lock_bh(&br->hash_lock);
-	err = fdb_delete_by_addr_and_port_switchdev(br, p, addr, vid);
 	spin_unlock_bh(&br->hash_lock);
 
-	return err;
+	err = br_switchdev_fdb_wait(&wait_ctx);
+	if (err)
+		return err;
+
+	/* We've notified rtnl and switchdev once, don't do it again,
+	 * just delete.
+	 */
+	return fdb_delete_by_addr_and_port(br, p, addr, vlan, false);
 }
 
 /* Remove neighbor entry with RTM_DELNEIGH */
@@ -1273,17 +1279,17 @@ int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 			return -EINVAL;
 		}
 
-		err = __br_fdb_delete(br, p, addr, vid);
+		err = __br_fdb_delete(br, p, addr, vid, extack);
 	} else {
 		err = -ENOENT;
-		err &= __br_fdb_delete(br, p, addr, 0);
+		err &= __br_fdb_delete(br, p, addr, 0, extack);
 		if (!vg || !vg->num_vlans)
 			return err;
 
 		list_for_each_entry(v, &vg->vlan_list, vlist) {
 			if (!br_vlan_should_use(v))
 				continue;
-			err &= __br_fdb_delete(br, p, addr, v->vid);
+			err &= __br_fdb_delete(br, p, addr, v->vid, extack);
 		}
 	}
 
@@ -1414,7 +1420,7 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 
 	fdb = br_fdb_find(br, addr, vid);
 	if (fdb && test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags))
-		fdb_delete(br, fdb, swdev_notify);
+		fdb_delete(br, fdb, true, swdev_notify);
 	else
 		err = -ENOENT;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 14/15] net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (12 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 13/15] net: bridge: wait for errors from switchdev when deleting FDB entries Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-25 22:24 ` [RFC PATCH net-next 15/15] net: dsa: propagate extack to .port_fdb_{add,del} Vladimir Oltean
  2021-10-26 10:40 ` [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Nikolay Aleksandrov
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

The strategy is to copy the entire struct switchdev_notifier_fdb_info to
our deferred work, since it contains pointers to the bridge's cookies,
which wait for us to report how offloading the FDB entry went.

Now that we have a full struct switchdev_notifier_fdb_info in the
deferred work, we can use that just fine to call dsa_fdb_offload_notify,
although that is not the primary purpose of this patch.

The cookies remain valid only until the point where we call
switchdev_fdb_mark_done(). That call wakes up the completion, and
therefore the process that called into the bridge, which may then free
its on-stack data structures, so drivers cannot safely touch their
struct switchdev_notifier_fdb_info copy after that point.
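
To illustrate the contract, here is a condensed sketch of the pattern a
driver is expected to follow ("foo" and all of its helpers are
hypothetical; the actual DSA conversion is in the diff below):

struct foo_fdb_work {
	struct work_struct work;
	struct switchdev_notifier_fdb_info fdb_info;
	struct foo_priv *priv;
};

static void foo_fdb_work_fn(struct work_struct *work)
{
	struct foo_fdb_work *w = container_of(work, struct foo_fdb_work, work);
	int err;

	/* w->fdb_info is a full copy of the notifier info, including the
	 * bridge's wait cookies.
	 */
	err = foo_program_fdb(w->priv, w->fdb_info.addr, w->fdb_info.vid);

	/* Report the verdict and wake up the waiter in the bridge;
	 * w->fdb_info must not be touched after this call.
	 */
	switchdev_fdb_mark_done(&w->fdb_info, err);
	kfree(w);
}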

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/dsa_priv.h |  6 +-----
 net/dsa/slave.c    | 42 +++++++++++++-----------------------------
 2 files changed, 14 insertions(+), 34 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index a5c9bc7b66c6..5d3f8291ec7f 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -131,11 +131,7 @@ struct dsa_switchdev_event_work {
 	struct net_device *dev;
 	struct work_struct work;
 	unsigned long event;
-	/* Specific for SWITCHDEV_FDB_ADD_TO_DEVICE and
-	 * SWITCHDEV_FDB_DEL_TO_DEVICE
-	 */
-	unsigned char addr[ETH_ALEN];
-	u16 vid;
+	struct switchdev_notifier_fdb_info fdb_info;
 	bool host_addr;
 };
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index af573d16dff5..1329e56e22ca 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2388,66 +2388,50 @@ static int dsa_slave_netdevice_event(struct notifier_block *nb,
 static void
 dsa_fdb_offload_notify(struct dsa_switchdev_event_work *switchdev_work)
 {
-	struct switchdev_notifier_fdb_info info = {};
+	struct switchdev_notifier_fdb_info *fdb_info = &switchdev_work->fdb_info;
 	struct dsa_switch *ds = switchdev_work->ds;
 	struct dsa_port *dp;
 
 	if (!dsa_is_user_port(ds, switchdev_work->port))
 		return;
 
-	ether_addr_copy(info.addr, switchdev_work->addr);
-	info.vid = switchdev_work->vid;
-	info.offloaded = true;
+	fdb_info->offloaded = true;
 	dp = dsa_to_port(ds, switchdev_work->port);
 	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
-				 dp->slave, &info.info, NULL);
+				 dp->slave, &fdb_info->info, NULL);
 }
 
 static void dsa_slave_switchdev_event_work(struct work_struct *work)
 {
+	struct netlink_ext_ack *extack;
 	struct dsa_switchdev_event_work *switchdev_work =
 		container_of(work, struct dsa_switchdev_event_work, work);
+	struct switchdev_notifier_fdb_info *fdb_info = &switchdev_work->fdb_info;
 	struct dsa_switch *ds = switchdev_work->ds;
 	struct dsa_port *dp;
 	int err;
 
 	dp = dsa_to_port(ds, switchdev_work->port);
+	extack = switchdev_notifier_info_to_extack(&fdb_info->info);
 
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 		if (switchdev_work->host_addr)
-			err = dsa_port_host_fdb_add(dp, switchdev_work->addr,
-						    switchdev_work->vid);
+			err = dsa_port_host_fdb_add(dp, fdb_info->addr, fdb_info->vid);
 		else
-			err = dsa_port_fdb_add(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		if (err) {
-			dev_err(ds->dev,
-				"port %d failed to add %pM vid %d to fdb: %d\n",
-				dp->index, switchdev_work->addr,
-				switchdev_work->vid, err);
-			break;
-		}
+			err = dsa_port_fdb_add(dp, fdb_info->addr, fdb_info->vid);
 		dsa_fdb_offload_notify(switchdev_work);
 		break;
 
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
 		if (switchdev_work->host_addr)
-			err = dsa_port_host_fdb_del(dp, switchdev_work->addr,
-						    switchdev_work->vid);
+			err = dsa_port_host_fdb_del(dp, fdb_info->addr, fdb_info->vid);
 		else
-			err = dsa_port_fdb_del(dp, switchdev_work->addr,
-					       switchdev_work->vid);
-		if (err) {
-			dev_err(ds->dev,
-				"port %d failed to delete %pM vid %d from fdb: %d\n",
-				dp->index, switchdev_work->addr,
-				switchdev_work->vid, err);
-		}
-
+			err = dsa_port_fdb_del(dp, fdb_info->addr, fdb_info->vid);
 		break;
 	}
 
+	switchdev_fdb_mark_done(fdb_info, err);
 	dev_put(switchdev_work->dev);
 	kfree(switchdev_work);
 }
@@ -2516,12 +2500,12 @@ static int dsa_slave_fdb_event(struct net_device *dev,
 	switchdev_work->event = event;
 	switchdev_work->dev = dev;
 
-	ether_addr_copy(switchdev_work->addr, fdb_info->addr);
-	switchdev_work->vid = fdb_info->vid;
+	memcpy(&switchdev_work->fdb_info, fdb_info, sizeof(*fdb_info));
 	switchdev_work->host_addr = host_addr;
 
 	/* Hold a reference for dsa_fdb_offload_notify */
 	dev_hold(dev);
+	switchdev_fdb_mark_pending(fdb_info);
 	dsa_schedule_work(&switchdev_work->work);
 
 	return 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH net-next 15/15] net: dsa: propagate extack to .port_fdb_{add,del}
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (13 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 14/15] net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE Vladimir Oltean
@ 2021-10-25 22:24 ` Vladimir Oltean
  2021-10-26 10:40 ` [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Nikolay Aleksandrov
  15 siblings, 0 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-25 22:24 UTC (permalink / raw)
  To: netdev, Roopa Prabhu, Nikolay Aleksandrov, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

Some drivers will need to veto FDB entries in certain circumstances.
Modify the driver-facing API to expose the netlink extack for extended
error message reporting to drivers.
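
As an illustration only (none of the drivers converted in this patch
emits an extack message yet; "foo" and its helpers are hypothetical),
a driver could now veto an entry like this:

static int foo_port_fdb_add(struct dsa_switch *ds, int port,
			    const unsigned char *addr, u16 vid,
			    struct netlink_ext_ack *extack)
{
	struct foo_priv *priv = ds->priv;

	if (foo_fdb_table_full(priv)) {
		/* Reported synchronously to "bridge fdb add" */
		NL_SET_ERR_MSG_MOD(extack, "FDB table is full");
		return -ENOSPC;
	}

	return foo_fdb_program(priv, port, addr, vid);
}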

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/b53/b53_common.c       |  6 ++++--
 drivers/net/dsa/b53/b53_priv.h         |  6 ++++--
 drivers/net/dsa/hirschmann/hellcreek.c |  6 ++++--
 drivers/net/dsa/lan9303-core.c         |  7 ++++---
 drivers/net/dsa/lantiq_gswip.c         |  6 ++++--
 drivers/net/dsa/microchip/ksz9477.c    |  6 ++++--
 drivers/net/dsa/mt7530.c               |  6 ++++--
 drivers/net/dsa/mv88e6xxx/chip.c       |  6 ++++--
 drivers/net/dsa/ocelot/felix.c         |  6 ++++--
 drivers/net/dsa/qca8k.c                |  6 ++++--
 drivers/net/dsa/sja1105/sja1105_main.c | 12 +++++++-----
 include/net/dsa.h                      |  6 ++++--
 net/dsa/dsa_priv.h                     |  9 +++++----
 net/dsa/port.c                         | 13 ++++++++-----
 net/dsa/slave.c                        | 12 ++++++++----
 net/dsa/switch.c                       | 22 ++++++++++++----------
 16 files changed, 84 insertions(+), 51 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index af4761968733..8657afd18791 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1704,7 +1704,8 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
 }
 
 int b53_fdb_add(struct dsa_switch *ds, int port,
-		const unsigned char *addr, u16 vid)
+		const unsigned char *addr, u16 vid,
+		struct netlink_ext_ack *extack)
 {
 	struct b53_device *priv = ds->priv;
 	int ret;
@@ -1724,7 +1725,8 @@ int b53_fdb_add(struct dsa_switch *ds, int port,
 EXPORT_SYMBOL(b53_fdb_add);
 
 int b53_fdb_del(struct dsa_switch *ds, int port,
-		const unsigned char *addr, u16 vid)
+		const unsigned char *addr, u16 vid,
+		struct netlink_ext_ack *extack)
 {
 	struct b53_device *priv = ds->priv;
 	int ret;
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 579da74ada64..8c4741f7a837 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -362,9 +362,11 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
 int b53_vlan_del(struct dsa_switch *ds, int port,
 		 const struct switchdev_obj_port_vlan *vlan);
 int b53_fdb_add(struct dsa_switch *ds, int port,
-		const unsigned char *addr, u16 vid);
+		const unsigned char *addr, u16 vid,
+		struct netlink_ext_ack *extack);
 int b53_fdb_del(struct dsa_switch *ds, int port,
-		const unsigned char *addr, u16 vid);
+		const unsigned char *addr, u16 vid,
+		struct netlink_ext_ack *extack);
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 		 dsa_fdb_dump_cb_t *cb, void *data);
 int b53_mdb_add(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
index 4e0b53d94b52..3e1fcc7e23f1 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.c
+++ b/drivers/net/dsa/hirschmann/hellcreek.c
@@ -823,7 +823,8 @@ static int hellcreek_fdb_get(struct hellcreek *hellcreek,
 }
 
 static int hellcreek_fdb_add(struct dsa_switch *ds, int port,
-			     const unsigned char *addr, u16 vid)
+			     const unsigned char *addr, u16 vid,
+			     struct netlink_ext_ack *extack)
 {
 	struct hellcreek_fdb_entry entry = { 0 };
 	struct hellcreek *hellcreek = ds->priv;
@@ -868,7 +869,8 @@ static int hellcreek_fdb_add(struct dsa_switch *ds, int port,
 }
 
 static int hellcreek_fdb_del(struct dsa_switch *ds, int port,
-			     const unsigned char *addr, u16 vid)
+			     const unsigned char *addr, u16 vid,
+			     struct netlink_ext_ack *extack)
 {
 	struct hellcreek_fdb_entry entry = { 0 };
 	struct hellcreek *hellcreek = ds->priv;
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 89f920289ae2..1255a28a12e4 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -1180,7 +1180,8 @@ static void lan9303_port_fast_age(struct dsa_switch *ds, int port)
 }
 
 static int lan9303_port_fdb_add(struct dsa_switch *ds, int port,
-				const unsigned char *addr, u16 vid)
+				const unsigned char *addr, u16 vid,
+				struct netlink_ext_ack *extack)
 {
 	struct lan9303 *chip = ds->priv;
 
@@ -1192,8 +1193,8 @@ static int lan9303_port_fdb_add(struct dsa_switch *ds, int port,
 }
 
 static int lan9303_port_fdb_del(struct dsa_switch *ds, int port,
-				const unsigned char *addr, u16 vid)
-
+				const unsigned char *addr, u16 vid,
+				struct netlink_ext_ack *extack)
 {
 	struct lan9303 *chip = ds->priv;
 
diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
index 7056d98d8177..83d125259897 100644
--- a/drivers/net/dsa/lantiq_gswip.c
+++ b/drivers/net/dsa/lantiq_gswip.c
@@ -1381,13 +1381,15 @@ static int gswip_port_fdb(struct dsa_switch *ds, int port,
 }
 
 static int gswip_port_fdb_add(struct dsa_switch *ds, int port,
-			      const unsigned char *addr, u16 vid)
+			      const unsigned char *addr, u16 vid,
+			      struct netlink_ext_ack *extack)
 {
 	return gswip_port_fdb(ds, port, addr, vid, true);
 }
 
 static int gswip_port_fdb_del(struct dsa_switch *ds, int port,
-			      const unsigned char *addr, u16 vid)
+			      const unsigned char *addr, u16 vid,
+			      struct netlink_ext_ack *extack)
 {
 	return gswip_port_fdb(ds, port, addr, vid, false);
 }
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 854e25f43fa7..e4cd36c25d76 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -583,7 +583,8 @@ static int ksz9477_port_vlan_del(struct dsa_switch *ds, int port,
 }
 
 static int ksz9477_port_fdb_add(struct dsa_switch *ds, int port,
-				const unsigned char *addr, u16 vid)
+				const unsigned char *addr, u16 vid,
+				struct netlink_ext_ack *extack)
 {
 	struct ksz_device *dev = ds->priv;
 	u32 alu_table[4];
@@ -640,7 +641,8 @@ static int ksz9477_port_fdb_add(struct dsa_switch *ds, int port,
 }
 
 static int ksz9477_port_fdb_del(struct dsa_switch *ds, int port,
-				const unsigned char *addr, u16 vid)
+				const unsigned char *addr, u16 vid,
+				struct netlink_ext_ack *extack)
 {
 	struct ksz_device *dev = ds->priv;
 	u32 alu_table[4];
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 9890672a206d..e6176078f5ad 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -1341,7 +1341,8 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
 
 static int
 mt7530_port_fdb_add(struct dsa_switch *ds, int port,
-		    const unsigned char *addr, u16 vid)
+		    const unsigned char *addr, u16 vid,
+		    struct netlink_ext_ack *extack)
 {
 	struct mt7530_priv *priv = ds->priv;
 	int ret;
@@ -1357,7 +1358,8 @@ mt7530_port_fdb_add(struct dsa_switch *ds, int port,
 
 static int
 mt7530_port_fdb_del(struct dsa_switch *ds, int port,
-		    const unsigned char *addr, u16 vid)
+		    const unsigned char *addr, u16 vid,
+		    struct netlink_ext_ack *extack)
 {
 	struct mt7530_priv *priv = ds->priv;
 	int ret;
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 14c678a9e41b..fbd22d1c479e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2299,7 +2299,8 @@ static int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port,
 }
 
 static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
-				  const unsigned char *addr, u16 vid)
+				  const unsigned char *addr, u16 vid,
+				  struct netlink_ext_ack *extack)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
@@ -2313,7 +2314,8 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 }
 
 static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
-				  const unsigned char *addr, u16 vid)
+				  const unsigned char *addr, u16 vid,
+				  struct netlink_ext_ack *extack)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index 83808e7dbdda..0a3d0cbd25e2 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -638,7 +638,8 @@ static int felix_fdb_dump(struct dsa_switch *ds, int port,
 }
 
 static int felix_fdb_add(struct dsa_switch *ds, int port,
-			 const unsigned char *addr, u16 vid)
+			 const unsigned char *addr, u16 vid,
+			 struct netlink_ext_ack *extack)
 {
 	struct ocelot *ocelot = ds->priv;
 
@@ -646,7 +647,8 @@ static int felix_fdb_add(struct dsa_switch *ds, int port,
 }
 
 static int felix_fdb_del(struct dsa_switch *ds, int port,
-			 const unsigned char *addr, u16 vid)
+			 const unsigned char *addr, u16 vid,
+			 struct netlink_ext_ack *extack)
 {
 	struct ocelot *ocelot = ds->priv;
 
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index ea7f12778922..26cf4b583c74 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -1850,7 +1850,8 @@ qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr,
 
 static int
 qca8k_port_fdb_add(struct dsa_switch *ds, int port,
-		   const unsigned char *addr, u16 vid)
+		   const unsigned char *addr, u16 vid,
+		   struct netlink_ext_ack *extack)
 {
 	struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
 	u16 port_mask = BIT(port);
@@ -1860,7 +1861,8 @@ qca8k_port_fdb_add(struct dsa_switch *ds, int port,
 
 static int
 qca8k_port_fdb_del(struct dsa_switch *ds, int port,
-		   const unsigned char *addr, u16 vid)
+		   const unsigned char *addr, u16 vid,
+		   struct netlink_ext_ack *extack)
 {
 	struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
 	u16 port_mask = BIT(port);
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index c343effe2e96..40a970874409 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1818,7 +1818,8 @@ int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
 }
 
 static int sja1105_fdb_add(struct dsa_switch *ds, int port,
-			   const unsigned char *addr, u16 vid)
+			   const unsigned char *addr, u16 vid,
+			   struct netlink_ext_ack *extack)
 {
 	struct sja1105_private *priv = ds->priv;
 
@@ -1826,7 +1827,8 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
 }
 
 static int sja1105_fdb_del(struct dsa_switch *ds, int port,
-			   const unsigned char *addr, u16 vid)
+			   const unsigned char *addr, u16 vid,
+			   struct netlink_ext_ack *extack)
 {
 	struct sja1105_private *priv = ds->priv;
 
@@ -1912,7 +1914,7 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 
 		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
 
-		rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid);
+		rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, NULL);
 		if (rc) {
 			dev_err(ds->dev,
 				"Failed to delete FDB entry %pM vid %lld: %pe\n",
@@ -1925,13 +1927,13 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 static int sja1105_mdb_add(struct dsa_switch *ds, int port,
 			   const struct switchdev_obj_port_mdb *mdb)
 {
-	return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid);
+	return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, NULL);
 }
 
 static int sja1105_mdb_del(struct dsa_switch *ds, int port,
 			   const struct switchdev_obj_port_mdb *mdb)
 {
-	return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid);
+	return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid, NULL);
 }
 
 /* Common function for unicast and broadcast flood configuration.
diff --git a/include/net/dsa.h b/include/net/dsa.h
index badd214f7470..ca333c11c7f4 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -782,9 +782,11 @@ struct dsa_switch_ops {
 	 * Forwarding database
 	 */
 	int	(*port_fdb_add)(struct dsa_switch *ds, int port,
-				const unsigned char *addr, u16 vid);
+				const unsigned char *addr, u16 vid,
+				struct netlink_ext_ack *extack);
 	int	(*port_fdb_del)(struct dsa_switch *ds, int port,
-				const unsigned char *addr, u16 vid);
+				const unsigned char *addr, u16 vid,
+				struct netlink_ext_ack *extack);
 	int	(*port_fdb_dump)(struct dsa_switch *ds, int port,
 				 dsa_fdb_dump_cb_t *cb, void *data);
 
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 5d3f8291ec7f..253a875f54cd 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -64,6 +64,7 @@ struct dsa_notifier_fdb_info {
 	int port;
 	const unsigned char *addr;
 	u16 vid;
+	struct netlink_ext_ack *extack;
 };
 
 /* DSA_NOTIFIER_MDB_* */
@@ -219,13 +220,13 @@ int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock);
 int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu,
 			bool targeted_match);
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid);
+		     u16 vid, struct netlink_ext_ack *extack);
 int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid);
+		     u16 vid, struct netlink_ext_ack *extack);
 int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-			  u16 vid);
+			  u16 vid, struct netlink_ext_ack *extack);
 int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-			  u16 vid);
+			  u16 vid, struct netlink_ext_ack *extack);
 int dsa_port_fdb_dump(struct dsa_port *dp, dsa_fdb_dump_cb_t *cb, void *data);
 int dsa_port_mdb_add(const struct dsa_port *dp,
 		     const struct switchdev_obj_port_mdb *mdb);
diff --git a/net/dsa/port.c b/net/dsa/port.c
index bf671306b560..444c7539c826 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -737,40 +737,42 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu,
 }
 
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid)
+		     u16 vid, struct netlink_ext_ack *extack)
 {
 	struct dsa_notifier_fdb_info info = {
 		.sw_index = dp->ds->index,
 		.port = dp->index,
 		.addr = addr,
 		.vid = vid,
+		.extack = extack,
 	};
 
 	return dsa_port_notify(dp, DSA_NOTIFIER_FDB_ADD, &info);
 }
 
 int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-		     u16 vid)
+		     u16 vid, struct netlink_ext_ack *extack)
 {
 	struct dsa_notifier_fdb_info info = {
 		.sw_index = dp->ds->index,
 		.port = dp->index,
 		.addr = addr,
 		.vid = vid,
-
+		.extack = extack,
 	};
 
 	return dsa_port_notify(dp, DSA_NOTIFIER_FDB_DEL, &info);
 }
 
 int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-			  u16 vid)
+			  u16 vid, struct netlink_ext_ack *extack)
 {
 	struct dsa_notifier_fdb_info info = {
 		.sw_index = dp->ds->index,
 		.port = dp->index,
 		.addr = addr,
 		.vid = vid,
+		.extack = extack,
 	};
 	struct dsa_port *cpu_dp = dp->cpu_dp;
 	int err;
@@ -783,13 +785,14 @@ int dsa_port_host_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 }
 
 int dsa_port_host_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-			  u16 vid)
+			  u16 vid, struct netlink_ext_ack *extack)
 {
 	struct dsa_notifier_fdb_info info = {
 		.sw_index = dp->ds->index,
 		.port = dp->index,
 		.addr = addr,
 		.vid = vid,
+		.extack = extack,
 	};
 	struct dsa_port *cpu_dp = dp->cpu_dp;
 	int err;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 1329e56e22ca..3e49feb81261 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2417,17 +2417,21 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work)
 	switch (switchdev_work->event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE:
 		if (switchdev_work->host_addr)
-			err = dsa_port_host_fdb_add(dp, fdb_info->addr, fdb_info->vid);
+			err = dsa_port_host_fdb_add(dp, fdb_info->addr,
+						    fdb_info->vid, extack);
 		else
-			err = dsa_port_fdb_add(dp, fdb_info->addr, fdb_info->vid);
+			err = dsa_port_fdb_add(dp, fdb_info->addr,
+					       fdb_info->vid, extack);
 		dsa_fdb_offload_notify(switchdev_work);
 		break;
 
 	case SWITCHDEV_FDB_DEL_TO_DEVICE:
 		if (switchdev_work->host_addr)
-			err = dsa_port_host_fdb_del(dp, fdb_info->addr, fdb_info->vid);
+			err = dsa_port_host_fdb_del(dp, fdb_info->addr,
+						    fdb_info->vid, extack);
 		else
-			err = dsa_port_fdb_del(dp, fdb_info->addr, fdb_info->vid);
+			err = dsa_port_fdb_del(dp, fdb_info->addr,
+					       fdb_info->vid, extack);
 		break;
 	}
 
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index bb155a16d454..9058882a282a 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -291,7 +291,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp,
 }
 
 static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
-			       u16 vid)
+			       u16 vid, struct netlink_ext_ack *extack)
 {
 	struct dsa_switch *ds = dp->ds;
 	struct dsa_mac_addr *a;
@@ -300,7 +300,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 
 	/* No need to bother with refcounting for user ports */
 	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_fdb_add(ds, port, addr, vid);
+		return ds->ops->port_fdb_add(ds, port, addr, vid, extack);
 
 	mutex_lock(&dp->addr_lists_lock);
 
@@ -316,7 +316,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 		goto out;
 	}
 
-	err = ds->ops->port_fdb_add(ds, port, addr, vid);
+	err = ds->ops->port_fdb_add(ds, port, addr, vid, extack);
 	if (err) {
 		kfree(a);
 		goto out;
@@ -334,7 +334,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 }
 
 static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
-			       u16 vid)
+			       u16 vid, struct netlink_ext_ack *extack)
 {
 	struct dsa_switch *ds = dp->ds;
 	struct dsa_mac_addr *a;
@@ -343,7 +343,7 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 
 	/* No need to bother with refcounting for user ports */
 	if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp)))
-		return ds->ops->port_fdb_del(ds, port, addr, vid);
+		return ds->ops->port_fdb_del(ds, port, addr, vid, extack);
 
 	mutex_lock(&dp->addr_lists_lock);
 
@@ -356,7 +356,7 @@ static int dsa_port_do_fdb_del(struct dsa_port *dp, const unsigned char *addr,
 	if (!refcount_dec_and_test(&a->refcount))
 		goto out;
 
-	err = ds->ops->port_fdb_del(ds, port, addr, vid);
+	err = ds->ops->port_fdb_del(ds, port, addr, vid, extack);
 	if (err) {
 		refcount_set(&a->refcount, 1);
 		goto out;
@@ -383,7 +383,8 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds,
 	dsa_switch_for_each_port(dp, ds) {
 		if (dsa_port_host_address_match(dp, info->sw_index,
 						info->port)) {
-			err = dsa_port_do_fdb_add(dp, info->addr, info->vid);
+			err = dsa_port_do_fdb_add(dp, info->addr, info->vid,
+						  info->extack);
 			if (err)
 				break;
 		}
@@ -404,7 +405,8 @@ static int dsa_switch_host_fdb_del(struct dsa_switch *ds,
 	dsa_switch_for_each_port(dp, ds) {
 		if (dsa_port_host_address_match(dp, info->sw_index,
 						info->port)) {
-			err = dsa_port_do_fdb_del(dp, info->addr, info->vid);
+			err = dsa_port_do_fdb_del(dp, info->addr, info->vid,
+						  info->extack);
 			if (err)
 				break;
 		}
@@ -422,7 +424,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds,
 	if (!ds->ops->port_fdb_add)
 		return -EOPNOTSUPP;
 
-	return dsa_port_do_fdb_add(dp, info->addr, info->vid);
+	return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->extack);
 }
 
 static int dsa_switch_fdb_del(struct dsa_switch *ds,
@@ -434,7 +436,7 @@ static int dsa_switch_fdb_del(struct dsa_switch *ds,
 	if (!ds->ops->port_fdb_del)
 		return -EOPNOTSUPP;
 
-	return dsa_port_do_fdb_del(dp, info->addr, info->vid);
+	return dsa_port_do_fdb_del(dp, info->addr, info->vid, info->extack);
 }
 
 static int dsa_switch_hsr_join(struct dsa_switch *ds,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
                   ` (14 preceding siblings ...)
  2021-10-25 22:24 ` [RFC PATCH net-next 15/15] net: dsa: propagate extack to .port_fdb_{add,del} Vladimir Oltean
@ 2021-10-26 10:40 ` Nikolay Aleksandrov
  2021-10-26 11:25   ` Vladimir Oltean
  15 siblings, 1 reply; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-26 10:40 UTC (permalink / raw)
  To: Vladimir Oltean, netdev, Roopa Prabhu, Ido Schimmel
  Cc: Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

On 26/10/2021 01:24, Vladimir Oltean wrote:
> Hello, this is me bombarding the list with switchdev FDB changes again.
> 
> This series attempts to address one design limitation in the interaction
> between the bridge and switchdev: error codes returned from the
> SWITCHDEV_FDB_ADD_TO_DEVICE and SWITCHDEV_FDB_DEL_TO_DEVICE handlers are
> completely ignored.
> 
> There are multiple aspects to that. First of all, drivers have a portion
> that handles those switchdev events in atomic context, and a portion
> that handles them in a private deferred work context. Errors reported
> from both calling contexts are ignored by the bridge, and it is
> desirable to actually propagate both to user space.
> 
> Secondly, it is in fact okay that some switchdev errors are ignored.
> The call graph for fdb_notify() is not simple, it looks something like
> this (not complete):
> 
> IFLA_BRPORT_FLUSH                                                              RTM_NEWNEIGH
>    |                                                                               |
>    | {br,nbp}_vlan_delete                 br_fdb_change_mac_address                v
>    |   |  |                                                  |     fast      __br_fdb_add
>    |   |  |  del_nbp, br_dev_delete       br_fdb_changeaddr  |     path         /  |  \
>    |   |  |      |                                        |  |    learning     /   |   \
>    \   |   -------------------- br_fdb_find_delete_local  |  |       |        /    |    \     switchdev event
>     \  |         |                                     |  |  |       |       /     |     \     listener
>      -------------------------- br_fdb_delete_by_port  |  |  |       |      /      |      \       |
>                                                  |  |  |  |  |       |     /       |       \      |
>                                                  |  |  |  |  |       |    /        |        \     |
>                                                  |  |  |  |  |    br_fdb_update    |        br_fdb_external_learn_add
>            (RTM_DELNEIGH)  br_fdb_delete         |  |  |  |  |       |             |              |
>                                      |           |  |  |  |  |       |             |              |    gc_work        netdevice
>                                      |           |  |  |  |  |       |      fdb_add_entry         |     timer          event
>                                      |           | fdb_delete_local  |             |              |        |          listener
>                          __br_fdb_delete         |  |                |             |              /  br_fdb_cleanup      |
>                                      |           |  |                |             |             /         |             |     br_stp_change_bridge_id
>                                      |           |  |                \             |            /          | br_fdb_changeaddr      |
>                                      |           |  |                 \            |           /           |     |                  |
>                      fdb_delete_by_addr_and_port |  | fdb_insert       \           |          /       ----/      | br_fdb_change_mac_address
>                                               |  |  |  |                \          |         /       /           |  |
>                    br_fdb_external_learn_del  |  |  |  | br_fdb_cleanup  \         |        /       /            |  | br_fdb_insert
>                                           |   |  |  |  |  |               \        |       /   ----/             |  | |
>                                           |   |  |  |  |  |                \       |      /   /                 fdb_insert
>                           br_fdb_flush    |   |  |  |  |  |                 \      |     /   /            --------/
>                                  \----    |   |  |  |  |  |                  \     |    /   /      ------/
>                                       \----------- fdb_delete --------------- fdb_notify ---------/
> 
> There's not a lot that the fast path learning can do about switchdev
> when that returns an error.
> 
> So this patch set mainly wants to deal with the 2 code paths that are
> triggered by these regular commands:
> 
> bridge fdb add dev swp0 00:01:02:03:04:05 master static # __br_fdb_add
> bridge fdb del dev swp0 00:01:02:03:04:05 master static # __br_fdb_delete
> 
> In some other, semi-related discussions, Ido Schimmel pointed out that
> it would be nice if user space got some feedback from the actual driver,
> and made some proposals about how that could be done.
> https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/
> One of the proposals was to call fdb_notify() from sleepable context,
> but Nikolay disliked the idea of introducing deferred work in the bridge
> driver (seems like nobody wants to deal with it).
> 
> And since all proposals of dealing with the deferred work inside
> switchdev were also shot down for valid reasons, we are basically left
> as a baseline with the code that we have today, with the deferred work
> being private to the driver, and somehow we must propagate an err and an
> extack from there.
> 
> So the approach taken here is to reorganize the code a bit and add some
> hooks in:
> (a) some callers of the fdb_notify() function to initialize a completion
>     structure
> (b) some drivers that catch SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE and mark
>     that completion structure as done
> (c) some bridge logic that I believe is fairly safe (I'm open to being
>     proven wrong) that temporarily drops the &br->hash_lock in order to
>     sleep until the completion is done.
> 
> There are some further optimizations that can be made. For example, we
> can avoid dropping the hash_lock if there is no switchdev response pending.
> And we can move some of that completion logic in br_switchdev.c such
> that it is compiled out on a CONFIG_NET_SWITCHDEV=n build. I haven't
> done those here, since they aren't exactly trivial. Mainly searching for
> high-level feedback first and foremost.
> 
> The structure of the patch series is:
> - patches 1-6 are me toying around with some code organization while I
>   was trying to understand the various call paths better. I like not
>   having forward declarations, but if they exist for a reason, I can
>   drop these patches.
> - patches 7-10 and 12 are some preparation work that can also be ignored.
> - patches 11 and 13 are where the meat of the series is.
> - patches 14 and 15 are DSA boilerplate so I could test what I'm doing.
> 

Hi,
Interesting way to work around the asynchronous notifiers. :) I went over
the patch-set and given that we'll have to support and maintain this fragile
solution (e.g. playing with locking, possible races with fdb changes etc) I'm
inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
with delayed learning from the fast-path to get a sleepable context where we can
use synchronous switchdev calls and get feedback immediately. That would be the
cleanest and most straight-forward solution, it'd be less error-prone and easier
to maintain long term. I plan to convert the bridge hash_lock to a mutex and then
you can do the synchronous switchdev change if you don't mind and agree of course.

By the way patches 1-6 can stand on their own, feel free to send them separately. 

Thanks,
 Nik


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 10:40 ` [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Nikolay Aleksandrov
@ 2021-10-26 11:25   ` Vladimir Oltean
  2021-10-26 12:20     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-26 11:25 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
> Hi,
> Interesting way to work around the asynchronous notifiers. :) I went over
> the patch-set and given that we'll have to support and maintain this fragile
> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
> with delayed learning from the fast-path to get a sleepable context where we can
> use synchronous switchdev calls and get feedback immediately.

Delayed learning means that we'll receive a sequence of packets like this:

            br0--------\
          /    \        \
         /      \        \
        /        \        \
     swp0         swp1    swp2
      |            |        |
   station A   station B  station C

station A sends request to B, station B sends reply to A.
Since the learning of station A's MAC SA races with the reply sent by
station B, it now becomes theoretically possible for the reply packet to
be flooded to station C as well, right? And that was not possible before
(at least assuming an ageing time longer than the round-trip time of these packets).

And that will happen regardless of whether switchdev is used or not.
I don't want to outright dismiss this (maybe I don't fully understand
this either), but it seems like a pretty heavy-handed change.

> That would be the
> cleanest and most straight-forward solution, it'd be less error-prone and easier
> to maintain long term. I plan to convert the bridge hash_lock to a mutex and then
> you can do the synchronous switchdev change if you don't mind and agree of course.

I agree that there are races and implications I haven't fully thought of,
with this temporary dropping of the br->hash_lock. It doesn't appear ideal.

For example,

/* Delete an FDB entry and notify switchdev. */
static int __br_fdb_delete(struct net_bridge *br,
			   const struct net_bridge_port *p,
			   const u8 *addr, u16 vlan,
			   struct netlink_ext_ack *extack)
{
	struct br_switchdev_fdb_wait_ctx wait_ctx;
	struct net_bridge_fdb_entry *fdb;
	int err;

	br_switchdev_fdb_wait_ctx_init(&wait_ctx);

	spin_lock_bh(&br->hash_lock);

	fdb = br_fdb_find(br, addr, vlan);
	if (!fdb || READ_ONCE(fdb->dst) != p) {
		spin_unlock_bh(&br->hash_lock);
		return -ENOENT;
	}

	br_fdb_notify_async(br, fdb, RTM_DELNEIGH, extack, &wait_ctx);

	spin_unlock_bh(&br->hash_lock);

	err = br_switchdev_fdb_wait(&wait_ctx); <- at this stage (more comments below)
	if (err)
		return err;

	/* We've notified rtnl and switchdev once, don't do it again,
	 * just delete.
	 */
	return fdb_delete_by_addr_and_port(br, p, addr, vlan, false);
}

the software FDB still contains the entry, while the hardware doesn't.
And we are no longer holding the lock, so somebody can either add or
delete that entry.

If somebody else tries to concurrently add that entry, it should not
notify switchdev again because it will see that the FDB entry exists,
and we should finally end up deleting it and result in a consistent
state.

If somebody else tries to concurrently delete that entry, it will
probably be from a code path that ignores errors (because the code paths
that don't are serialized by the rtnl_mutex). Switchdev will say "hey,
but I don't have this FDB entry, you've just deleted it", but that will
again be fine.

There seems to be a problem if somebody concurrently deletes that entry,
_and_then_ it gets added back again, all that before we call
fdb_delete_by_addr_and_port(). Because we don't notify switchdev the
second time around, we'll end up with an address in hardware but no
software counterpart.

I don't really know how to cleanly deal with that.

> By the way patches 1-6 can stand on their own, feel free to send them separately. 

Thanks, I will.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 11:25   ` Vladimir Oltean
@ 2021-10-26 12:20     ` Nikolay Aleksandrov
  2021-10-26 12:38       ` Ido Schimmel
  2021-10-26 16:54       ` Vladimir Oltean
  0 siblings, 2 replies; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-26 12:20 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On 26/10/2021 14:25, Vladimir Oltean wrote:
> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
>> Hi,
>> Interesting way to work around the asynchronous notifiers. :) I went over
>> the patch-set and given that we'll have to support and maintain this fragile
>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
>> with delayed learning from the fast-path to get a sleepable context where we can
>> use synchronous switchdev calls and get feedback immediately.
> 
> Delayed learning means that we'll receive a sequence of packets like this:
> 
>             br0--------\
>           /    \        \
>          /      \        \
>         /        \        \
>      swp0         swp1    swp2
>       |            |        |
>    station A   station B  station C
> 
> station A sends request to B, station B sends reply to A.
> Since the learning of station A's MAC SA races with the reply sent by
> station B, it now becomes theoretically possible for the reply packet to
> be flooded to station C as well, right? And that was not possible before
> (at least assuming an ageing time longer than the round-trip time of these packets).
> 
> And that will happen regardless of whether switchdev is used or not.
> I don't want to outright dismiss this (maybe I don't fully understand
> this either), but it seems like a pretty heavy-handed change.
> 

It will depend on lock contention; I plan to add a fast/uncontended case with
trylock from fast-path and if that fails then queue the fdb, but yes - in general
you are correct that the traffic could get flooded in the queue case before the delayed
learning processes the entry, it's a trade off if we want sleepable learning context.
Ido noted privately that's usually how hw acts anyway, also if people want guarantees
that the reply won't get flooded there are other methods to achieve that (ucast flood
disable, firewall rules etc). Today the reply could get flooded if the entry can't be programmed
as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
but still there haven't been any such guarantees. I think it's generally a good improvement and
will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
and reason about in the future.
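
Roughly along these lines for the deferred part (just a sketch to show
the idea; br_deferred_fdb, br->deferred_fdbs, br->deferred_learn_work
and br_fdb_learn_one are made-up names, not part of this series, and
the uncontended trylock shortcut, error handling and port refcounting
are omitted):

struct br_deferred_fdb {
	struct llist_node node;
	struct net_bridge_port *p;
	unsigned char addr[ETH_ALEN];
	u16 vid;
};

/* Fast path: never block in softirq context, just queue the entry */
static void br_fdb_defer_learn(struct net_bridge *br,
			       struct net_bridge_port *p,
			       const unsigned char *addr, u16 vid)
{
	struct br_deferred_fdb *d = kmalloc(sizeof(*d), GFP_ATOMIC);

	if (!d)
		return; /* best effort, like a failed fdb_create() today */

	ether_addr_copy(d->addr, addr);
	d->vid = vid;
	d->p = p;
	llist_add(&d->node, &br->deferred_fdbs);
	queue_work(system_long_wq, &br->deferred_learn_work);
}

static void br_deferred_learn_work(struct work_struct *work)
{
	struct net_bridge *br = container_of(work, struct net_bridge,
					     deferred_learn_work);
	struct llist_node *entries = llist_del_all(&br->deferred_fdbs);
	struct br_deferred_fdb *d, *tmp;

	/* Sleepable context: the (future) hash mutex can be taken here and
	 * switchdev can be called synchronously, with immediate feedback.
	 */
	llist_for_each_entry_safe(d, tmp, entries, node) {
		br_fdb_learn_one(br, d->p, d->addr, d->vid);
		kfree(d);
	}
}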

>> That would be the
>> cleanest and most straight-forward solution, it'd be less error-prone and easier
>> to maintain long term. I plan to convert the bridge hash_lock to a mutex and then
>> you can do the synchronous switchdev change if you don't mind and agree of course.
> 
> I agree that there are races and implications I haven't fully thought of,
> with this temporary dropping of the br->hash_lock. It doesn't appear ideal.
> 
> For example,
> 
> /* Delete an FDB entry and notify switchdev. */
> static int __br_fdb_delete(struct net_bridge *br,
> 			   const struct net_bridge_port *p,
> 			   const u8 *addr, u16 vlan,
> 			   struct netlink_ext_ack *extack)
> {
> 	struct br_switchdev_fdb_wait_ctx wait_ctx;
> 	struct net_bridge_fdb_entry *fdb;
> 	int err;
> 
> 	br_switchdev_fdb_wait_ctx_init(&wait_ctx);
> 
> 	spin_lock_bh(&br->hash_lock);
> 
> 	fdb = br_fdb_find(br, addr, vlan);
> 	if (!fdb || READ_ONCE(fdb->dst) != p) {
> 		spin_unlock_bh(&br->hash_lock);
> 		return -ENOENT;
> 	}
> 
> 	br_fdb_notify_async(br, fdb, RTM_DELNEIGH, extack, &wait_ctx);
> 
> 	spin_unlock_bh(&br->hash_lock);
> 
> 	err = br_switchdev_fdb_wait(&wait_ctx); <- at this stage (more comments below)
> 	if (err)
> 		return err;
> 
> 	/* We've notified rtnl and switchdev once, don't do it again,
> 	 * just delete.
> 	 */
> 	return fdb_delete_by_addr_and_port(br, p, addr, vlan, false);
> }
> 
> the software FDB still contains the entry, while the hardware doesn't.
> And we are no longer holding the lock, so somebody can either add or
> delete that entry.
> 
> If somebody else tries to concurrently add that entry, it should not
> notify switchdev again because it will see that the FDB entry exists,
> and we should finally end up deleting it and result in a consistent
> state.
> 
> If somebody else tries to concurrently delete that entry, it will
> probably be from a code path that ignores errors (because the code paths
> that don't are serialized by the rtnl_mutex). Switchdev will say "hey,
> but I don't have this FDB entry, you've just deleted it", but that will
> again be fine.
> 
> There seems to be a problem if somebody concurrently deletes that entry,
> _and_then_ it gets added back again, all that before we call
> fdb_delete_by_addr_and_port(). Because we don't notify switchdev the
> second time around, we'll end up with an address in hardware but no
> software counterpart.
> 
> I don't really know how to cleanly deal with that.
> 

Right, I'm sure new cases will come up and it won't be easy to reason about these
races when changes need to be made. I'd rather stick to a more straight-forward
and simpler approach.

>> By the way patches 1-6 can stand on their own, feel free to send them separately. 
> 
> Thanks, I will.
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 12:20     ` Nikolay Aleksandrov
@ 2021-10-26 12:38       ` Ido Schimmel
  2021-10-26 16:54       ` Vladimir Oltean
  1 sibling, 0 replies; 29+ messages in thread
From: Ido Schimmel @ 2021-10-26 12:38 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Vladimir Oltean, netdev, Roopa Prabhu, Ido Schimmel,
	Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
> On 26/10/2021 14:25, Vladimir Oltean wrote:
> > On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
> >> Hi,
> >> Interesting way to work around the asynchronous notifiers. :) I went over
> >> the patch-set and given that we'll have to support and maintain this fragile
> >> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
> >> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
> >> with delayed learning from the fast-path to get a sleepable context where we can
> >> use synchronous switchdev calls and get feedback immediately.
> > 
> > Delayed learning means that we'll receive a sequence of packets like this:
> > 
> >             br0--------\
> >           /    \        \
> >          /      \        \
> >         /        \        \
> >      swp0         swp1    swp2
> >       |            |        |
> >    station A   station B  station C
> > 
> > station A sends request to B, station B sends reply to A.
> > Since the learning of station A's MAC SA races with the reply sent by
> > station B, it now becomes theoretically possible for the reply packet to
> > be flooded to station C as well, right? And that was not possible before
> > (at least assuming an ageing time longer than the round-trip time of these packets).
> > 
> > And that will happen regardless of whether switchdev is used or not.
> > I don't want to outright dismiss this (maybe I don't fully understand
> > this either), but it seems like a pretty heavy-handed change.
> > 
> 
> It will depend on lock contention; I plan to add a fast/uncontended case with
> trylock from fast-path and if that fails then queue the fdb, but yes - in general
> you are correct that the traffic could get flooded in the queue case before the delayed
> learning processes the entry, it's a trade off if we want sleepable learning context.
> Ido noted privately that's usually how hw acts anyway, also if people want guarantees

To be clear, I was referring to Spectrum, where the hardware doesn't
learn automatically, but instead notifies the CPU about entries that can
be learned / aged. It is then up to the software to program the entries.

I don't know how it works in other devices, but I assume Spectrum is not
special in this regard.

> that the reply won't get flooded there are other methods to achieve that (ucast flood
> disable, firewall rules etc). Today the reply could get flooded if the entry can't be programmed
> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
> but still there haven't been any such guarantees. I think it's generally a good improvement and
> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
> and reason about in the future.

Yes, I agree. Moving processing away from the data path has
disadvantages (Vladimir's example), but it also has advantages, such as
synchronous feedback, improved performance (e.g., netlink notifications
no longer sent from the data path), etc.

> 
> >> That would be the
> >> cleanest and most straight-forward solution, it'd be less error-prone and easier
> >> to maintain long term. I plan to convert the bridge hash_lock to a mutex and then
> >> you can do the synchronous switchdev change if you don't mind and agree of course.
> > 
> > I agree that there are races and implications I haven't fully thought of,
> > with this temporary dropping of the br->hash_lock. It doesn't appear ideal.
> > 
> > For example,
> > 
> > /* Delete an FDB entry and notify switchdev. */
> > static int __br_fdb_delete(struct net_bridge *br,
> > 			   const struct net_bridge_port *p,
> > 			   const u8 *addr, u16 vlan,
> > 			   struct netlink_ext_ack *extack)
> > {
> > 	struct br_switchdev_fdb_wait_ctx wait_ctx;
> > 	struct net_bridge_fdb_entry *fdb;
> > 	int err;
> > 
> > 	br_switchdev_fdb_wait_ctx_init(&wait_ctx);
> > 
> > 	spin_lock_bh(&br->hash_lock);
> > 
> > 	fdb = br_fdb_find(br, addr, vlan);
> > 	if (!fdb || READ_ONCE(fdb->dst) != p) {
> > 		spin_unlock_bh(&br->hash_lock);
> > 		return -ENOENT;
> > 	}
> > 
> > 	br_fdb_notify_async(br, fdb, RTM_DELNEIGH, extack, &wait_ctx);
> > 
> > 	spin_unlock_bh(&br->hash_lock);
> > 
> > 	err = br_switchdev_fdb_wait(&wait_ctx); <- at this stage (more comments below)
> > 	if (err)
> > 		return err;
> > 
> > 	/* We've notified rtnl and switchdev once, don't do it again,
> > 	 * just delete.
> > 	 */
> > 	return fdb_delete_by_addr_and_port(br, p, addr, vlan, false);
> > }
> > 
> > the software FDB still contains the entry, while the hardware doesn't.
> > And we are no longer holding the lock, so somebody can either add or
> > delete that entry.
> > 
> > If somebody else tries to concurrently add that entry, it should not
> > notify switchdev again because it will see that the FDB entry exists,
> > and we should finally end up deleting it and result in a consistent
> > state.
> > 
> > If somebody else tries to concurrently delete that entry, it will
> > probably be from a code path that ignores errors (because the code paths
> > that don't are serialized by the rtnl_mutex). Switchdev will say "hey,
> > but I don't have this FDB entry, you've just deleted it", but that will
> > again be fine.
> > 
> > There seems to be a problem if somebody concurrently deletes that entry,
> > _and_then_ it gets added back again, all that before we call
> > fdb_delete_by_addr_and_port(). Because we don't notify switchdev the
> > second time around, we'll end up with an address in hardware but no
> > software counterpart.
> > 
> > I don't really know how to cleanly deal with that.
> > 
> 
> Right, I'm sure new cases will come up and it won't be easy to reason about these
> races when changes need to be made. I'd rather stick to a more straight-forward
> and simpler approach.

I think we should either convert to a mutex or stay with the current
behavior. Otherwise, the bridge is going to be really hard to maintain.

> 
> >> By the way patches 1-6 can stand on their own, feel free to send them separately. 
> > 
> > Thanks, I will.
> > 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 12:20     ` Nikolay Aleksandrov
  2021-10-26 12:38       ` Ido Schimmel
@ 2021-10-26 16:54       ` Vladimir Oltean
  2021-10-26 17:10         ` Nikolay Aleksandrov
  1 sibling, 1 reply; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-26 16:54 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
> On 26/10/2021 14:25, Vladimir Oltean wrote:
> > On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
> >> Hi,
> >> Interesting way to work around the asynchronous notifiers. :) I went over
> >> the patch-set and given that we'll have to support and maintain this fragile
> >> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
> >> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
> >> with delayed learning from the fast-path to get a sleepable context where we can
> >> use synchronous switchdev calls and get feedback immediately.
> > 
> > Delayed learning means that we'll receive a sequence of packets like this:
> > 
> >             br0--------\
> >           /    \        \
> >          /      \        \
> >         /        \        \
> >      swp0         swp1    swp2
> >       |            |        |
> >    station A   station B  station C
> > 
> > station A sends request to B, station B sends reply to A.
> > Since the learning of station A's MAC SA races with the reply sent by
> > station B, it now becomes theoretically possible for the reply packet to
> > be flooded to station C as well, right? And that was not possible before
> > (at least assuming an ageing time longer than the round-trip time of these packets).
> > 
> > And that will happen regardless of whether switchdev is used or not.
> > I don't want to outright dismiss this (maybe I don't fully understand
> > this either), but it seems like a pretty heavy-handed change.
> > 
> 
> It will depending on lock contention, I plan to add a fast/uncontended case with
> trylock from fast-path and if that fails then queue the fdb, but yes - in general

I wonder why mutex_trylock has this comment?

 * This function must not be used in interrupt context. The
 * mutex must be released by the same task that acquired it.

> you are correct that the traffic could get flooded in the queue case before the delayed
> learning processes the entry, it's a trade off if we want sleepable learning context.
> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
> that the reply won't get flooded there are other methods to achieve that (ucast flood
> disable, firewall rules etc).

Not all hardware is like that, the switches I'm working with, which
perform autonomous learning, all complete the learning process for a
frame strictly before they start the forwarding process. The software
bridge also behaves like that. My only concern is that we might start
building on top of some fundamental bridge changes like these, which
might risk a revert a few months down the line, when somebody notices
and comes with a use case where that is not acceptable.

> Today the reply could get flooded if the entry can't be programmed
> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
> but still there haven't been any such guarantees. I think it's generally a good improvement and
> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
> and reason about in the future.

You're the maintainer, I certainly won't complain if we go down this path.
It would be nice if br->lock can also be transformed into a mutex, it
would make all of switchdev much simpler.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 16:54       ` Vladimir Oltean
@ 2021-10-26 17:10         ` Nikolay Aleksandrov
  2021-10-26 19:01           ` Vladimir Oltean
  0 siblings, 1 reply; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-26 17:10 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On 26/10/2021 19:54, Vladimir Oltean wrote:
> On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
>> On 26/10/2021 14:25, Vladimir Oltean wrote:
>>> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
>>>> Hi,
>>>> Interesting way to work around the asynchronous notifiers. :) I went over
>>>> the patch-set and given that we'll have to support and maintain this fragile
>>>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
>>>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
>>>> with delayed learning from the fast-path to get a sleepable context where we can
>>>> use synchronous switchdev calls and get feedback immediately.
>>>
>>> Delayed learning means that we'll receive a sequence of packets like this:
>>>
>>>             br0--------\
>>>           /    \        \
>>>          /      \        \
>>>         /        \        \
>>>      swp0         swp1    swp2
>>>       |            |        |
>>>    station A   station B  station C
>>>
>>> station A sends request to B, station B sends reply to A.
>>> Since the learning of station A's MAC SA races with the reply sent by
>>> station B, it now becomes theoretically possible for the reply packet to
>>> be flooded to station C as well, right? And that was not possible before
>>> (at least assuming an ageing time longer than the round-trip time of these packets).
>>>
>>> And that will happen regardless of whether switchdev is used or not.
>>> I don't want to outright dismiss this (maybe I don't fully understand
>>> this either), but it seems like a pretty heavy-handed change.
>>>
>>
>> It will depending on lock contention, I plan to add a fast/uncontended case with
>> trylock from fast-path and if that fails then queue the fdb, but yes - in general
> 
> I wonder why mutex_trylock has this comment?
> 
>  * This function must not be used in interrupt context. The
>  * mutex must be released by the same task that acquired it.
> 
>> you are correct that the traffic could get flooded in the queue case before the delayed
>> learning processes the entry, it's a trade off if we want sleepable learning context.
>> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
>> that the reply won't get flooded there are other methods to achieve that (ucast flood
>> disable, firewall rules etc).
> 
> Not all hardware is like that, the switches I'm working with, which
> perform autonomous learning, all complete the learning process for a
> frame strictly before they start the forwarding process. The software
> bridge also behaves like that. My only concern is that we might start
> building on top of some fundamental bridge changes like these, which
> might risk a revert a few months down the line, when somebody notices
> and comes with a use case where that is not acceptable.
> 

I should've clarified I was talking about Spectrum as Ido did in a reply.

>> Today the reply could get flooded if the entry can't be programmed
>> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
>> but still there haven't been any such guarantees. I think it's generally a good improvement and
>> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
>> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
>> and reason about in the future.
> 
> You're the maintainer, I certainly won't complain if we go down this path.
> It would be nice if br->lock can also be transformed into a mutex, it
> would make all of switchdev much simpler.
> 

That is why we are discussing possible solutions, I don't want to force anything
but rather reach a consensus about the way forward. There are complexities involved with
moving to delayed learning too, one would be that the queue won't be a simple linked list
but probably a structure that allows fast lookups (e.g. rbtree) to avoid duplicating entries,
we also may end up doing two stage lookup if the main hash table doesn't find an entry
to close the above scenario's window as much as possible. But as I said I think that we can get
these correct and well defined, after that we'll be able to reason about future changes and
use cases easier. I'm still thinking about the various corner cases with delayed learning, so
any feedback is welcome. I'll start working on it in a few days and will get a clearer
view of the issues once I start converting the bridge, but having straight-forward locking
rules and known deterministic behaviour sounds like a better long term plan.
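
To make the rbtree idea a bit more concrete, a minimal sketch of what a
pending-learn entry and its insertion could look like (all names are invented,
nothing here is final):

/* One queued learn event, keyed by (MAC, VLAN) so that a burst of
 * packets from the same station doesn't queue duplicates.
 */
struct br_fdb_pending {
	struct rb_node		rbnode;
	struct rcu_head		rcu;	/* freed via kfree_rcu() by the work */
	struct net_bridge_port	*port;
	unsigned char		addr[ETH_ALEN];
	u16			vid;
};

static int br_fdb_pending_cmp(const struct br_fdb_pending *p,
			      const unsigned char *addr, u16 vid)
{
	int ret = memcmp(p->addr, addr, ETH_ALEN);

	return ret ? ret : (int)p->vid - (int)vid;
}

/* Called from the fast path under a dedicated spinlock; the deferred
 * work drains the tree later, in sleepable context, and does the real
 * fdb_create()/fdb_notify() under the hash mutex.
 */
static bool br_fdb_pending_insert(struct rb_root *root,
				  struct br_fdb_pending *new)
{
	struct rb_node **link = &root->rb_node, *parent = NULL;

	while (*link) {
		struct br_fdb_pending *p;
		int cmp;

		parent = *link;
		p = rb_entry(parent, struct br_fdb_pending, rbnode);
		cmp = br_fdb_pending_cmp(p, new->addr, new->vid);
		if (!cmp)
			return false;	/* already queued */
		link = cmp < 0 ? &parent->rb_right : &parent->rb_left;
	}

	rb_link_node(&new->rbnode, parent, link);
	rb_insert_color(&new->rbnode, root);
	return true;
}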


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 17:10         ` Nikolay Aleksandrov
@ 2021-10-26 19:01           ` Vladimir Oltean
  2021-10-26 19:56             ` Nikolay Aleksandrov
  0 siblings, 1 reply; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-26 19:01 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On Tue, Oct 26, 2021 at 08:10:45PM +0300, Nikolay Aleksandrov wrote:
> On 26/10/2021 19:54, Vladimir Oltean wrote:
> > On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
> >> On 26/10/2021 14:25, Vladimir Oltean wrote:
> >>> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
> >>>> Hi,
> >>>> Interesting way to work around the asynchronous notifiers. :) I went over
> >>>> the patch-set and given that we'll have to support and maintain this fragile
> >>>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
> >>>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
> >>>> with delayed learning from the fast-path to get a sleepable context where we can
> >>>> use synchronous switchdev calls and get feedback immediately.
> >>>
> >>> Delayed learning means that we'll receive a sequence of packets like this:
> >>>
> >>>             br0--------\
> >>>           /    \        \
> >>>          /      \        \
> >>>         /        \        \
> >>>      swp0         swp1    swp2
> >>>       |            |        |
> >>>    station A   station B  station C
> >>>
> >>> station A sends request to B, station B sends reply to A.
> >>> Since the learning of station A's MAC SA races with the reply sent by
> >>> station B, it now becomes theoretically possible for the reply packet to
> >>> be flooded to station C as well, right? And that was not possible before
> >>> (at least assuming an ageing time longer than the round-trip time of these packets).
> >>>
> >>> And that will happen regardless of whether switchdev is used or not.
> >>> I don't want to outright dismiss this (maybe I don't fully understand
> >>> this either), but it seems like a pretty heavy-handed change.
> >>>
> >>
> >> It will depending on lock contention, I plan to add a fast/uncontended case with
> >> trylock from fast-path and if that fails then queue the fdb, but yes - in general
> > 
> > I wonder why mutex_trylock has this comment?
> > 
> >  * This function must not be used in interrupt context. The
> >  * mutex must be released by the same task that acquired it.
> > 
> >> you are correct that the traffic could get flooded in the queue case before the delayed
> >> learning processes the entry, it's a trade off if we want sleepable learning context.
> >> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
> >> that the reply won't get flooded there are other methods to achieve that (ucast flood
> >> disable, firewall rules etc).
> > 
> > Not all hardware is like that, the switches I'm working with, which
> > perform autonomous learning, all complete the learning process for a
> > frame strictly before they start the forwarding process. The software
> > bridge also behaves like that. My only concern is that we might start
> > building on top of some fundamental bridge changes like these, which
> > might risk a revert a few months down the line, when somebody notices
> > and comes with a use case where that is not acceptable.
> > 
> 
> I should've clarified I was talking about Spectrum as Ido did in a reply.

Ido also said "I assume Spectrum is not special in this regard" and I
just wanted to say this such that we don't end with the wrong impression.
Special or not, to the software bridge it would be new behavior, which
I can only hope will be well received.

> >> Today the reply could get flooded if the entry can't be programmed
> >> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
> >> but still there haven't been any such guarantees. I think it's generally a good improvement and
> >> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
> >> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
> >> and reason about in the future.
> > 
> > You're the maintainer, I certainly won't complain if we go down this path.
> > It would be nice if br->lock can also be transformed into a mutex, it
> > would make all of switchdev much simpler.
> > 
> 
> That is why we are discussing possible solutions, I don't want to force anything
> but rather reach a consensus about the way forward. There are complexities involved with
> moving to delayed learning too, one would be that the queue won't be a simple linked list
> but probably a structure that allows fast lookups (e.g. rbtree) to avoid duplicating entries,
> we also may end up doing two stage lookup if the main hash table doesn't find an entry
> to close the above scenario's window as much as possible. But as I said I think that we can get
> these correct and well defined, after that we'll be able to reason about future changes and
> use cases easier. I'm still thinking about the various corner cases with delayed learning, so
> any feedback is welcome. I'll start working on it in a few days and will get a clearer
> view of the issues once I start converting the bridge, but having straight-forward locking
> rules and known deterministic behaviour sounds like a better long term plan.

Sorry, I might have to read the code, I don't want to misinterpret your
idea. With what you're describing here, I think that what you are trying
to achieve is to both have it our way, and preserve the current behavior
for the common case, where learning still happens from the fast path.
But I'm not sure that this is the correct goal, especially if you also
add "straightforward locking rules" to the mix.

I think you'd have to explain what is the purpose of the fast path trylock
logic you've mentioned above. So in the uncontended br->hash_lock case,
br_fdb_update() could take that mutex and then what? It would create and
add the FDB entry, and call fdb_notify(), but that still can't sleep.
So if switchdev drivers still need to privately defer the offloading
work, we still need some crazy completion-based mechanism to report
errors like the one I posted here, because in the general sense,
SWITCHDEV_FDB_ADD_TO_DEVICE will still be atomic.

And how do you queue a deletion, do you delete the FDB from the pending
and the main hash table, or do you just create a deletion command to be
processed in deferred context?

And since you'd be operating on the hash table concurrently from the
deferred work and from the fast path, doesn't this mean you'll need to
use some sort of spin_lock_bh from the deferred work, which again would
incur atomic context when you want to notify switchdev? Because with the
mutex_trylock described above, you'd gain exclusivity to the main hash
table, but if you lose, what you need is exclusivity to the pending hash
table. So the deferred context also needs to be atomic when it dequeues
the pending FDB entries and notifies them. I just don't see what we're
winning, how the rtnetlink functions will be any different for the better.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 19:01           ` Vladimir Oltean
@ 2021-10-26 19:56             ` Nikolay Aleksandrov
  2021-10-26 21:51               ` Vladimir Oltean
  0 siblings, 1 reply; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-26 19:56 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On 26/10/2021 22:01, Vladimir Oltean wrote:
> On Tue, Oct 26, 2021 at 08:10:45PM +0300, Nikolay Aleksandrov wrote:
>> On 26/10/2021 19:54, Vladimir Oltean wrote:
>>> On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
>>>> On 26/10/2021 14:25, Vladimir Oltean wrote:
>>>>> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
>>>>>> Hi,
>>>>>> Interesting way to work around the asynchronous notifiers. :) I went over
>>>>>> the patch-set and given that we'll have to support and maintain this fragile
>>>>>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
>>>>>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
>>>>>> with delayed learning from the fast-path to get a sleepable context where we can
>>>>>> use synchronous switchdev calls and get feedback immediately.
>>>>>
>>>>> Delayed learning means that we'll receive a sequence of packets like this:
>>>>>
>>>>>             br0--------\
>>>>>           /    \        \
>>>>>          /      \        \
>>>>>         /        \        \
>>>>>      swp0         swp1    swp2
>>>>>       |            |        |
>>>>>    station A   station B  station C
>>>>>
>>>>> station A sends request to B, station B sends reply to A.
>>>>> Since the learning of station A's MAC SA races with the reply sent by
>>>>> station B, it now becomes theoretically possible for the reply packet to
>>>>> be flooded to station C as well, right? And that was not possible before
>>>>> (at least assuming an ageing time longer than the round-trip time of these packets).
>>>>>
>>>>> And that will happen regardless of whether switchdev is used or not.
>>>>> I don't want to outright dismiss this (maybe I don't fully understand
>>>>> this either), but it seems like a pretty heavy-handed change.
>>>>>
>>>>
>>>> It will depending on lock contention, I plan to add a fast/uncontended case with
>>>> trylock from fast-path and if that fails then queue the fdb, but yes - in general
>>>
>>> I wonder why mutex_trylock has this comment?
>>>
>>>  * This function must not be used in interrupt context. The
>>>  * mutex must be released by the same task that acquired it.
>>>
>>>> you are correct that the traffic could get flooded in the queue case before the delayed
>>>> learning processes the entry, it's a trade off if we want sleepable learning context.
>>>> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
>>>> that the reply won't get flooded there are other methods to achieve that (ucast flood
>>>> disable, firewall rules etc).
>>>
>>> Not all hardware is like that, the switches I'm working with, which
>>> perform autonomous learning, all complete the learning process for a
>>> frame strictly before they start the forwarding process. The software
>>> bridge also behaves like that. My only concern is that we might start
>>> building on top of some fundamental bridge changes like these, which
>>> might risk a revert a few months down the line, when somebody notices
>>> and comes with a use case where that is not acceptable.
>>>
>>
>> I should've clarified I was talking about Spectrum as Ido did in a reply.
> 
> Ido also said "I assume Spectrum is not special in this regard" and I
> just wanted to say this such that we don't end with the wrong impression.
> Special or not, to the software bridge it would be new behavior, which
> I can only hope will be well received.
> 
>>>> Today the reply could get flooded if the entry can't be programmed
>>>> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
>>>> but still there haven't been any such guarantees. I think it's generally a good improvement and
>>>> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
>>>> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
>>>> and reason about in the future.
>>>
>>> You're the maintainer, I certainly won't complain if we go down this path.
>>> It would be nice if br->lock can also be transformed into a mutex, it
>>> would make all of switchdev much simpler.
>>>
>>
>> That is why we are discussing possible solutions, I don't want to force anything
>> but rather reach a consensus about the way forward. There are complexities involved with
>> moving to delayed learning too, one would be that the queue won't be a simple linked list
>> but probably a structure that allows fast lookups (e.g. rbtree) to avoid duplicating entries,
>> we also may end up doing two stage lookup if the main hash table doesn't find an entry
>> to close the above scenario's window as much as possible. But as I said I think that we can get
>> these correct and well defined, after that we'll be able to reason about future changes and
>> use cases easier. I'm still thinking about the various corner cases with delayed learning, so
>> any feedback is welcome. I'll start working on it in a few days and will get a clearer
>> view of the issues once I start converting the bridge, but having straight-forward locking
>> rules and known deterministic behaviour sounds like a better long term plan.
> 
> Sorry, I might have to read the code, I don't want to misinterpret your
> idea. With what you're describing here, I think that what you are trying
> to achieve is to both have it our way, and preserve the current behavior
> for the common case, where learning still happens from the fast path.
> But I'm not sure that this is the correct goal, especially if you also
> add "straightforward locking rules" to the mix.
> 

The trylock was only an optimization idea, but yes you'd need both synchronous
and asynchronous notifiers. I don't think you need sleepable context when
learning from softirq, who would you propagate the error to? It is important
when entries are being added from user-space, but if you'd like to have veto
option from softirq then we can scratch the trylock idea altogether.
Let's not focus on the trylock, it's a minor detail.

> I think you'd have to explain what is the purpose of the fast path trylock
> logic you've mentioned above. So in the uncontended br->hash_lock case,
> br_fdb_update() could take that mutex and then what? It would create and
> add the FDB entry, and call fdb_notify(), but that still can't sleep.
> So if switchdev drivers still need to privately defer the offloading
> work, we still need some crazy completion-based mechanism to report
> errors like the one I posted here, because in the general sense,
> SWITCHDEV_FDB_ADD_TO_DEVICE will still be atomic.

We do not if we have ADD_TO_DEVICE and ADD_TO_DEVICE_SYNC, but that optimization
is probably not worth the complexity of maintaining both so we can just drop the
trylock.

> 
> And how do you queue a deletion, do you delete the FDB from the pending
> and the main hash table, or do you just create a deletion command to be
> processed in deferred context?
> 

All commands which cannot take the mutex directly will be executed from a delayed
queue. We also need a delayed flush operation because we need to flush from STP code
and we can't sleep there due to the STP spinlock. The pending table can be considered
only if we decide to do a 2-stage lookup, it won't be used or consulted in any other
case, so user-space adds and deletes go only the main table.
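
For example, the STP flush could be queued as a deferred operation along these
lines (again just a sketch; the deferred-op structures and the fdb_deferred_*
fields are invented for illustration):

/* One deferred FDB operation, queued from contexts that cannot sleep
 * (STP under its spinlock, softirq learning) and executed later from a
 * work item that is allowed to take the hash mutex and call switchdev
 * synchronously.
 */
enum br_fdb_deferred_op_type {
	BR_FDB_DEFERRED_ADD,
	BR_FDB_DEFERRED_DEL,
	BR_FDB_DEFERRED_FLUSH_PORT,
};

struct br_fdb_deferred_op {
	struct list_head	list;
	enum br_fdb_deferred_op_type type;
	struct net_bridge_port	*port;		/* used by FLUSH_PORT */
	unsigned char		addr[ETH_ALEN];	/* used by ADD/DEL */
	u16			vid;
};

/* e.g. called from the STP code, which already runs with BH disabled */
static void br_fdb_queue_flush_port(struct net_bridge *br,
				    struct net_bridge_port *p)
{
	struct br_fdb_deferred_op *op = kzalloc(sizeof(*op), GFP_ATOMIC);

	if (!op)
		return;	/* best effort, same as today's atomic allocations */

	op->type = BR_FDB_DEFERRED_FLUSH_PORT;
	op->port = p;

	spin_lock(&br->fdb_deferred_lock);
	list_add_tail(&op->list, &br->fdb_deferred_list);
	spin_unlock(&br->fdb_deferred_lock);

	queue_work(system_long_wq, &br->fdb_deferred_work);
}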

> And since you'd be operating on the hash table concurrently from the
> deferred work and from the fast path, doesn't this mean you'll need to
> use some sort of spin_lock_bh from the deferred work, which again would
> incur atomic context when you want to notify switchdev? Because with the
> mutex_trylock described above, you'd gain exclusivity to the main hash
> table, but if you lose, what you need is exclusivity to the pending hash
> table. So the deferred context also needs to be atomic when it dequeues
> the pending FDB entries and notifies them. I just don't see what we're
> winning, how the rtnetlink functions will be any different for the better.
> 

The rbtree can be fully taken by the delayed action and swapped with a new one,
so no exclusivity needed for the notifications. It will take the spinlock, get
the whole tree and release the lock, same if it was a simple queue.
The spinlock for the rbtree for the delayed learning is necessary in all cases,
if we used the trylock fast learn then we could directly insert the entry if we win, but
again lets just always use delayed ops from atomic contexts as a start.

All entries from atomic contexts will be added to an rbtree which will be processed
from a delayed work, it will be freed by RCU so lookups can be done if we decide to
do a 2-stage lookup for the fast Rx path to reduce the flooding case window that you
described above.

We win having sleepable context for user-space calls and being able to do synchronous
calls to the drivers to wait for the errors.
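
So the end goal for the user-space path would be roughly (sketch only, assuming
hash_lock has become a mutex; the two helpers are placeholders, not an existing
API):

static int br_fdb_add_sync(struct net_bridge *br, struct net_bridge_port *p,
			   const unsigned char *addr, u16 vid,
			   struct netlink_ext_ack *extack)
{
	int err;

	mutex_lock(&br->hash_lock);

	/* may sleep: blocking switchdev call, drivers can veto with extack */
	err = br_switchdev_fdb_add_sync(p, addr, vid, extack);
	if (!err)
		err = br_fdb_add_sw_entry(br, p, addr, vid); /* software FDB */

	mutex_unlock(&br->hash_lock);

	return err;
}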


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 19:56             ` Nikolay Aleksandrov
@ 2021-10-26 21:51               ` Vladimir Oltean
  2021-10-26 22:27                 ` Nikolay Aleksandrov
  2021-10-27  7:24                 ` Ido Schimmel
  0 siblings, 2 replies; 29+ messages in thread
From: Vladimir Oltean @ 2021-10-26 21:51 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On Tue, Oct 26, 2021 at 10:56:59PM +0300, Nikolay Aleksandrov wrote:
> On 26/10/2021 22:01, Vladimir Oltean wrote:
> > On Tue, Oct 26, 2021 at 08:10:45PM +0300, Nikolay Aleksandrov wrote:
> >> On 26/10/2021 19:54, Vladimir Oltean wrote:
> >>> On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
> >>>> On 26/10/2021 14:25, Vladimir Oltean wrote:
> >>>>> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
> >>>>>> Hi,
> >>>>>> Interesting way to work around the asynchronous notifiers. :) I went over
> >>>>>> the patch-set and given that we'll have to support and maintain this fragile
> >>>>>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
> >>>>>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
> >>>>>> with delayed learning from the fast-path to get a sleepable context where we can
> >>>>>> use synchronous switchdev calls and get feedback immediately.
> >>>>>
> >>>>> Delayed learning means that we'll receive a sequence of packets like this:
> >>>>>
> >>>>>             br0--------\
> >>>>>           /    \        \
> >>>>>          /      \        \
> >>>>>         /        \        \
> >>>>>      swp0         swp1    swp2
> >>>>>       |            |        |
> >>>>>    station A   station B  station C
> >>>>>
> >>>>> station A sends request to B, station B sends reply to A.
> >>>>> Since the learning of station A's MAC SA races with the reply sent by
> >>>>> station B, it now becomes theoretically possible for the reply packet to
> >>>>> be flooded to station C as well, right? And that was not possible before
> >>>>> (at least assuming an ageing time longer than the round-trip time of these packets).
> >>>>>
> >>>>> And that will happen regardless of whether switchdev is used or not.
> >>>>> I don't want to outright dismiss this (maybe I don't fully understand
> >>>>> this either), but it seems like a pretty heavy-handed change.
> >>>>>
> >>>>
> >>>> It will depending on lock contention, I plan to add a fast/uncontended case with
> >>>> trylock from fast-path and if that fails then queue the fdb, but yes - in general
> >>>
> >>> I wonder why mutex_trylock has this comment?
> >>>
> >>>  * This function must not be used in interrupt context. The
> >>>  * mutex must be released by the same task that acquired it.
> >>>
> >>>> you are correct that the traffic could get flooded in the queue case before the delayed
> >>>> learning processes the entry, it's a trade off if we want sleepable learning context.
> >>>> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
> >>>> that the reply won't get flooded there are other methods to achieve that (ucast flood
> >>>> disable, firewall rules etc).
> >>>
> >>> Not all hardware is like that, the switches I'm working with, which
> >>> perform autonomous learning, all complete the learning process for a
> >>> frame strictly before they start the forwarding process. The software
> >>> bridge also behaves like that. My only concern is that we might start
> >>> building on top of some fundamental bridge changes like these, which
> >>> might risk a revert a few months down the line, when somebody notices
> >>> and comes with a use case where that is not acceptable.
> >>>
> >>
> >> I should've clarified I was talking about Spectrum as Ido did in a reply.
> > 
> > Ido also said "I assume Spectrum is not special in this regard" and I
> > just wanted to say this such that we don't end with the wrong impression.
> > Special or not, to the software bridge it would be new behavior, which
> > I can only hope will be well received.
> > 
> >>>> Today the reply could get flooded if the entry can't be programmed
> >>>> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
> >>>> but still there haven't been any such guarantees. I think it's generally a good improvement and
> >>>> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
> >>>> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
> >>>> and reason about in the future.
> >>>
> >>> You're the maintainer, I certainly won't complain if we go down this path.
> >>> It would be nice if br->lock can also be transformed into a mutex, it
> >>> would make all of switchdev much simpler.
> >>>
> >>
> >> That is why we are discussing possible solutions, I don't want to force anything
> >> but rather reach a consensus about the way forward. There are complexities involved with
> >> moving to delayed learning too, one would be that the queue won't be a simple linked list
> >> but probably a structure that allows fast lookups (e.g. rbtree) to avoid duplicating entries,
> >> we also may end up doing two stage lookup if the main hash table doesn't find an entry
> >> to close the above scenario's window as much as possible. But as I said I think that we can get
> >> these correct and well defined, after that we'll be able to reason about future changes and
> >> use cases easier. I'm still thinking about the various corner cases with delayed learning, so
> >> any feedback is welcome. I'll start working on it in a few days and will get a clearer
> >> view of the issues once I start converting the bridge, but having straight-forward locking
> >> rules and known deterministic behaviour sounds like a better long term plan.
> > 
> > Sorry, I might have to read the code, I don't want to misinterpret your
> > idea. With what you're describing here, I think that what you are trying
> > to achieve is to both have it our way, and preserve the current behavior
> > for the common case, where learning still happens from the fast path.
> > But I'm not sure that this is the correct goal, especially if you also
> > add "straightforward locking rules" to the mix.
> > 
> 
> The trylock was only an optimization idea, but yes you'd need both synchronous
> and asynchronous notifiers. I don't think you need sleepable context when
> learning from softirq, who would you propagate the error to? It is important
> when entries are being added from user-space, but if you'd like to have veto
> option from softirq then we can scratch the trylock idea altogether.

I'll let Ido answer here. As I said, the model I'm working with is that
of autonomous learning, so for me, no. Whereas the Spectrum model is
that of secure learning. I expect that it'd be pretty useless to set up
software assisted secure learning if you're just going to say yes and
learn all addresses anyway. I've never seen Spectrum documentation, but
I would be shocked if it wouldn't be able to be configured to operate in
the bare-bones autonomous learning mode too.

> Let's not focus on the trylock, it's a minor detail.
> 
> > I think you'd have to explain what is the purpose of the fast path trylock
> > logic you've mentioned above. So in the uncontended br->hash_lock case,
> > br_fdb_update() could take that mutex and then what? It would create and
> > add the FDB entry, and call fdb_notify(), but that still can't sleep.
> > So if switchdev drivers still need to privately defer the offloading
> > work, we still need some crazy completion-based mechanism to report
> > errors like the one I posted here, because in the general sense,
> > SWITCHDEV_FDB_ADD_TO_DEVICE will still be atomic.
> 
> We do not if we have ADD_TO_DEVICE and ADD_TO_DEVICE_SYNC,

Just when I was about to say that I'm happy to get rid of some of those
workqueues, and now you're telling me not only we might not get rid of
them, but we might also end up with a second, separate implementation :)

Anyway, let's not put the cart before the horses, let's see what the
realities of the bridge data path learning and STP flushing will teach
us about what can and can't be done.

> but that optimization is probably not worth the complexity of
> maintaining both so we can just drop the trylock.
> 
> > 
> > And how do you queue a deletion, do you delete the FDB from the pending
> > and the main hash table, or do you just create a deletion command to be
> > processed in deferred context?
> > 
> 
> All commands which cannot take the mutex directly will be executed from a delayed
> queue. We also need a delayed flush operation because we need to flush from STP code
> and we can't sleep there due to the STP spinlock. The pending table can be considered
> only if we decide to do a 2-stage lookup, it won't be used or consulted in any other
> case, so user-space adds and deletes go only the main table.
> 
> > And since you'd be operating on the hash table concurrently from the
> > deferred work and from the fast path, doesn't this mean you'll need to
> > use some sort of spin_lock_bh from the deferred work, which again would
> > incur atomic context when you want to notify switchdev? Because with the
> > mutex_trylock described above, you'd gain exclusivity to the main hash
> > table, but if you lose, what you need is exclusivity to the pending hash
> > table. So the deferred context also needs to be atomic when it dequeues
> > the pending FDB entries and notifies them. I just don't see what we're
> > winning, how the rtnetlink functions will be any different for the better.
> > 
> 
> The rbtree can be fully taken by the delayed action and swapped with a new one,
> so no exclusivity needed for the notifications. It will take the spinlock, get
> the whole tree and release the lock, same if it was a simple queue.

I'm quite unfamiliar with this technique, atomically swapping a queue
pointer with a new empty one, and emptying that queue while unlocked.
Do you have any reference implementation for this kind of technique?

> The spinlock for the rbtree for the delayed learning is necessary in all cases,

"in all cases" means "regardless of whether we try from the fast path to
make fdb_create() insert directly into &br->fdb_hash_tbl, or if we
insert into a temporary rbtree for pending entries"? I don't understand
this: why would you take the rbtree spinlock if you've inserted into the
main hash table already?

I'm only concerned about spin locks we'd have to hold while calling
fdb_notify().

> if we used the trylock fast learn then we could directly insert the entry if we win, but
> again lets just always use delayed ops from atomic contexts as a start.
> 
> All entries from atomic contexts will be added to an rbtree which will be processed
> from a delayed work, it will be freed by RCU so lookups can be done if we decide to
> do a 2-stage lookup for the fast Rx path to reduce the flooding case window that you
> described above.
> 
> We win having sleepable context for user-space calls and being able to do synchronous
> calls to the drivers to wait for the errors.

I think I'd really have to see the code at this point, sorry.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 21:51               ` Vladimir Oltean
@ 2021-10-26 22:27                 ` Nikolay Aleksandrov
  2021-10-27  9:20                   ` Nikolay Aleksandrov
  2021-10-27  7:24                 ` Ido Schimmel
  1 sibling, 1 reply; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-26 22:27 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On 27/10/2021 00:51, Vladimir Oltean wrote:
> On Tue, Oct 26, 2021 at 10:56:59PM +0300, Nikolay Aleksandrov wrote:
>> On 26/10/2021 22:01, Vladimir Oltean wrote:
>>> On Tue, Oct 26, 2021 at 08:10:45PM +0300, Nikolay Aleksandrov wrote:
>>>> On 26/10/2021 19:54, Vladimir Oltean wrote:
>>>>> On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
>>>>>> On 26/10/2021 14:25, Vladimir Oltean wrote:
>>>>>>> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
>>>>>>>> Hi,
>>>>>>>> Interesting way to work around the asynchronous notifiers. :) I went over
>>>>>>>> the patch-set and given that we'll have to support and maintain this fragile
>>>>>>>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
>>>>>>>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
>>>>>>>> with delayed learning from the fast-path to get a sleepable context where we can
>>>>>>>> use synchronous switchdev calls and get feedback immediately.
>>>>>>>
>>>>>>> Delayed learning means that we'll receive a sequence of packets like this:
>>>>>>>
>>>>>>>             br0--------\
>>>>>>>           /    \        \
>>>>>>>          /      \        \
>>>>>>>         /        \        \
>>>>>>>      swp0         swp1    swp2
>>>>>>>       |            |        |
>>>>>>>    station A   station B  station C
>>>>>>>
>>>>>>> station A sends request to B, station B sends reply to A.
>>>>>>> Since the learning of station A's MAC SA races with the reply sent by
>>>>>>> station B, it now becomes theoretically possible for the reply packet to
>>>>>>> be flooded to station C as well, right? And that was not possible before
>>>>>>> (at least assuming an ageing time longer than the round-trip time of these packets).
>>>>>>>
>>>>>>> And that will happen regardless of whether switchdev is used or not.
>>>>>>> I don't want to outright dismiss this (maybe I don't fully understand
>>>>>>> this either), but it seems like a pretty heavy-handed change.
>>>>>>>
>>>>>>
>>>>>> It will depending on lock contention, I plan to add a fast/uncontended case with
>>>>>> trylock from fast-path and if that fails then queue the fdb, but yes - in general
>>>>>
>>>>> I wonder why mutex_trylock has this comment?
>>>>>
>>>>>  * This function must not be used in interrupt context. The
>>>>>  * mutex must be released by the same task that acquired it.
>>>>>
>>>>>> you are correct that the traffic could get flooded in the queue case before the delayed
>>>>>> learning processes the entry, it's a trade off if we want sleepable learning context.
>>>>>> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
>>>>>> that the reply won't get flooded there are other methods to achieve that (ucast flood
>>>>>> disable, firewall rules etc).
>>>>>
>>>>> Not all hardware is like that, the switches I'm working with, which
>>>>> perform autonomous learning, all complete the learning process for a
>>>>> frame strictly before they start the forwarding process. The software
>>>>> bridge also behaves like that. My only concern is that we might start
>>>>> building on top of some fundamental bridge changes like these, which
>>>>> might risk a revert a few months down the line, when somebody notices
>>>>> and comes with a use case where that is not acceptable.
>>>>>
>>>>
>>>> I should've clarified I was talking about Spectrum as Ido did in a reply.
>>>
>>> Ido also said "I assume Spectrum is not special in this regard" and I
>>> just wanted to say this such that we don't end with the wrong impression.
>>> Special or not, to the software bridge it would be new behavior, which
>>> I can only hope will be well received.
>>>
>>>>>> Today the reply could get flooded if the entry can't be programmed
>>>>>> as well, e.g. the atomic allocation might fail and we'll flood it again, granted it's much less likely
>>>>>> but still there haven't been any such guarantees. I think it's generally a good improvement and
>>>>>> will simplify a lot of processing complexity. We can bite the bullet and get the underlying delayed
>>>>>> infrastructure correct once now, then the locking rules and other use cases would be easier to enforce
>>>>>> and reason about in the future.
>>>>>
>>>>> You're the maintainer, I certainly won't complain if we go down this path.
>>>>> It would be nice if br->lock can also be transformed into a mutex, it
>>>>> would make all of switchdev much simpler.
>>>>>
>>>>
>>>> That is why we are discussing possible solutions, I don't want to force anything
>>>> but rather reach a consensus about the way forward. There are complexities involved with
>>>> moving to delayed learning too, one would be that the queue won't be a simple linked list
>>>> but probably a structure that allows fast lookups (e.g. rbtree) to avoid duplicating entries,
>>>> we also may end up doing two stage lookup if the main hash table doesn't find an entry
>>>> to close the above scenario's window as much as possible. But as I said I think that we can get
>>>> these correct and well defined, after that we'll be able to reason about future changes and
>>>> use cases easier. I'm still thinking about the various corner cases with delayed learning, so
>>>> any feedback is welcome. I'll start working on it in a few days and will get a clearer
>>>> view of the issues once I start converting the bridge, but having straight-forward locking
>>>> rules and known deterministic behaviour sounds like a better long term plan.
>>>
>>> Sorry, I might have to read the code, I don't want to misinterpret your
>>> idea. With what you're describing here, I think that what you are trying
>>> to achieve is to both have it our way, and preserve the current behavior
>>> for the common case, where learning still happens from the fast path.
>>> But I'm not sure that this is the correct goal, especially if you also
>>> add "straightforward locking rules" to the mix.
>>>
>>
>> The trylock was only an optimization idea, but yes you'd need both synchronous
>> and asynchronous notifiers. I don't think you need sleepable context when
>> learning from softirq, who would you propagate the error to? It is important
>> when entries are being added from user-space, but if you'd like to have veto
>> option from softirq then we can scratch the trylock idea altogether.
> 
> I'll let Ido answer here. As I said, the model I'm working with is that
> of autonomous learning, so for me, no. Whereas the Spectrum model is
> that of secure learning. I expect that it'd be pretty useless to set up
> software assisted secure learning if you're just going to say yes and
> learn all addresses anyway. I've never seen Spectrum documentation, but
> I would be shocked if it wouldn't be able to be configured to operate in
> the bare-bones autonomous learning mode too.
> 

Ack, got it.

>> Let's not focus on the trylock, it's a minor detail.
>>
>>> I think you'd have to explain what is the purpose of the fast path trylock
>>> logic you've mentioned above. So in the uncontended br->hash_lock case,
>>> br_fdb_update() could take that mutex and then what? It would create and
>>> add the FDB entry, and call fdb_notify(), but that still can't sleep.
>>> So if switchdev drivers still need to privately defer the offloading
>>> work, we still need some crazy completion-based mechanism to report
>>> errors like the one I posted here, because in the general sense,
>>> SWITCHDEV_FDB_ADD_TO_DEVICE will still be atomic.
>>
>> We do not if we have ADD_TO_DEVICE and ADD_TO_DEVICE_SYNC,
> 
> Just when I was about to say that I'm happy to get rid of some of those
> workqueues, and now you're telling me not only we might not get rid of
> them, but we might also end up with a second, separate implementation :)
> 
> Anyway, let's not put the cart before the horses, let's see what the
> realities of the bridge data path learning and STP flushing will teach
> us about what can and can't be done.
> 

We will get rid of some workqueues (I hope). I was saying that if we do the trylock we
might have to add both sync and async variants; otherwise we just need a single sync one.

>> but that optimization is probably not worth the complexity of
>> maintaining both so we can just drop the trylock.
>>
>>>
>>> And how do you queue a deletion, do you delete the FDB from the pending
>>> and the main hash table, or do you just create a deletion command to be
>>> processed in deferred context?
>>>
>>
>> All commands which cannot take the mutex directly will be executed from a delayed
>> queue. We also need a delayed flush operation because we need to flush from STP code
>> and we can't sleep there due to the STP spinlock. The pending table can be considered
>> only if we decide to do a 2-stage lookup, it won't be used or consulted in any other
>> case, so user-space adds and deletes go only the main table.
>>
>>> And since you'd be operating on the hash table concurrently from the
>>> deferred work and from the fast path, doesn't this mean you'll need to
>>> use some sort of spin_lock_bh from the deferred work, which again would
>>> incur atomic context when you want to notify switchdev? Because with the
>>> mutex_trylock described above, you'd gain exclusivity to the main hash
>>> table, but if you lose, what you need is exclusivity to the pending hash
>>> table. So the deferred context also needs to be atomic when it dequeues
>>> the pending FDB entries and notifies them. I just don't see what we're
>>> winning, how the rtnetlink functions will be any different for the better.
>>>
>>
>> The rbtree can be fully taken by the delayed action and swapped with a new one,
>> so no exclusivity needed for the notifications. It will take the spinlock, get
>> the whole tree and release the lock, same if it was a simple queue.
> 
> I'm quite unfamiliar with this technique, atomically swapping a queue
> pointer with a new empty one, and emptying that queue while unlocked.
> Do you have any reference implementation for this kind of technique?
> 

the delayed work would be doing something similar to:

spin_lock(delay_lock);
record tree root
rcu change tree root with empty
spin_unlock(delay_lock);

mutex_lock(br->hash_lock);
process recorded (old) tree and free items via rcu
mutex_unlock(br->hash_lock);

We have the same pattern with queues all around the kernel.
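
In kernel C that would look roughly like the sketch below, reusing the
pending-entry structure sketched earlier in the thread (the fdb_pending_*
fields are invented and the actual fdb_create()/fdb_notify() work is elided):

static void br_fdb_pending_work_fn(struct work_struct *work)
{
	struct net_bridge *br = container_of(work, struct net_bridge,
					     fdb_pending_work);
	struct br_fdb_pending *p, *tmp;
	struct rb_root pending;

	/* grab the whole tree and give the fast path a fresh, empty one */
	spin_lock_bh(&br->fdb_pending_lock);
	pending = br->fdb_pending;
	br->fdb_pending = RB_ROOT;
	spin_unlock_bh(&br->fdb_pending_lock);

	/* sleepable from here on: synchronous switchdev calls are fine */
	mutex_lock(&br->hash_lock);
	rbtree_postorder_for_each_entry_safe(p, tmp, &pending, rbnode) {
		/* ... create the real fdb entry, notify switchdev ... */
		kfree_rcu(p, rcu);	/* RCU-delayed free, in case a 2-stage
					 * lookup is still walking the old tree
					 */
	}
	mutex_unlock(&br->hash_lock);
}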

>> The spinlock for the rbtree for the delayed learning is necessary in all cases,
> 
> "in all cases" means "regardless of whether we try from the fast path to
> make fdb_create() insert directly into &br->fdb_hash_tbl, or if we
> insert into a temporary rbtree for pending entries"? I don't understand
> this: why would you take the rbtree spinlock if you've inserted into the
> main hash table already?
> 

No, it means that we need the spinlock to protect the delayed queue.

> I'm only concerned about spin locks we'd have to hold while calling
> fdb_notify().
> 

There won't be any spinlocks held around fdb_notify(); we'll be doing all notifications
from sleepable context with the mutex held. That's the goal, at least.

>> if we used the trylock fast learn then we could directly insert the entry if we win, but
>> again lets just always use delayed ops from atomic contexts as a start.
>>
>> All entries from atomic contexts will be added to an rbtree which will be processed
>> from a delayed work, it will be freed by RCU so lookups can be done if we decide to
>> do a 2-stage lookup for the fast Rx path to reduce the flooding case window that you
>> described above.
>>
>> We win having sleepable context for user-space calls and being able to do synchronous
>> calls to the drivers to wait for the errors.
> 
> I think I'd really have to see the code at this point, sorry.
> 

Sure, I'll prepare an RFC version this week.

Thanks for the comments.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 21:51               ` Vladimir Oltean
  2021-10-26 22:27                 ` Nikolay Aleksandrov
@ 2021-10-27  7:24                 ` Ido Schimmel
  1 sibling, 0 replies; 29+ messages in thread
From: Ido Schimmel @ 2021-10-27  7:24 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Nikolay Aleksandrov, netdev, Roopa Prabhu, Ido Schimmel,
	Jakub Kicinski, David S. Miller, Andrew Lunn, Florian Fainelli,
	Vivien Didelot, Vladimir Oltean, Jiri Pirko

On Tue, Oct 26, 2021 at 09:51:54PM +0000, Vladimir Oltean wrote:
> I'll let Ido answer here. As I said, the model I'm working with is that
> of autonomous learning, so for me, no. Whereas the Spectrum model is
> that of secure learning. I expect that it'd be pretty useless to set up
> software assisted secure learning if you're just going to say yes and
> learn all addresses anyway. I've never seen Spectrum documentation, but
> I would be shocked if it wouldn't be able to be configured to operate in
> the bare-bones autonomous learning mode too.

Hi,

Yes, you are correct. It can support automatic learning, but it was
never enabled. We update the software bridge about learned FDB entries
(unlike DSA I think?), so secure learning makes sense in our case.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-26 22:27                 ` Nikolay Aleksandrov
@ 2021-10-27  9:20                   ` Nikolay Aleksandrov
  2021-10-27  9:36                     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-27  9:20 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

On 27/10/2021 01:27, Nikolay Aleksandrov wrote:
> On 27/10/2021 00:51, Vladimir Oltean wrote:
>> On Tue, Oct 26, 2021 at 10:56:59PM +0300, Nikolay Aleksandrov wrote:
>>> On 26/10/2021 22:01, Vladimir Oltean wrote:
>>>> On Tue, Oct 26, 2021 at 08:10:45PM +0300, Nikolay Aleksandrov wrote:
>>>>> On 26/10/2021 19:54, Vladimir Oltean wrote:
>>>>>> On Tue, Oct 26, 2021 at 03:20:03PM +0300, Nikolay Aleksandrov wrote:
>>>>>>> On 26/10/2021 14:25, Vladimir Oltean wrote:
>>>>>>>> On Tue, Oct 26, 2021 at 01:40:15PM +0300, Nikolay Aleksandrov wrote:
>>>>>>>>> Hi,
>>>>>>>>> Interesting way to work around the asynchronous notifiers. :) I went over
>>>>>>>>> the patch-set and given that we'll have to support and maintain this fragile
>>>>>>>>> solution (e.g. playing with locking, possible races with fdb changes etc) I'm
>>>>>>>>> inclined to go with Ido's previous proposition to convert the hash_lock into a mutex
>>>>>>>>> with delayed learning from the fast-path to get a sleepable context where we can
>>>>>>>>> use synchronous switchdev calls and get feedback immediately.
>>>>>>>>
>>>>>>>> Delayed learning means that we'll receive a sequence of packets like this:
>>>>>>>>
>>>>>>>>             br0--------\
>>>>>>>>           /    \        \
>>>>>>>>          /      \        \
>>>>>>>>         /        \        \
>>>>>>>>      swp0         swp1    swp2
>>>>>>>>       |            |        |
>>>>>>>>    station A   station B  station C
>>>>>>>>
>>>>>>>> station A sends request to B, station B sends reply to A.
>>>>>>>> Since the learning of station A's MAC SA races with the reply sent by
>>>>>>>> station B, it now becomes theoretically possible for the reply packet to
>>>>>>>> be flooded to station C as well, right? And that was not possible before
>>>>>>>> (at least assuming an ageing time longer than the round-trip time of these packets).
>>>>>>>>
>>>>>>>> And that will happen regardless of whether switchdev is used or not.
>>>>>>>> I don't want to outright dismiss this (maybe I don't fully understand
>>>>>>>> this either), but it seems like a pretty heavy-handed change.
>>>>>>>>
>>>>>>>
>>>>>>> It will depending on lock contention, I plan to add a fast/uncontended case with
>>>>>>> trylock from fast-path and if that fails then queue the fdb, but yes - in general
>>>>>>
>>>>>> I wonder why mutex_trylock has this comment?
>>>>>>
>>>>>>  * This function must not be used in interrupt context. The
>>>>>>  * mutex must be released by the same task that acquired it.
>>>>>>
>>>>>>> you are correct that the traffic could get flooded in the queue case before the delayed
>>>>>>> learning processes the entry, it's a trade off if we want sleepable learning context.
>>>>>>> Ido noted privately that's usually how hw acts anyway, also if people want guarantees
>>>>>>> that the reply won't get flooded there are other methods to achieve that (ucast flood
>>>>>>> disable, firewall rules etc).
>>>>>>
>>>>>> Not all hardware is like that, the switches I'm working with, which
>>>>>> perform autonomous learning, all complete the learning process for a
>>>>>> frame strictly before they start the forwarding process. The software
>>>>>> bridge also behaves like that. My only concern is that we might start
>>>>>> building on top of some fundamental bridge changes like these, which
>>>>>> might risk a revert a few months down the line, when somebody notices
>>>>>> and comes up with a use case where that is not acceptable.
>>>>>>
>>>>>
>>>>> I should've clarified that I was talking about Spectrum, as Ido did in a reply.
>>>>
>>>> Ido also said "I assume Spectrum is not special in this regard" and I
>>>> just wanted to say this so that we don't end up with the wrong impression.
>>>> Special or not, to the software bridge it would be new behavior, which
>>>> I can only hope will be well received.
>>>>
>>>>>>> Today the reply could get flooded if the entry can't be programmed
>>>>>>> as well, e.g. the atomic allocation might fail and we'll flood it again; granted, that's much less likely,
>>>>>>> but still there have never been any such guarantees. I think it's generally a good improvement and
>>>>>>> will remove a lot of processing complexity. We can bite the bullet and get the underlying delayed
>>>>>>> infrastructure correct once, now; then the locking rules and other use cases would be easier to enforce
>>>>>>> and reason about in the future.
>>>>>>
>>>>>> You're the maintainer, so I certainly won't complain if we go down this path.
>>>>>> It would be nice if br->lock could also be transformed into a mutex; it
>>>>>> would make all of switchdev much simpler.
>>>>>>
>>>>>
>>>>> That is why we are discussing possible solutions; I don't want to force anything
>>>>> but rather reach a consensus about the way forward. There are complexities involved with
>>>>> moving to delayed learning too. One is that the queue won't be a simple linked list
>>>>> but probably a structure that allows fast lookups (e.g. an rbtree) to avoid duplicating entries;
>>>>> we may also end up doing a two-stage lookup if the main hash table doesn't find an entry,
>>>>> to close the above scenario's window as much as possible. But as I said, I think we can get
>>>>> these correct and well defined; after that we'll be able to reason about future changes and
>>>>> use cases more easily. I'm still thinking about the various corner cases with delayed learning, so
>>>>> any feedback is welcome. I'll start working on it in a few days and will get a clearer
>>>>> view of the issues once I start converting the bridge, but having straightforward locking
>>>>> rules and known deterministic behaviour sounds like a better long-term plan.
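
For illustration, the pending structure being talked about could look roughly like the sketch below. Everything here is hypothetical (the structure name, its fields, the helper); it is only meant to show how keying the rbtree on {MAC, VLAN} avoids queueing duplicates, with a later learn event simply refreshing the node that is already pending.

/* Hypothetical sketch only - not from a posted patch. */
#include <linux/rbtree.h>
#include <linux/if_ether.h>
#include <linux/string.h>
#include <linux/types.h>

struct net_bridge_port;

struct br_fdb_pending {
	struct rb_node		rbnode;		/* keyed by addr + vid */
	struct rcu_head		rcu;		/* if fast-path lookups walk the tree */
	unsigned char		addr[ETH_ALEN];
	u16			vid;
	struct net_bridge_port	*port;		/* source port of the learn event */
	unsigned long		updated;
};

/* Insert a pending learn event, or return the node already queued for the
 * same {addr, vid} so the caller can simply refresh it.  The caller holds
 * the spinlock protecting the pending tree.
 */
static struct br_fdb_pending *br_fdb_pending_insert(struct rb_root *root,
						    struct br_fdb_pending *new)
{
	struct rb_node **link = &root->rb_node, *parent = NULL;

	while (*link) {
		struct br_fdb_pending *p;
		int cmp;

		parent = *link;
		p = rb_entry(parent, struct br_fdb_pending, rbnode);
		cmp = memcmp(new->addr, p->addr, ETH_ALEN);
		if (!cmp)
			cmp = (int)new->vid - (int)p->vid;

		if (cmp < 0)
			link = &parent->rb_left;
		else if (cmp > 0)
			link = &parent->rb_right;
		else
			return p;	/* already queued, just refresh it */
	}

	rb_link_node(&new->rbnode, parent, link);
	rb_insert_color(&new->rbnode, root);
	return new;
}

Whether a roam just updates ->port in place or needs its own event is one of the details the conversion would have to settle.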
>>>>
>>>> Sorry, I might have to read the code; I don't want to misinterpret your
>>>> idea. With what you're describing here, I think that what you are trying
>>>> to achieve is to both have it our way, and preserve the current behavior
>>>> for the common case, where learning still happens from the fast path.
>>>> But I'm not sure that this is the correct goal, especially if you also
>>>> add "straightforward locking rules" to the mix.
>>>>
>>>
>>> The trylock was only an optimization idea, but yes, you'd need both synchronous
>>> and asynchronous notifiers. I don't think you need a sleepable context when
>>> learning from softirq; who would you propagate the error to? It is important
>>> when entries are being added from user-space, but if you'd like to have a veto
>>> option from softirq then we can scratch the trylock idea altogether.
>>
>> I'll let Ido answer here. As I said, the model I'm working with is that
>> of autonomous learning, so for me, no. Whereas the Spectrum model is
>> that of secure learning. I expect that it'd be pretty useless to set up
>> software-assisted secure learning if you're just going to say yes and
>> learn all addresses anyway. I've never seen Spectrum documentation, but
>> I would be shocked if it couldn't be configured to operate in
>> the bare-bones autonomous learning mode too.
>>
> 
> Ack, got it.
> 
>>> Let's not focus on the trylock, it's a minor detail.
>>>
>>>> I think you'd have to explain what is the purpose of the fast path trylock
>>>> logic you've mentioned above. So in the uncontended br->hash_lock case,
>>>> br_fdb_update() could take that mutex and then what? It would create and
>>>> add the FDB entry, and call fdb_notify(), but that still can't sleep.
>>>> So if switchdev drivers still need to privately defer the offloading
>>>> work, we still need some crazy completion-based mechanism to report
>>>> errors like the one I posted here, because in the general sense,
>>>> SWITCHDEV_FDB_ADD_TO_DEVICE will still be atomic.
>>>
>>> We do not need that if we have ADD_TO_DEVICE and ADD_TO_DEVICE_SYNC,
>>
>> Just when I was about to say that I'm happy to get rid of some of those
>> workqueues, and now you're telling me that not only might we not get rid of
>> them, but we might also end up with a second, separate implementation :)
>>
>> Anyway, let's not put the cart before the horse; let's see what the
>> realities of the bridge data path learning and STP flushing will teach
>> us about what can and can't be done.
>>
> 
> We will get rid of some workqueues (I hope). I was saying that if we do the trylock we
> might have to add both sync and async notifiers; otherwise we just need a single sync one.
> 
>>> but that optimization is probably not worth the complexity of
>>> maintaining both so we can just drop the trylock.
>>>
>>>>
>>>> And how do you queue a deletion: do you delete the FDB entry from the pending
>>>> and the main hash tables, or do you just create a deletion command to be
>>>> processed in deferred context?
>>>>
>>>
>>> All commands which cannot take the mutex directly will be executed from a delayed
>>> queue. We also need a delayed flush operation because we need to flush from the STP code
>>> and we can't sleep there due to the STP spinlock. The pending table can be considered
>>> only if we decide to do a 2-stage lookup; it won't be used or consulted in any other
>>> case, so user-space adds and deletes go only to the main table.
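
Purely as an illustration of that shape (all names below, including the fdb_deferred_* members assumed on struct net_bridge, are made up for the sketch; the real conversion may slice this differently): a caller in atomic context, such as the STP code, would only allocate a small descriptor, append it to a spinlock-protected list and kick the deferred work, which later executes the operation under the mutex.

/* Hypothetical sketch of a deferred fdb operation queue. */
enum br_fdb_deferred_op {
	BR_FDB_OP_ADD,
	BR_FDB_OP_DEL,
	BR_FDB_OP_FLUSH_PORT,	/* only ->port is used for this one */
};

struct br_fdb_deferred {
	struct list_head	list;
	enum br_fdb_deferred_op	op;
	unsigned char		addr[ETH_ALEN];
	u16			vid;
	struct net_bridge_port	*port;
};

/* May be called from atomic context (e.g. under the STP spinlock). */
static int br_fdb_queue_deferred(struct net_bridge *br,
				 const struct br_fdb_deferred *tmpl)
{
	struct br_fdb_deferred *d;

	d = kmemdup(tmpl, sizeof(*d), GFP_ATOMIC);
	if (!d)
		return -ENOMEM;

	spin_lock_bh(&br->fdb_deferred_lock);
	list_add_tail(&d->list, &br->fdb_deferred_list);
	spin_unlock_bh(&br->fdb_deferred_lock);

	queue_work(system_long_wq, &br->fdb_deferred_work);
	return 0;
}

A plain list keeps the operations ordered; deduplicating learn events (the rbtree mentioned earlier) would be an optimization layered on top of this.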
>>>
>>>> And since you'd be operating on the hash table concurrently from the
>>>> deferred work and from the fast path, doesn't this mean you'll need to
>>>> use some sort of spin_lock_bh from the deferred work, which again would
>>>> incur atomic context when you want to notify switchdev? Because with the
>>>> mutex_trylock described above, you'd gain exclusivity to the main hash
>>>> table, but if you lose, what you need is exclusivity to the pending hash
>>>> table. So the deferred context also needs to be atomic when it dequeues
>>>> the pending FDB entries and notifies them. I just don't see what we're
>>>> winning, or how the rtnetlink functions will be any better off.
>>>>
>>>
>>> The rbtree can be fully taken by the delayed action and swapped with a new one,
>>> so no exclusivity is needed for the notifications. It will take the spinlock, grab
>>> the whole tree and release the lock, the same as if it were a simple queue.
>>
>> I'm quite unfamiliar with this technique, atomically swapping a queue
>> pointer with a new empty one, and emptying that queue while unlocked.
>> Do you have any reference implementation for this kind of technique?
>>
> 
> the delayed work would be doing something similar to:
> 
> spin_lock(delay_lock);
> record tree root
> rcu change tree root with empty
> spin_unlock(delay_lock);
> 
> mutex_lock(br->hash_lock);
> process recorded (old) tree and free items via rcu
> mutex_unlock(br->hash_lock);
> 
> We have the same pattern with queues all around the kernel.
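
Spelled out a bit more, a minimal C sketch of that swap-and-drain could look like the below, assuming hash_lock has become a mutex as proposed, a hypothetical spinlock-protected fdb_pending_root/fdb_pending_lock/fdb_delayed_work on struct net_bridge, and the br_fdb_pending structure sketched earlier; none of this is code from a posted patch.

/* Hypothetical sketch of the swap-and-drain pattern described above. */
static void br_fdb_delayed_work(struct work_struct *work)
{
	struct net_bridge *br = container_of(work, struct net_bridge,
					     fdb_delayed_work);
	struct br_fdb_pending *p, *tmp;
	struct rb_root pending;

	/* Detach the whole pending tree and leave an empty one behind;
	 * the spinlock is held only for the swap itself.
	 */
	spin_lock_bh(&br->fdb_pending_lock);
	pending = br->fdb_pending_root;
	br->fdb_pending_root = RB_ROOT;
	spin_unlock_bh(&br->fdb_pending_lock);

	/* Process the detached tree in sleepable context. */
	mutex_lock(&br->hash_lock);
	rbtree_postorder_for_each_entry_safe(p, tmp, &pending, rbnode) {
		/* create/refresh the real fdb entry and call the blocking
		 * switchdev notifier here, with room to report errors
		 */
		kfree_rcu(p, rcu);
	}
	mutex_unlock(&br->hash_lock);
}

If the fast path is also allowed to walk the pending tree under RCU for the 2-stage lookup discussed below, the root swap would have to be published with the usual RCU primitives, which the sketch glosses over.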
> 
>>> The spinlock for the delayed-learning rbtree is necessary in all cases,
>>
>> "in all cases" means "regardless of whether we try from the fast path to
>> make fdb_create() insert directly into &br->fdb_hash_tbl, or if we
>> insert into a temporary rbtree for pending entries"? I don't understand
>> this: why would you take the rbtree spinlock if you've inserted into the
>> main hash table already?
>>
> 
> No, it means that we need the spinlock to protect the delayed queue.
> 
>> I'm only concerned about spin locks we'd have to hold while calling
>> fdb_notify().
>>
> 
> There won't be any spinlocks held around fdb_notify(); we'll be doing all notifications from
> sleepable context with the mutex held. That's the goal, at least.
> 
>>> if we used the trylock fast learn then we could directly insert the entry if we win, but
>>> again, let's just always use delayed ops from atomic contexts as a start.
>>>
>>> All entries from atomic contexts will be added to an rbtree which will be processed
>>> from a delayed work; entries will be freed via RCU so lookups can be done if we decide to
>>> do a 2-stage lookup for the fast Rx path, to reduce the flooding-case window that you
>>> described above.
>>>
>>> What we win is a sleepable context for user-space calls and the ability to make synchronous
>>> calls to the drivers and wait for their errors.
>>
>> I think I'd really have to see the code at this point, sorry.
>>
> 
> Sure, I'll prepare an RFC version this week.
> 
> Thanks for the comments.
> 

OK, I've started working on converting the bridge, but there are a few more downsides to
delaying notifications that I really don't like, and there is no good, clean way around them.

a) after the conversion we need to queue an event (read: take a spinlock) for each fdb roam and flag change,
   which are currently lockless

b) we either lose transient fdb changes and send only the last notification for that fdb, or we have to snapshot
   the whole fdb entry on every change

c) we need to serialize and order actions, so in addition to the rbtree for fdb lookup, we'll need an ordered list of events

Now b) is the most worrisome and really problematic: anything in user-space that is following that fdb's changes
would miss events in the first case (not an option), or we'll have to deal with the complexity of snapshotting
entries, which in itself has a few problems and doesn't really extend well to other objects (snapshotting an mdb
could be a nightmare). I don't see a clean way forward to fix this, as I don't want to do the async completion
waiting either; that would impose future race and locking issues and could make future changes harder to get right.
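
For concreteness, "snapshotting" in b) would mean something along these lines (a hypothetical sketch, only to show the cost): every roam or flag change would have to allocate and queue a copy of the entry's state at that moment, and per c) the copies would need to sit on an ordered list as well as the lookup rbtree.

/* Hypothetical sketch of a snapshotted fdb event. */
struct br_fdb_event {
	struct list_head	list;		/* preserves event order, see c) */
	int			rtm_type;	/* RTM_NEWNEIGH or RTM_DELNEIGH */
	unsigned char		addr[ETH_ALEN];
	u16			vid;
	int			port_no;	/* port at the time of the change */
	unsigned long		flags;		/* is_local/is_static/... at that time */
};

Doing that allocation and copy for every fast-path update is exactly the kind of overhead and complexity being objected to here.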

We have to explore alternative options; until we have something viable which doesn't play locking games,
we'll have to live with the current situation.

I'll keep thinking about this and will try a few ideas.

Thanks,
 Nik

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
  2021-10-27  9:20                   ` Nikolay Aleksandrov
@ 2021-10-27  9:36                     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 29+ messages in thread
From: Nikolay Aleksandrov @ 2021-10-27  9:36 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Roopa Prabhu, Ido Schimmel, Jakub Kicinski,
	David S. Miller, Andrew Lunn, Florian Fainelli, Vivien Didelot,
	Vladimir Oltean, Jiri Pirko

To add - I'll be experimenting with turning hash_lock into a semaphore so we can use while (down_trylock()) from
the fast path and down() from process context to be able to sleep. For this we'll obviously still need to support
two notifier flavors - atomic and blocking - but the locking rules are kept simple. We won't
have error feedback for the atomic cases, but I think that's the best we can do given the current situation.
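
A bare-bones sketch of that idea, assuming br->hash_lock literally becomes a struct semaphore (function names and the notifier plumbing are placeholders, not code from a posted patch):

#include <linux/semaphore.h>

/* Fast path (softirq): cannot sleep, so spin on the trylock.
 * down_trylock() returns 0 once the semaphore has been acquired.
 */
static void br_fdb_learn_atomic(struct net_bridge *br)
{
	while (down_trylock(&br->hash_lock))
		cpu_relax();

	/* update the hash table and fire the atomic notifier chain;
	 * no error feedback is possible here
	 */

	up(&br->hash_lock);
}

/* Process context (e.g. RTM_NEWNEIGH): may sleep, so the return value of
 * the blocking switchdev notifier can be propagated to user space.
 */
static int br_fdb_add_from_user(struct net_bridge *br)
{
	int err = 0;

	down(&br->hash_lock);
	/* update the hash table, call the blocking notifier chain and
	 * collect its result into err
	 */
	up(&br->hash_lock);

	return err;
}

How long the softirq side might end up spinning while a process-context holder sleeps inside a driver callback is presumably one of the things the experiment has to answer.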


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2021-10-27  9:36 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-25 22:24 [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 01/15] net: bridge: remove fdb_notify forward declaration Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 02/15] net: bridge: remove fdb_insert " Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 03/15] net: bridge: rename fdb_insert to fdb_add_local Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 04/15] net: bridge: rename br_fdb_insert to br_fdb_add_local Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 05/15] net: bridge: move br_fdb_replay inside br_switchdev.c Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 06/15] net: bridge: create a common function for populating switchdev FDB entries Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 07/15] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 08/15] net: bridge: take the hash_lock inside fdb_add_entry Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 09/15] net: bridge: rename fdb_notify to br_fdb_notify Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 10/15] net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 11/15] net: bridge: make fdb_add_entry() wait for switchdev feedback Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 12/15] net: rtnetlink: pass extack to .ndo_fdb_del Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 13/15] net: bridge: wait for errors from switchdev when deleting FDB entries Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 14/15] net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 15/15] net: dsa: propagate extack to .port_fdb_{add,del} Vladimir Oltean
2021-10-26 10:40 ` [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Nikolay Aleksandrov
2021-10-26 11:25   ` Vladimir Oltean
2021-10-26 12:20     ` Nikolay Aleksandrov
2021-10-26 12:38       ` Ido Schimmel
2021-10-26 16:54       ` Vladimir Oltean
2021-10-26 17:10         ` Nikolay Aleksandrov
2021-10-26 19:01           ` Vladimir Oltean
2021-10-26 19:56             ` Nikolay Aleksandrov
2021-10-26 21:51               ` Vladimir Oltean
2021-10-26 22:27                 ` Nikolay Aleksandrov
2021-10-27  9:20                   ` Nikolay Aleksandrov
2021-10-27  9:36                     ` Nikolay Aleksandrov
2021-10-27  7:24                 ` Ido Schimmel
