All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Oltean <vladimir.oltean@nxp.com>
To: netdev@vger.kernel.org, Roopa Prabhu <roopa@nvidia.com>,
	Nikolay Aleksandrov <nikolay@nvidia.com>,
	Ido Schimmel <idosch@nvidia.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Andrew Lunn <andrew@lunn.ch>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Vladimir Oltean <olteanv@gmail.com>, Jiri Pirko <jiri@nvidia.com>
Subject: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
Date: Tue, 26 Oct 2021 01:24:00 +0300	[thread overview]
Message-ID: <20211025222415.983883-1-vladimir.oltean@nxp.com> (raw)

Hello, this is me bombarding the list with switchdev FDB changes again.

This series attempts to address one design limitation in the interaction
between the bridge and switchdev: error codes returned from the
SWITCHDEV_FDB_ADD_TO_DEVICE and SWITCHDEV_FDB_DEL_TO_DEVICE handlers are
completely ignored.

There are multiple aspects to that. First of all, drivers have a portion
that handles those switchdev events in atomic context, and a portion
that handles them in a private deferred work context. Errors reported
from both calling contexts are ignored by the bridge, and it is
desirable to actually propagate both to user space.

Secondly, it is in fact okay that some switchdev errors are ignored.
The call graph for fdb_notify() is not simple, it looks something like
this (not complete):

IFLA_BRPORT_FLUSH                                                              RTM_NEWNEIGH
   |                                                                               |
   | {br,nbp}_vlan_delete                 br_fdb_change_mac_address                v
   |   |  |                                                  |     fast      __br_fdb_add
   |   |  |  del_nbp, br_dev_delete       br_fdb_changeaddr  |     path         /  |  \
   |   |  |      |                                        |  |    learning     /   |   \
   \   |   -------------------- br_fdb_find_delete_local  |  |       |        /    |    \     switchdev event
    \  |         |                                     |  |  |       |       /     |     \     listener
     -------------------------- br_fdb_delete_by_port  |  |  |       |      /      |      \       |
                                                 |  |  |  |  |       |     /       |       \      |
                                                 |  |  |  |  |       |    /        |        \     |
                                                 |  |  |  |  |    br_fdb_update    |        br_fdb_external_learn_add
           (RTM_DELNEIGH)  br_fdb_delete         |  |  |  |  |       |             |              |
                                     |           |  |  |  |  |       |             |              |    gc_work        netdevice
                                     |           |  |  |  |  |       |      fdb_add_entry         |     timer          event
                                     |           | fdb_delete_local  |             |              |        |          listener
                         __br_fdb_delete         |  |                |             |              /  br_fdb_cleanup      |
                                     |           |  |                |             |             /         |             |     br_stp_change_bridge_id
                                     |           |  |                \             |            /          | br_fdb_changeaddr      |
                                     |           |  |                 \            |           /           |     |                  |
                     fdb_delete_by_addr_and_port |  | fdb_insert       \           |          /       ----/      | br_fdb_change_mac_address
                                              |  |  |  |                \          |         /       /           |  |
                   br_fdb_external_learn_del  |  |  |  | br_fdb_cleanup  \         |        /       /            |  | br_fdb_insert
                                          |   |  |  |  |  |               \        |       /   ----/             |  | |
                                          |   |  |  |  |  |                \       |      /   /                 fdb_insert
                          br_fdb_flush    |   |  |  |  |  |                 \      |     /   /            --------/
                                 \----    |   |  |  |  |  |                  \     |    /   /      ------/
                                      \----------- fdb_delete --------------- fdb_notify ---------/

There's not a lot that the fast path learning can do about switchdev
when that returns an error.

So this patch set mainly wants to deal with the 2 code paths that are
triggered by these regular commands:

bridge fdb add dev swp0 00:01:02:03:04:05 master static # __br_fdb_add
bridge fdb del dev swp0 00:01:02:03:04:05 master static # __br_fdb_delete

In some other, semi-related discussions, Ido Schimmel pointed out that
it would be nice if user space got some feedback from the actual driver,
and made some proposals about how that could be done.
https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/
One of the proposals was to call fdb_notify() from sleepable context,
but Nikolay disliked the idea of introducing deferred work in the bridge
driver (seems like nobody wants to deal with it).

And since all proposals of dealing with the deferred work inside
switchdev were also shot down for valid reasons, we are basically left
as a baseline with the code that we have today, with the deferred work
being private to the driver, and somehow we must propagate an err and an
extack from there.

So the approach taken here is to reorganize the code a bit and add some
hooks in:
(a) some callers of the fdb_notify() function to initialize a completion
    structure
(b) some drivers that catch SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE and mark
    that completion structure as done
(c) some bridge logic that I believe is fairly safe (I'm open to being
    proven wrong) that temporarily drops the &br->hash_lock in order to
    sleep until the completion is done.

There are some further optimizations that can be made. For example, we
can avoid dropping the hash_lock if there is no switchdev response pending.
And we can move some of that completion logic in br_switchdev.c such
that it is compiled out on a CONFIG_NET_SWITCHDEV=n build. I haven't
done those here, since they aren't exactly trivial. Mainly searching for
high-level feedback first and foremost.

The structure of the patch series is:
- patches 1-6 are me toying around with some code organization while I
  was trying to understand the various call paths better. I like not
  having forward declarations, but if they exist for a reason, I can
  drop these patches.
- patches 7-10 and 12 are some preparation work that can also be ignored.
- patches 11 and 13 are where the meat of the series is.
- patches 14 and 15 are DSA boilerplate so I could test what I'm doing.

Vladimir Oltean (15):
  net: bridge: remove fdb_notify forward declaration
  net: bridge: remove fdb_insert forward declaration
  net: bridge: rename fdb_insert to fdb_add_local
  net: bridge: rename br_fdb_insert to br_fdb_add_local
  net: bridge: move br_fdb_replay inside br_switchdev.c
  net: bridge: create a common function for populating switchdev FDB
    entries
  net: switchdev: keep the MAC address by value in struct
    switchdev_notifier_fdb_info
  net: bridge: take the hash_lock inside fdb_add_entry
  net: bridge: rename fdb_notify to br_fdb_notify
  net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device
  net: bridge: make fdb_add_entry() wait for switchdev feedback
  net: rtnetlink: pass extack to .ndo_fdb_del
  net: bridge: wait for errors from switchdev when deleting FDB entries
  net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
  net: dsa: propagate extack to .port_fdb_{add,del}

 drivers/net/dsa/b53/b53_common.c              |   6 +-
 drivers/net/dsa/b53/b53_priv.h                |   6 +-
 drivers/net/dsa/hirschmann/hellcreek.c        |   6 +-
 drivers/net/dsa/lan9303-core.c                |   7 +-
 drivers/net/dsa/lantiq_gswip.c                |   6 +-
 drivers/net/dsa/microchip/ksz9477.c           |   6 +-
 drivers/net/dsa/mt7530.c                      |   6 +-
 drivers/net/dsa/mv88e6xxx/chip.c              |   6 +-
 drivers/net/dsa/ocelot/felix.c                |   6 +-
 drivers/net/dsa/qca8k.c                       |   6 +-
 drivers/net/dsa/sja1105/sja1105_main.c        |  12 +-
 .../ethernet/freescale/dpaa2/dpaa2-switch.c   |  14 +-
 drivers/net/ethernet/intel/ice/ice_main.c     |   4 +-
 .../marvell/prestera/prestera_switchdev.c     |  18 +-
 .../mellanox/mlx5/core/en/rep/bridge.c        |  10 -
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |   2 +-
 .../ethernet/mellanox/mlxsw/spectrum_router.c |   4 +-
 .../mellanox/mlxsw/spectrum_switchdev.c       |  11 +-
 .../microchip/sparx5/sparx5_mactable.c        |   2 +-
 .../microchip/sparx5/sparx5_switchdev.c       |  12 +-
 drivers/net/ethernet/mscc/ocelot_net.c        |   3 +-
 .../net/ethernet/qlogic/qlcnic/qlcnic_main.c  |   5 +-
 drivers/net/ethernet/rocker/rocker_main.c     |  13 +-
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |   2 +-
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c |  14 +-
 drivers/net/ethernet/ti/cpsw_switchdev.c      |  14 +-
 drivers/net/macvlan.c                         |   3 +-
 drivers/net/vxlan.c                           |   3 +-
 drivers/s390/net/qeth_l2_main.c               |  11 +-
 include/linux/netdevice.h                     |   6 +-
 include/net/dsa.h                             |   6 +-
 include/net/switchdev.h                       |  67 +-
 net/bridge/br_fdb.c                           | 657 ++++++++++--------
 net/bridge/br_if.c                            |   2 +-
 net/bridge/br_private.h                       |  25 +-
 net/bridge/br_switchdev.c                     | 103 ++-
 net/bridge/br_vlan.c                          |   5 +-
 net/core/rtnetlink.c                          |   4 +-
 net/dsa/dsa_priv.h                            |  15 +-
 net/dsa/port.c                                |  13 +-
 net/dsa/slave.c                               |  87 +--
 net/dsa/switch.c                              |  22 +-
 net/switchdev/switchdev.c                     | 156 +----
 43 files changed, 678 insertions(+), 708 deletions(-)

-- 
2.25.1


             reply	other threads:[~2021-10-25 22:24 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25 22:24 Vladimir Oltean [this message]
2021-10-25 22:24 ` [RFC PATCH net-next 01/15] net: bridge: remove fdb_notify forward declaration Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 02/15] net: bridge: remove fdb_insert " Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 03/15] net: bridge: rename fdb_insert to fdb_add_local Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 04/15] net: bridge: rename br_fdb_insert to br_fdb_add_local Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 05/15] net: bridge: move br_fdb_replay inside br_switchdev.c Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 06/15] net: bridge: create a common function for populating switchdev FDB entries Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 07/15] net: switchdev: keep the MAC address by value in struct switchdev_notifier_fdb_info Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 08/15] net: bridge: take the hash_lock inside fdb_add_entry Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 09/15] net: bridge: rename fdb_notify to br_fdb_notify Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 10/15] net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 11/15] net: bridge: make fdb_add_entry() wait for switchdev feedback Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 12/15] net: rtnetlink: pass extack to .ndo_fdb_del Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 13/15] net: bridge: wait for errors from switchdev when deleting FDB entries Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 14/15] net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE Vladimir Oltean
2021-10-25 22:24 ` [RFC PATCH net-next 15/15] net: dsa: propagate extack to .port_fdb_{add,del} Vladimir Oltean
2021-10-26 10:40 ` [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge Nikolay Aleksandrov
2021-10-26 11:25   ` Vladimir Oltean
2021-10-26 12:20     ` Nikolay Aleksandrov
2021-10-26 12:38       ` Ido Schimmel
2021-10-26 16:54       ` Vladimir Oltean
2021-10-26 17:10         ` Nikolay Aleksandrov
2021-10-26 19:01           ` Vladimir Oltean
2021-10-26 19:56             ` Nikolay Aleksandrov
2021-10-26 21:51               ` Vladimir Oltean
2021-10-26 22:27                 ` Nikolay Aleksandrov
2021-10-27  9:20                   ` Nikolay Aleksandrov
2021-10-27  9:36                     ` Nikolay Aleksandrov
2021-10-27  7:24                 ` Ido Schimmel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211025222415.983883-1-vladimir.oltean@nxp.com \
    --to=vladimir.oltean@nxp.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=idosch@nvidia.com \
    --cc=jiri@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@nvidia.com \
    --cc=olteanv@gmail.com \
    --cc=roopa@nvidia.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.