netdev.vger.kernel.org archive mirror
* [PATCH net-next 00/12] ice: switchdev bridge offload
@ 2023-04-17  9:34 Wojciech Drewek
  2023-04-17  9:34 ` [PATCH net-next 01/12] ice: Minor switchdev fixes Wojciech Drewek
                   ` (11 more replies)
  0 siblings, 12 replies; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

Linux bridge provides the ability to learn MAC addresses and VLANs
detected on the bridge's ports. As a result, FDB (forwarding database)
entries are created and they can be offloaded to the HW. By adding
VF's port representors to the bridge together with the uplink netdev,
we can learn VF's and link partner's MAC addresses. This is achieved
by slow/exception-path, where packets that do not match any filters
(FDB entries in this case) are sent to the bridge ports.

Driver keeps track of the netdevs added to the bridge
by listening for NETDEV_CHANGEUPPER event. We distinguish two types
of bridge ports: uplink port and VF's representor port. Linux
bridge always learns src MAC of the packet on rx path. With the
current slow-path implementation, it means that we will learn
VF's MAC on port repr (when the VF transmits the packet) and
link partner's MAC on uplink (when we receive it on uplink from LAN).
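
The learning behaviour described above can be sketched as a small
userspace model. This is not bridge or driver code: the table layout and
the names model_fdb, model_fdb_learn and model_fdb_lookup are
hypothetical and exist only to illustrate that the source MAC is
recorded against the port the packet was received on.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_FDB_SIZE 8

/* Hypothetical port numbering used only by this sketch. */
enum { MODEL_PORT_UPLINK = 0, MODEL_PORT_VF_REPR = 1 };

struct model_fdb_entry {
	uint8_t mac[6];
	int port;
	int used;
};

struct model_fdb {
	struct model_fdb_entry e[MODEL_FDB_SIZE];
};

/* Record that src_mac was seen on rx_port. Returns 0, or -1 if full. */
static int model_fdb_learn(struct model_fdb *fdb, const uint8_t src_mac[6],
			   int rx_port)
{
	int free_slot = -1;

	for (int i = 0; i < MODEL_FDB_SIZE; i++) {
		if (fdb->e[i].used && !memcmp(fdb->e[i].mac, src_mac, 6)) {
			fdb->e[i].port = rx_port; /* refresh existing entry */
			return 0;
		}
		if (!fdb->e[i].used && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1;

	memcpy(fdb->e[free_slot].mac, src_mac, 6);
	fdb->e[free_slot].port = rx_port;
	fdb->e[free_slot].used = 1;
	return 0;
}

/* Return the port learned for dst_mac, or -1 when it is unknown. */
static int model_fdb_lookup(const struct model_fdb *fdb,
			    const uint8_t dst_mac[6])
{
	for (int i = 0; i < MODEL_FDB_SIZE; i++)
		if (fdb->e[i].used && !memcmp(fdb->e[i].mac, dst_mac, 6))
			return fdb->e[i].port;
	return -1;
}
```

In this model a packet transmitted by the VF is learned on
MODEL_PORT_VF_REPR, and a packet arriving from the LAN is learned on
MODEL_PORT_UPLINK.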

The driver is notified about learning of the MAC/VLAN by
SWITCHDEV_FDB_{ADD|DEL}_TO_DEVICE events. This is followed by creation
of the HW filter. The direction of the filter is based on port
type (uplink or VF repr). In case of the uplink, rule forwards
the packets to the LAN (matching on link partner's MAC). When the
notification is received on VF repr then the rule forwards the
packets to the associated VF (matching on VF's MAC).
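
The mapping from port type to filter direction can be sketched in plain
C as follows. This is a userspace illustration under stated assumptions,
not the driver's implementation: fdb_port_type, fdb_fwd_target and
fdb_rule_for are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: the bridge port on which the address was learned. */
enum fdb_port_type { FDB_PORT_UPLINK, FDB_PORT_VF_REPR };

/* Hypothetical model: where the offloaded rule sends matching packets. */
enum fdb_fwd_target { FDB_FWD_TO_LAN, FDB_FWD_TO_VF };

struct fdb_rule {
	uint8_t mac[6];              /* MAC address the rule matches on */
	enum fdb_fwd_target target;  /* forwarding decision */
};

/* Build a rule for an address learned on the given port type. */
static struct fdb_rule fdb_rule_for(enum fdb_port_type type,
				    const uint8_t mac[6])
{
	struct fdb_rule rule;

	memcpy(rule.mac, mac, sizeof(rule.mac));
	/* Learned on uplink: link partner's MAC, forward to the LAN.
	 * Learned on VF repr: VF's MAC, forward to the associated VF.
	 */
	rule.target = (type == FDB_PORT_UPLINK) ? FDB_FWD_TO_LAN
						: FDB_FWD_TO_VF;
	return rule;
}
```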

This approach would not work on its own, however. This is because if
one of the directions is offloaded, then the bridge would not be able
to learn the other one. If the egress rule is added (learned on uplink)
then the response from the VF will be sent directly to the LAN.
The packet will not go through the slow-path, so it would not be seen
on VF's port repr. Because of that, the bridge would not learn VF's MAC.

This is solved by introducing a guard rule. It prevents the forward rule
from working until the opposite direction is offloaded.
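
The effect of the guard can be modelled as a simple predicate. This is
only a sketch of the behaviour stated above; how the guard is realised
in hardware rules is not shown here, and fdb_pair, fdb_dir and
fdb_fast_path_active are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of one VF <-> link partner conversation. */
struct fdb_pair {
	bool to_lan_offloaded; /* rule learned on the uplink */
	bool to_vf_offloaded;  /* rule learned on the VF representor */
};

enum fdb_dir { FDB_DIR_TO_LAN, FDB_DIR_TO_VF };

/* A forward rule takes effect only once the opposite direction is
 * offloaded as well. Until then traffic stays on the slow path, so
 * the bridge can still learn the missing address.
 */
static bool fdb_fast_path_active(const struct fdb_pair *p, enum fdb_dir dir)
{
	bool own = (dir == FDB_DIR_TO_LAN) ? p->to_lan_offloaded
					   : p->to_vf_offloaded;
	bool opposite = (dir == FDB_DIR_TO_LAN) ? p->to_vf_offloaded
						: p->to_lan_offloaded;

	return own && opposite;
}
```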

Aging is not fully supported yet; the aging time is static for now.
Follow-up submissions will introduce counters that will allow us to
keep track of whether a rule is actually being used.

A few fixes/changes are needed for this feature to work with the ice
driver. These are introduced in the first 3 patches.

Dave Ertman (1):
  ice: Remove exclusion code for RDMA+SRIOV

Marcin Szycik (2):
  ice: Add guard rule when creating FDB in switchdev
  ice: Add VLAN FDB support in switchdev mode

Michal Swiatkowski (2):
  ice: implement bridge port vlan
  ice: implement static version of ageing

Pawel Chmielewski (1):
  ice: add tracepoints for the switchdev bridge

Wojciech Drewek (6):
  ice: Minor switchdev fixes
  ice: Unset src prune on uplink VSI
  ice: Implement basic eswitch bridge setup
  ice: Switchdev FDB events support
  ice: Accept LAG netdevs in bridge offloads
  ice: Ethtool fdb_cnt stats

 drivers/net/ethernet/intel/ice/Makefile       |    2 +-
 drivers/net/ethernet/intel/ice/ice.h          |   26 +-
 drivers/net/ethernet/intel/ice/ice_eswitch.c  |   43 +-
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 1350 +++++++++++++++++
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |  112 ++
 drivers/net/ethernet/intel/ice/ice_ethtool.c  |    1 +
 drivers/net/ethernet/intel/ice/ice_lag.c      |   12 -
 drivers/net/ethernet/intel/ice/ice_lag.h      |   50 -
 drivers/net/ethernet/intel/ice/ice_lib.c      |   26 +-
 drivers/net/ethernet/intel/ice/ice_lib.h      |    1 +
 drivers/net/ethernet/intel/ice/ice_main.c     |    4 +-
 drivers/net/ethernet/intel/ice/ice_repr.c     |    2 +-
 drivers/net/ethernet/intel/ice/ice_repr.h     |    3 +-
 drivers/net/ethernet/intel/ice/ice_sriov.c    |    4 -
 drivers/net/ethernet/intel/ice/ice_switch.c   |   45 +-
 drivers/net/ethernet/intel/ice/ice_switch.h   |    5 +
 drivers/net/ethernet/intel/ice/ice_trace.h    |   90 ++
 drivers/net/ethernet/intel/ice/ice_type.h     |    1 +
 .../ethernet/intel/ice/ice_vf_vsi_vlan_ops.c  |  195 ++-
 .../ethernet/intel/ice/ice_vf_vsi_vlan_ops.h  |    3 +
 .../net/ethernet/intel/ice/ice_vsi_vlan_lib.c |   84 +-
 .../net/ethernet/intel/ice/ice_vsi_vlan_lib.h |    8 +
 .../net/ethernet/intel/ice/ice_vsi_vlan_ops.h |    1 +
 23 files changed, 1876 insertions(+), 192 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_eswitch_br.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_eswitch_br.h

-- 
2.39.2



* [PATCH net-next 01/12] ice: Minor switchdev fixes
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-19 14:35   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV Wojciech Drewek
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

Introduce a few fixes that are needed for bridge offload
to work properly.

- Skip adv rule removal in ice_eswitch_disable_switchdev.
  Advanced rules for ctrl VSI will be removed anyway when the
  VSI is cleaned up, so there is no need to do it explicitly.

- Don't allow changing promisc mode in switchdev mode.
  When switchdev is configured, PF netdev is set to be a
  default VSI. This is needed for the slow-path to work correctly.
  All the unmatched packets will be directed to PF netdev.

  It is possible that this setting might be overwritten by
  ndo_set_rx_mode. Prevent this by checking if switchdev is
  enabled in ice_vsi_sync_fltr.

- Disable vlan pruning for uplink VSI. In switchdev mode, uplink VSI
  is configured to be default VSI which means it will receive all
  unmatched packets. In order to receive vlan packets we need to
  disable vlan pruning as well. This is done by dis_rx_filtering
  vlan op.

- It is possible that ice_eswitch_port_start_xmit might be
  called while some resources are still not allocated, which might
  cause a NULL pointer dereference. Fix this by checking if switchdev
  configuration was finished.
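
The last item amounts to an early-exit guard at the top of the transmit
path. A minimal userspace sketch of that check, using hypothetical
stand-in types (model_pf, model_vsi) rather than the driver's own
structures:

```c
#include <stdbool.h>
#include <stddef.h>

enum model_tx_result { MODEL_TX_OK, MODEL_TX_BUSY };

/* Hypothetical stand-ins for the PF and VSI structures. */
struct model_pf {
	bool switchdev_running;
};

struct model_vsi {
	struct model_pf *back;
};

static enum model_tx_result model_port_start_xmit(const struct model_vsi *vsi)
{
	/* Bail out before touching anything that setup may not have
	 * allocated yet.
	 */
	if (!vsi || !vsi->back->switchdev_running)
		return MODEL_TX_BUSY;

	/* ... the real transmit work would follow here ... */
	return MODEL_TX_OK;
}
```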

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_eswitch.c | 14 +++++++++++++-
 drivers/net/ethernet/intel/ice/ice_main.c    |  2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
index ad0a007b7398..bfd003135fc8 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
@@ -103,6 +103,10 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
 		rule_added = true;
 	}
 
+	vlan_ops = ice_get_compat_vsi_vlan_ops(uplink_vsi);
+	if (vlan_ops->dis_rx_filtering(uplink_vsi))
+		goto err_dis_rx;
+
 	if (ice_vsi_update_security(uplink_vsi, ice_vsi_ctx_set_allow_override))
 		goto err_override_uplink;
 
@@ -114,6 +118,8 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
 err_override_control:
 	ice_vsi_update_security(uplink_vsi, ice_vsi_ctx_clear_allow_override);
 err_override_uplink:
+	vlan_ops->ena_rx_filtering(uplink_vsi);
+err_dis_rx:
 	if (rule_added)
 		ice_clear_dflt_vsi(uplink_vsi);
 err_def_rx:
@@ -331,6 +337,9 @@ ice_eswitch_port_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	np = netdev_priv(netdev);
 	vsi = np->vsi;
 
+	if (!vsi || !ice_is_switchdev_running(vsi->back))
+		return NETDEV_TX_BUSY;
+
 	if (ice_is_reset_in_progress(vsi->back->state) ||
 	    test_bit(ICE_VF_DIS, vsi->back->state))
 		return NETDEV_TX_BUSY;
@@ -378,9 +387,13 @@ static void ice_eswitch_release_env(struct ice_pf *pf)
 {
 	struct ice_vsi *uplink_vsi = pf->switchdev.uplink_vsi;
 	struct ice_vsi *ctrl_vsi = pf->switchdev.control_vsi;
+	struct ice_vsi_vlan_ops *vlan_ops;
+
+	vlan_ops = ice_get_compat_vsi_vlan_ops(uplink_vsi);
 
 	ice_vsi_update_security(ctrl_vsi, ice_vsi_ctx_clear_allow_override);
 	ice_vsi_update_security(uplink_vsi, ice_vsi_ctx_clear_allow_override);
+	vlan_ops->ena_rx_filtering(uplink_vsi);
 	ice_clear_dflt_vsi(uplink_vsi);
 	ice_fltr_add_mac_and_broadcast(uplink_vsi,
 				       uplink_vsi->port_info->mac.perm_addr,
@@ -503,7 +516,6 @@ static void ice_eswitch_disable_switchdev(struct ice_pf *pf)
 
 	ice_eswitch_napi_disable(pf);
 	ice_eswitch_release_env(pf);
-	ice_rem_adv_rule_for_vsi(&pf->hw, ctrl_vsi->idx);
 	ice_eswitch_release_reprs(pf, ctrl_vsi);
 	ice_vsi_release(ctrl_vsi);
 	ice_repr_rem_from_all_vfs(pf);
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 7c04057c524c..f198c845631f 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -385,7 +385,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
 	}
 	err = 0;
 	/* check for changes in promiscuous modes */
-	if (changed_flags & IFF_ALLMULTI) {
+	if (changed_flags & IFF_ALLMULTI && !ice_is_switchdev_running(pf)) {
 		if (vsi->current_netdev_flags & IFF_ALLMULTI) {
 			err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS);
 			if (err) {
-- 
2.39.2



* [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
  2023-04-17  9:34 ` [PATCH net-next 01/12] ice: Minor switchdev fixes Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-19 14:38   ` Alexander Lobakin
  2023-04-25 15:26   ` [Intel-wired-lan] " Michal Schmidt
  2023-04-17  9:34 ` [PATCH net-next 03/12] ice: Unset src prune on uplink VSI Wojciech Drewek
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Dave Ertman <david.m.ertman@intel.com>

There was a change previously to stop SR-IOV and LAG from existing on the
same interface.  This was to prevent the violation of LACP (Link
Aggregation Control Protocol).  The method to achieve this was to add a
no-op Rx handler onto the netdev when SR-IOV VFs were present, thus
blocking bonding, bridging, etc from claiming the interface by adding
its own Rx handler.  Also, when an interface was added into an aggregate,
then the SR-IOV capability was set to false.
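
The blocking mechanism relies on a netdev having room for only one Rx
handler. The following userspace sketch models that single slot; the
model_* names are hypothetical, and only the "second registration
fails with -EBUSY" behaviour is meant to mirror the kernel.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*model_rx_handler_fn)(void *pkt);

/* Hypothetical stand-in for a netdev: it holds a single Rx handler. */
struct model_netdev {
	model_rx_handler_fn rx_handler;
};

/* Returns 0 on success, -EBUSY if a handler is already installed. */
static int model_rx_handler_register(struct model_netdev *dev,
				     model_rx_handler_fn fn)
{
	if (dev->rx_handler)
		return -EBUSY;
	dev->rx_handler = fn;
	return 0;
}

/* No-op handler: does nothing, it only occupies the slot. */
static int model_nop_handler(void *pkt)
{
	(void)pkt;
	return 0;
}

/* Placeholder for the handler a bond or bridge would try to install. */
static int model_bond_handler(void *pkt)
{
	(void)pkt;
	return 0;
}
```

Once model_nop_handler occupies the slot, a later attempt to register
model_bond_handler is rejected, which is how the removed code kept
bonding and bridging off the interface.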

There are some customers that have in-house solutions using both SR-IOV and
bridging/bonding that this method interferes with (e.g. creating duplicate
VFs on the bonded interfaces and failing between them when the interface
fails over).

It has been decided that to provide the most functionality
possible, the restriction on co-existence of these features will be
removed.  No additional functionality is currently being provided beyond
what existed before the co-existence restriction was put into place.  It is
up to the end user to not implement a solution that would interfere with
existing network protocols.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h       | 19 --------
 drivers/net/ethernet/intel/ice/ice_lag.c   | 12 ------
 drivers/net/ethernet/intel/ice/ice_lag.h   | 50 ----------------------
 drivers/net/ethernet/intel/ice/ice_lib.c   |  2 -
 drivers/net/ethernet/intel/ice/ice_sriov.c |  4 --
 5 files changed, 87 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index d637032c8139..ac2971073fdd 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -813,25 +813,6 @@ static inline bool ice_is_switchdev_running(struct ice_pf *pf)
 	return pf->switchdev.is_running;
 }
 
-/**
- * ice_set_sriov_cap - enable SRIOV in PF flags
- * @pf: PF struct
- */
-static inline void ice_set_sriov_cap(struct ice_pf *pf)
-{
-	if (pf->hw.func_caps.common_cap.sr_iov_1_1)
-		set_bit(ICE_FLAG_SRIOV_CAPABLE, pf->flags);
-}
-
-/**
- * ice_clear_sriov_cap - disable SRIOV in PF flags
- * @pf: PF struct
- */
-static inline void ice_clear_sriov_cap(struct ice_pf *pf)
-{
-	clear_bit(ICE_FLAG_SRIOV_CAPABLE, pf->flags);
-}
-
 #define ICE_FD_STAT_CTR_BLOCK_COUNT	256
 #define ICE_FD_STAT_PF_IDX(base_idx) \
 			((base_idx) * ICE_FD_STAT_CTR_BLOCK_COUNT)
diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
index ee5b36941ba3..5a7753bda324 100644
--- a/drivers/net/ethernet/intel/ice/ice_lag.c
+++ b/drivers/net/ethernet/intel/ice/ice_lag.c
@@ -6,15 +6,6 @@
 #include "ice.h"
 #include "ice_lag.h"
 
-/**
- * ice_lag_nop_handler - no-op Rx handler to disable LAG
- * @pskb: pointer to skb pointer
- */
-rx_handler_result_t ice_lag_nop_handler(struct sk_buff __always_unused **pskb)
-{
-	return RX_HANDLER_PASS;
-}
-
 /**
  * ice_lag_set_primary - set PF LAG state as Primary
  * @lag: LAG info struct
@@ -158,7 +149,6 @@ ice_lag_link(struct ice_lag *lag, struct netdev_notifier_changeupper_info *info)
 		lag->upper_netdev = upper;
 	}
 
-	ice_clear_sriov_cap(pf);
 	ice_clear_rdma_cap(pf);
 
 	lag->bonded = true;
@@ -205,7 +195,6 @@ ice_lag_unlink(struct ice_lag *lag,
 	}
 
 	lag->peer_netdev = NULL;
-	ice_set_sriov_cap(pf);
 	ice_set_rdma_cap(pf);
 	lag->bonded = false;
 	lag->role = ICE_LAG_NONE;
@@ -229,7 +218,6 @@ static void ice_lag_unregister(struct ice_lag *lag, struct net_device *netdev)
 	if (lag->upper_netdev) {
 		dev_put(lag->upper_netdev);
 		lag->upper_netdev = NULL;
-		ice_set_sriov_cap(pf);
 		ice_set_rdma_cap(pf);
 	}
 	/* perform some cleanup in case we come back */
diff --git a/drivers/net/ethernet/intel/ice/ice_lag.h b/drivers/net/ethernet/intel/ice/ice_lag.h
index 51b5cf467ce2..0bd6b96d7e01 100644
--- a/drivers/net/ethernet/intel/ice/ice_lag.h
+++ b/drivers/net/ethernet/intel/ice/ice_lag.h
@@ -29,59 +29,9 @@ struct ice_lag {
 	/* each thing blocking bonding will increment this value by one.
 	 * If this value is zero, then bonding is allowed.
 	 */
-	u16 dis_lag;
 	u8 role;
 };
 
 int ice_init_lag(struct ice_pf *pf);
 void ice_deinit_lag(struct ice_pf *pf);
-rx_handler_result_t ice_lag_nop_handler(struct sk_buff **pskb);
-
-/**
- * ice_disable_lag - increment LAG disable count
- * @lag: LAG struct
- */
-static inline void ice_disable_lag(struct ice_lag *lag)
-{
-	/* If LAG this PF is not already disabled, disable it */
-	rtnl_lock();
-	if (!netdev_is_rx_handler_busy(lag->netdev)) {
-		if (!netdev_rx_handler_register(lag->netdev,
-						ice_lag_nop_handler,
-						NULL))
-			lag->handler = true;
-	}
-	rtnl_unlock();
-	lag->dis_lag++;
-}
-
-/**
- * ice_enable_lag - decrement disable count for a PF
- * @lag: LAG struct
- *
- * Decrement the disable counter for a port, and if that count reaches
- * zero, then remove the no-op Rx handler from that netdev
- */
-static inline void ice_enable_lag(struct ice_lag *lag)
-{
-	if (lag->dis_lag)
-		lag->dis_lag--;
-	if (!lag->dis_lag && lag->handler) {
-		rtnl_lock();
-		netdev_rx_handler_unregister(lag->netdev);
-		rtnl_unlock();
-		lag->handler = false;
-	}
-}
-
-/**
- * ice_is_lag_dis - is LAG disabled
- * @lag: LAG struct
- *
- * Return true if bonding is disabled
- */
-static inline bool ice_is_lag_dis(struct ice_lag *lag)
-{
-	return !!(lag->dis_lag);
-}
 #endif /* _ICE_LAG_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 387bb9cbafbe..3de9556b89ac 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2707,8 +2707,6 @@ ice_vsi_setup(struct ice_pf *pf, struct ice_vsi_cfg_params *params)
 	return vsi;
 
 err_vsi_cfg:
-	if (params->type == ICE_VSI_VF)
-		ice_enable_lag(pf->lag);
 	ice_vsi_free(vsi);
 
 	return NULL;
diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
index 80c643fb9f2f..a7e7debb1428 100644
--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
+++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
@@ -979,8 +979,6 @@ int ice_sriov_configure(struct pci_dev *pdev, int num_vfs)
 	if (!num_vfs) {
 		if (!pci_vfs_assigned(pdev)) {
 			ice_free_vfs(pf);
-			if (pf->lag)
-				ice_enable_lag(pf->lag);
 			return 0;
 		}
 
@@ -992,8 +990,6 @@ int ice_sriov_configure(struct pci_dev *pdev, int num_vfs)
 	if (err)
 		return err;
 
-	if (pf->lag)
-		ice_disable_lag(pf->lag);
 	return num_vfs;
 }
 
-- 
2.39.2



* [PATCH net-next 03/12] ice: Unset src prune on uplink VSI
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
  2023-04-17  9:34 ` [PATCH net-next 01/12] ice: Minor switchdev fixes Wojciech Drewek
  2023-04-17  9:34 ` [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-19 14:49   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup Wojciech Drewek
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

In switchdev mode uplink VSI is supposed to receive all packets that
were not matched by existing filters. If ICE_AQ_VSI_SW_FLAG_LOCAL_LB
bit is unset and we have a filter associated with uplink VSI
which matches on dst mac equal to MAC1, then packets with src mac equal
to MAC1 will be pruned from reaching uplink VSI.

Fix this by updating uplink VSI with ICE_AQ_VSI_SW_FLAG_LOCAL_LB bit
set when configuring switchdev mode.
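
The pruning condition described above can be written down as a small
predicate. This is a userspace sketch, not hardware or driver logic:
model_vsi and model_src_pruned are hypothetical, and the VSI is assumed
to own a single filter for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a VSI with one destination-MAC filter. */
struct model_vsi {
	bool local_lb;             /* models ICE_AQ_VSI_SW_FLAG_LOCAL_LB */
	uint8_t filter_dst_mac[6]; /* dst MAC matched by the VSI's filter */
};

/* True when a packet with src_mac is pruned before reaching the VSI. */
static bool model_src_pruned(const struct model_vsi *vsi,
			     const uint8_t src_mac[6])
{
	if (vsi->local_lb)
		return false; /* bit set: no source pruning */

	return memcmp(vsi->filter_dst_mac, src_mac, 6) == 0;
}
```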

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_eswitch.c |  6 +++++
 drivers/net/ethernet/intel/ice/ice_lib.c     | 24 ++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_lib.h     |  1 +
 3 files changed, 31 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
index bfd003135fc8..4fe235da1182 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
@@ -113,8 +113,13 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
 	if (ice_vsi_update_security(ctrl_vsi, ice_vsi_ctx_set_allow_override))
 		goto err_override_control;
 
+	if (ice_vsi_update_local_lb(uplink_vsi, true))
+		goto err_override_local_lb;
+
 	return 0;
 
+err_override_local_lb:
+	ice_vsi_update_security(ctrl_vsi, ice_vsi_ctx_clear_allow_override);
 err_override_control:
 	ice_vsi_update_security(uplink_vsi, ice_vsi_ctx_clear_allow_override);
 err_override_uplink:
@@ -391,6 +396,7 @@ static void ice_eswitch_release_env(struct ice_pf *pf)
 
 	vlan_ops = ice_get_compat_vsi_vlan_ops(uplink_vsi);
 
+	ice_vsi_update_local_lb(uplink_vsi, false);
 	ice_vsi_update_security(ctrl_vsi, ice_vsi_ctx_clear_allow_override);
 	ice_vsi_update_security(uplink_vsi, ice_vsi_ctx_clear_allow_override);
 	vlan_ops->ena_rx_filtering(uplink_vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 3de9556b89ac..60b123d3c9cf 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -4112,3 +4112,27 @@ void ice_vsi_ctx_clear_allow_override(struct ice_vsi_ctx *ctx)
 {
 	ctx->info.sec_flags &= ~ICE_AQ_VSI_SEC_FLAG_ALLOW_DEST_OVRD;
 }
+
+/**
+ * ice_vsi_update_local_lb - update sw block in VSI with local loopback bit
+ * @vsi: pointer to VSI structure
+ * @set: set or unset the bit
+ */
+int
+ice_vsi_update_local_lb(struct ice_vsi *vsi, bool set)
+{
+	struct ice_vsi_ctx ctx = { 0 };
+
+	ctx.info = vsi->info;
+	ctx.info.valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_SW_VALID);
+	if (set)
+		ctx.info.sw_flags |= ICE_AQ_VSI_SW_FLAG_LOCAL_LB;
+	else
+		ctx.info.sw_flags &= ~ICE_AQ_VSI_SW_FLAG_LOCAL_LB;
+
+	if (ice_update_vsi(&vsi->back->hw, vsi->idx, &ctx, NULL))
+		return -ENODEV;
+
+	vsi->info = ctx.info;
+	return 0;
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h
index e985766e6bb5..1628385a9672 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_lib.h
@@ -157,6 +157,7 @@ void ice_vsi_ctx_clear_antispoof(struct ice_vsi_ctx *ctx);
 void ice_vsi_ctx_set_allow_override(struct ice_vsi_ctx *ctx);
 
 void ice_vsi_ctx_clear_allow_override(struct ice_vsi_ctx *ctx);
+int ice_vsi_update_local_lb(struct ice_vsi *vsi, bool set);
 int ice_vsi_add_vlan_zero(struct ice_vsi *vsi);
 int ice_vsi_del_vlan_zero(struct ice_vsi *vsi);
 bool ice_vsi_has_non_zero_vlans(struct ice_vsi *vsi);
-- 
2.39.2



* [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (2 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 03/12] ice: Unset src prune on uplink VSI Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-19 15:23   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 05/12] ice: Switchdev FDB events support Wojciech Drewek
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

With this patch, the ice driver is able to track whether the port
representors or the uplink port were added to the Linux bridge in
switchdev mode. Listen for NETDEV_CHANGEUPPER events in order to
detect this. The ice_esw_br data structure reflects the Linux bridge
and stores all the ports of the bridge (ice_esw_br_port) in an
xarray. It is created when the first port is added to the bridge and
freed once the last port is removed. Note that only one bridge is
supported per eswitch.

Bridge port (ice_esw_br_port) can be either a VF port representor
port or uplink port (ice_esw_br_port_type). In both cases bridge port
holds a reference to the VSI, VF's VSI in case of the PR and uplink
VSI in case of the uplink. VSI's index is used as an index to the
xarray in which ports are stored.
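
The bridge lifecycle described above (created on first port, freed on
last port, one bridge per eswitch, ports indexed by VSI index) can be
sketched in userspace C. A fixed-size array stands in for the xarray,
and every model_* name is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

#define MODEL_MAX_VSI 16

struct model_bridge {
	int ifindex;                /* ifindex of the Linux bridge */
	int nports;
	void *ports[MODEL_MAX_VSI]; /* indexed by VSI index */
};

struct model_offloads {
	struct model_bridge *bridge; /* at most one bridge per eswitch */
};

static int model_port_link(struct model_offloads *o, int br_ifindex,
			   int vsi_idx, void *port)
{
	struct model_bridge *br = o->bridge;

	if (vsi_idx < 0 || vsi_idx >= MODEL_MAX_VSI || !port)
		return -EINVAL;
	if (br && br->ifindex != br_ifindex)
		return -EOPNOTSUPP; /* a second bridge is not supported */

	if (!br) {
		/* First port: create the bridge object. */
		br = calloc(1, sizeof(*br));
		if (!br)
			return -ENOMEM;
		br->ifindex = br_ifindex;
		o->bridge = br;
	}
	if (br->ports[vsi_idx])
		return -EBUSY;

	br->ports[vsi_idx] = port;
	br->nports++;
	return 0;
}

static int model_port_unlink(struct model_offloads *o, int vsi_idx)
{
	struct model_bridge *br = o->bridge;

	if (!br || vsi_idx < 0 || vsi_idx >= MODEL_MAX_VSI ||
	    !br->ports[vsi_idx])
		return -EINVAL;

	br->ports[vsi_idx] = NULL;
	if (--br->nports == 0) {
		/* Last port removed: free the bridge object. */
		free(br);
		o->bridge = NULL;
	}
	return 0;
}
```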

Add a check which prevents configuring switchdev mode if the uplink is
already added to any bridge. This is needed because we rely on
NETDEV_CHANGEUPPER events to record that the uplink was added to
a bridge, and the netdevice notifier is registered only after the
eswitch mode is changed to switchdev, so an earlier enslavement would
go unnoticed.

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
---
 drivers/net/ethernet/intel/ice/Makefile       |   2 +-
 drivers/net/ethernet/intel/ice/ice.h          |   4 +-
 drivers/net/ethernet/intel/ice/ice_eswitch.c  |  23 +-
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 379 ++++++++++++++++++
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |  42 ++
 drivers/net/ethernet/intel/ice/ice_main.c     |   2 +-
 drivers/net/ethernet/intel/ice/ice_repr.c     |   2 +-
 drivers/net/ethernet/intel/ice/ice_repr.h     |   3 +-
 8 files changed, 448 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_eswitch_br.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_eswitch_br.h

diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index 817977e3039d..960277d78e09 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -47,5 +47,5 @@ ice-$(CONFIG_PTP_1588_CLOCK) += ice_ptp.o ice_ptp_hw.o
 ice-$(CONFIG_DCB) += ice_dcb.o ice_dcb_nl.o ice_dcb_lib.o
 ice-$(CONFIG_RFS_ACCEL) += ice_arfs.o
 ice-$(CONFIG_XDP_SOCKETS) += ice_xsk.o
-ice-$(CONFIG_ICE_SWITCHDEV) += ice_eswitch.o
+ice-$(CONFIG_ICE_SWITCHDEV) += ice_eswitch.o ice_eswitch_br.o
 ice-$(CONFIG_GNSS) += ice_gnss.o
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index ac2971073fdd..5b2ade5908e8 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -511,6 +511,7 @@ struct ice_switchdev_info {
 	struct ice_vsi *control_vsi;
 	struct ice_vsi *uplink_vsi;
 	bool is_running;
+	struct ice_esw_br_offloads *br_offloads;
 };
 
 struct ice_agg_node {
@@ -618,6 +619,7 @@ struct ice_pf {
 	struct ice_lag *lag; /* Link Aggregation information */
 
 	struct ice_switchdev_info switchdev;
+	struct ice_esw_br_port *br_port;
 
 #define ICE_INVALID_AGG_NODE_ID		0
 #define ICE_PF_AGG_NODE_ID_START	1
@@ -845,7 +847,7 @@ static inline bool ice_is_adq_active(struct ice_pf *pf)
 	return false;
 }
 
-bool netif_is_ice(struct net_device *dev);
+bool netif_is_ice(const struct net_device *dev);
 int ice_vsi_setup_tx_rings(struct ice_vsi *vsi);
 int ice_vsi_setup_rx_rings(struct ice_vsi *vsi);
 int ice_vsi_open_ctrl(struct ice_vsi *vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
index 4fe235da1182..c2e3289a0bb4 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
@@ -4,6 +4,7 @@
 #include "ice.h"
 #include "ice_lib.h"
 #include "ice_eswitch.h"
+#include "ice_eswitch_br.h"
 #include "ice_fltr.h"
 #include "ice_repr.h"
 #include "ice_devlink.h"
@@ -474,16 +475,24 @@ static void ice_eswitch_napi_disable(struct ice_pf *pf)
  */
 static int ice_eswitch_enable_switchdev(struct ice_pf *pf)
 {
-	struct ice_vsi *ctrl_vsi;
+	struct ice_vsi *ctrl_vsi, *uplink_vsi;
+
+	uplink_vsi = ice_get_main_vsi(pf);
+	if (!uplink_vsi)
+		return -ENODEV;
+
+	if (netif_is_any_bridge_port(uplink_vsi->netdev)) {
+		dev_err(ice_pf_to_dev(pf),
+			"Uplink port cannot be a bridge port\n");
+		return -EINVAL;
+	}
 
 	pf->switchdev.control_vsi = ice_eswitch_vsi_setup(pf, pf->hw.port_info);
 	if (!pf->switchdev.control_vsi)
 		return -ENODEV;
 
 	ctrl_vsi = pf->switchdev.control_vsi;
-	pf->switchdev.uplink_vsi = ice_get_main_vsi(pf);
-	if (!pf->switchdev.uplink_vsi)
-		goto err_vsi;
+	pf->switchdev.uplink_vsi = uplink_vsi;
 
 	if (ice_eswitch_setup_env(pf))
 		goto err_vsi;
@@ -499,10 +508,15 @@ static int ice_eswitch_enable_switchdev(struct ice_pf *pf)
 	if (ice_vsi_open(ctrl_vsi))
 		goto err_setup_reprs;
 
+	if (ice_eswitch_br_offloads_init(pf))
+		goto err_br_offloads;
+
 	ice_eswitch_napi_enable(pf);
 
 	return 0;
 
+err_br_offloads:
+	ice_vsi_close(ctrl_vsi);
 err_setup_reprs:
 	ice_repr_rem_from_all_vfs(pf);
 err_repr_add:
@@ -521,6 +535,7 @@ static void ice_eswitch_disable_switchdev(struct ice_pf *pf)
 	struct ice_vsi *ctrl_vsi = pf->switchdev.control_vsi;
 
 	ice_eswitch_napi_disable(pf);
+	ice_eswitch_br_offloads_deinit(pf);
 	ice_eswitch_release_env(pf);
 	ice_eswitch_release_reprs(pf, ctrl_vsi);
 	ice_vsi_release(ctrl_vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
new file mode 100644
index 000000000000..02406f870c83
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -0,0 +1,379 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2023, Intel Corporation. */
+
+#include "ice.h"
+#include "ice_eswitch_br.h"
+#include "ice_repr.h"
+
+static bool ice_eswitch_br_is_dev_valid(const struct net_device *dev)
+{
+	/* Accept only PF netdev and PRs */
+	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev);
+}
+
+static struct ice_esw_br_port *
+ice_eswitch_br_netdev_to_port(struct net_device *dev)
+{
+	if (ice_is_port_repr_netdev(dev)) {
+		struct ice_repr *repr = ice_netdev_to_repr(dev);
+
+		return repr->br_port;
+	} else if (netif_is_ice(dev)) {
+		struct ice_pf *pf = ice_netdev_to_pf(dev);
+
+		return pf->br_port;
+	}
+
+	return NULL;
+}
+
+static void
+ice_eswitch_br_port_deinit(struct ice_esw_br *bridge,
+			   struct ice_esw_br_port *br_port)
+{
+	struct ice_vsi *vsi = br_port->vsi;
+
+	if (br_port->type == ICE_ESWITCH_BR_UPLINK_PORT && vsi->back)
+		vsi->back->br_port = NULL;
+	else if (vsi->vf)
+		vsi->vf->repr->br_port = NULL;
+
+	xa_erase(&bridge->ports, br_port->vsi_idx);
+	kfree(br_port);
+}
+
+static struct ice_esw_br_port *
+ice_eswitch_br_port_init(struct ice_esw_br *bridge)
+{
+	struct ice_esw_br_port *br_port;
+
+	br_port = kzalloc(sizeof(*br_port), GFP_KERNEL);
+	if (!br_port)
+		return ERR_PTR(-ENOMEM);
+
+	br_port->bridge = bridge;
+
+	return br_port;
+}
+
+static int
+ice_eswitch_br_vf_repr_port_init(struct ice_esw_br *bridge,
+				 struct ice_repr *repr)
+{
+	struct ice_esw_br_port *br_port;
+	int err;
+
+	br_port = ice_eswitch_br_port_init(bridge);
+	if (IS_ERR(br_port))
+		return PTR_ERR(br_port);
+
+	br_port->vsi = repr->src_vsi;
+	br_port->vsi_idx = br_port->vsi->idx;
+	br_port->type = ICE_ESWITCH_BR_VF_REPR_PORT;
+	repr->br_port = br_port;
+
+	err = xa_insert(&bridge->ports, br_port->vsi_idx, br_port, GFP_KERNEL);
+	if (err) {
+		ice_eswitch_br_port_deinit(bridge, br_port);
+		return err;
+	}
+
+	return 0;
+}
+
+static int
+ice_eswitch_br_uplink_port_init(struct ice_esw_br *bridge, struct ice_pf *pf)
+{
+	struct ice_vsi *vsi = pf->switchdev.uplink_vsi;
+	struct ice_esw_br_port *br_port;
+	int err;
+
+	br_port = ice_eswitch_br_port_init(bridge);
+	if (IS_ERR(br_port))
+		return PTR_ERR(br_port);
+
+	br_port->vsi = vsi;
+	br_port->vsi_idx = br_port->vsi->idx;
+	br_port->type = ICE_ESWITCH_BR_UPLINK_PORT;
+	pf->br_port = br_port;
+
+	err = xa_insert(&bridge->ports, br_port->vsi_idx, br_port, GFP_KERNEL);
+	if (err) {
+		ice_eswitch_br_port_deinit(bridge, br_port);
+		return err;
+	}
+
+	return 0;
+}
+
+static void
+ice_eswitch_br_ports_flush(struct ice_esw_br *bridge)
+{
+	struct ice_esw_br_port *port;
+	unsigned long i;
+
+	xa_for_each(&bridge->ports, i, port)
+		ice_eswitch_br_port_deinit(bridge, port);
+}
+
+static void
+ice_eswitch_br_deinit(struct ice_esw_br_offloads *br_offloads,
+		      struct ice_esw_br *bridge)
+{
+	if (!bridge)
+		return;
+
+	/* Cleanup all the ports that were added asynchronously
+	 * through NETDEV_CHANGEUPPER event.
+	 */
+	ice_eswitch_br_ports_flush(bridge);
+	WARN_ON(!xa_empty(&bridge->ports));
+	xa_destroy(&bridge->ports);
+	br_offloads->bridge = NULL;
+	kfree(bridge);
+}
+
+static struct ice_esw_br *
+ice_eswitch_br_init(struct ice_esw_br_offloads *br_offloads, int ifindex)
+{
+	struct ice_esw_br *bridge;
+
+	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
+	if (!bridge)
+		return ERR_PTR(-ENOMEM);
+
+	bridge->br_offloads = br_offloads;
+	bridge->ifindex = ifindex;
+	xa_init(&bridge->ports);
+	br_offloads->bridge = bridge;
+
+	return bridge;
+}
+
+static struct ice_esw_br *
+ice_eswitch_br_get(struct ice_esw_br_offloads *br_offloads, int ifindex,
+		   struct netlink_ext_ack *extack)
+{
+	struct ice_esw_br *bridge = br_offloads->bridge;
+
+	if (bridge) {
+		if (bridge->ifindex != ifindex) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "Only one bridge is supported per eswitch");
+			return ERR_PTR(-EOPNOTSUPP);
+		}
+		return bridge;
+	}
+
+	/* Create the bridge if it doesn't exist yet */
+	bridge = ice_eswitch_br_init(br_offloads, ifindex);
+	if (IS_ERR(bridge))
+		NL_SET_ERR_MSG_MOD(extack, "Failed to init the bridge");
+
+	return bridge;
+}
+
+static void
+ice_eswitch_br_verify_deinit(struct ice_esw_br_offloads *br_offloads,
+			     struct ice_esw_br *bridge)
+{
+	/* Remove the bridge if it exists and there are no ports left */
+	if (!bridge || !xa_empty(&bridge->ports))
+		return;
+
+	ice_eswitch_br_deinit(br_offloads, bridge);
+}
+
+static int
+ice_eswitch_br_port_unlink(struct ice_esw_br_offloads *br_offloads,
+			   struct net_device *dev, int ifindex,
+			   struct netlink_ext_ack *extack)
+{
+	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(dev);
+
+	if (!br_port) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Port representor is not attached to any bridge");
+		return -EINVAL;
+	}
+
+	if (br_port->bridge->ifindex != ifindex) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Port representor is attached to another bridge");
+		return -EINVAL;
+	}
+
+	ice_eswitch_br_port_deinit(br_port->bridge, br_port);
+	ice_eswitch_br_verify_deinit(br_offloads, br_port->bridge);
+
+	return 0;
+}
+
+static int
+ice_eswitch_br_port_link(struct ice_esw_br_offloads *br_offloads,
+			 struct net_device *dev, int ifindex,
+			 struct netlink_ext_ack *extack)
+{
+	struct ice_esw_br *bridge;
+	int err;
+
+	if (ice_eswitch_br_netdev_to_port(dev)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Port is already attached to the bridge");
+		return -EINVAL;
+	}
+
+	bridge = ice_eswitch_br_get(br_offloads, ifindex, extack);
+	if (IS_ERR(bridge))
+		return PTR_ERR(bridge);
+
+	if (ice_is_port_repr_netdev(dev)) {
+		struct ice_repr *repr = ice_netdev_to_repr(dev);
+
+		err = ice_eswitch_br_vf_repr_port_init(bridge, repr);
+	} else {
+		struct ice_pf *pf = ice_netdev_to_pf(dev);
+
+		err = ice_eswitch_br_uplink_port_init(bridge, pf);
+	}
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to init bridge port");
+		goto err_port_init;
+	}
+
+	return 0;
+
+err_port_init:
+	ice_eswitch_br_verify_deinit(br_offloads, bridge);
+	return err;
+}
+
+static int
+ice_eswitch_br_port_changeupper(struct notifier_block *nb, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct netdev_notifier_changeupper_info *info = ptr;
+	struct ice_esw_br_offloads *br_offloads =
+		ice_nb_to_br_offloads(nb, netdev_nb);
+	struct netlink_ext_ack *extack;
+	struct net_device *upper;
+
+	if (!ice_eswitch_br_is_dev_valid(dev))
+		return 0;
+
+	upper = info->upper_dev;
+	if (!netif_is_bridge_master(upper))
+		return 0;
+
+	extack = netdev_notifier_info_to_extack(&info->info);
+
+	return info->linking ?
+		ice_eswitch_br_port_link(br_offloads, dev, upper->ifindex,
+					 extack) :
+		ice_eswitch_br_port_unlink(br_offloads, dev, upper->ifindex,
+					   extack);
+}
+
+static int
+ice_eswitch_br_port_event(struct notifier_block *nb,
+			  unsigned long event, void *ptr)
+{
+	int err = 0;
+
+	switch (event) {
+	case NETDEV_CHANGEUPPER:
+		err = ice_eswitch_br_port_changeupper(nb, ptr);
+		break;
+	}
+
+	return notifier_from_errno(err);
+}
+
+static void
+ice_eswitch_br_offloads_dealloc(struct ice_pf *pf)
+{
+	struct ice_esw_br_offloads *br_offloads = pf->switchdev.br_offloads;
+
+	ASSERT_RTNL();
+
+	if (!br_offloads)
+		return;
+
+	ice_eswitch_br_deinit(br_offloads, br_offloads->bridge);
+
+	pf->switchdev.br_offloads = NULL;
+	kfree(br_offloads);
+}
+
+static struct ice_esw_br_offloads *
+ice_eswitch_br_offloads_alloc(struct ice_pf *pf)
+{
+	struct ice_esw_br_offloads *br_offloads;
+
+	ASSERT_RTNL();
+
+	if (pf->switchdev.br_offloads)
+		return ERR_PTR(-EEXIST);
+
+	br_offloads = kzalloc(sizeof(*br_offloads), GFP_KERNEL);
+	if (!br_offloads)
+		return ERR_PTR(-ENOMEM);
+
+	pf->switchdev.br_offloads = br_offloads;
+	br_offloads->pf = pf;
+
+	return br_offloads;
+}
+
+void
+ice_eswitch_br_offloads_deinit(struct ice_pf *pf)
+{
+	struct ice_esw_br_offloads *br_offloads;
+
+	br_offloads = pf->switchdev.br_offloads;
+	if (!br_offloads)
+		return;
+
+	unregister_netdevice_notifier(&br_offloads->netdev_nb);
+	/* Although notifier block is unregistered just before,
+	 * so we don't get any new events, some events might be
+	 * already in progress. Hold the rtnl lock and wait for
+	 * them to finish.
+	 */
+	rtnl_lock();
+	ice_eswitch_br_offloads_dealloc(pf);
+	rtnl_unlock();
+}
+
+int
+ice_eswitch_br_offloads_init(struct ice_pf *pf)
+{
+	struct ice_esw_br_offloads *br_offloads;
+	struct device *dev = ice_pf_to_dev(pf);
+	int err;
+
+	rtnl_lock();
+	br_offloads = ice_eswitch_br_offloads_alloc(pf);
+	rtnl_unlock();
+	if (IS_ERR(br_offloads)) {
+		dev_err(dev, "Failed to init eswitch bridge\n");
+		return PTR_ERR(br_offloads);
+	}
+
+	br_offloads->netdev_nb.notifier_call = ice_eswitch_br_port_event;
+	err = register_netdevice_notifier(&br_offloads->netdev_nb);
+	if (err) {
+		dev_err(dev,
+			"Failed to register bridge port event notifier\n");
+		goto err_reg_netdev_nb;
+	}
+
+	return 0;
+
+err_reg_netdev_nb:
+	rtnl_lock();
+	ice_eswitch_br_offloads_dealloc(pf);
+	rtnl_unlock();
+
+	return err;
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
new file mode 100644
index 000000000000..53ea29569c36
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2023, Intel Corporation. */
+
+#ifndef _ICE_ESWITCH_BR_H_
+#define _ICE_ESWITCH_BR_H_
+
+enum ice_esw_br_port_type {
+	ICE_ESWITCH_BR_UPLINK_PORT = 0,
+	ICE_ESWITCH_BR_VF_REPR_PORT = 1,
+};
+
+struct ice_esw_br_port {
+	struct ice_esw_br *bridge;
+	enum ice_esw_br_port_type type;
+	struct ice_vsi *vsi;
+	u16 vsi_idx;
+};
+
+struct ice_esw_br {
+	struct ice_esw_br_offloads *br_offloads;
+	int ifindex;
+
+	struct xarray ports;
+};
+
+struct ice_esw_br_offloads {
+	struct ice_pf *pf;
+	struct ice_esw_br *bridge;
+	struct notifier_block netdev_nb;
+};
+
+#define ice_nb_to_br_offloads(nb, nb_name) \
+	container_of(nb, \
+		     struct ice_esw_br_offloads, \
+		     nb_name)
+
+void
+ice_eswitch_br_offloads_deinit(struct ice_pf *pf);
+int
+ice_eswitch_br_offloads_init(struct ice_pf *pf);
+
+#endif /* _ICE_ESWITCH_BR_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index f198c845631f..f92a136797ae 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -80,7 +80,7 @@ ice_indr_setup_tc_cb(struct net_device *netdev, struct Qdisc *sch,
 		     void *data,
 		     void (*cleanup)(struct flow_block_cb *block_cb));
 
-bool netif_is_ice(struct net_device *dev)
+bool netif_is_ice(const struct net_device *dev)
 {
 	return dev && (dev->netdev_ops == &ice_netdev_ops);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_repr.c b/drivers/net/ethernet/intel/ice/ice_repr.c
index e30e12321abd..c686ac0935eb 100644
--- a/drivers/net/ethernet/intel/ice/ice_repr.c
+++ b/drivers/net/ethernet/intel/ice/ice_repr.c
@@ -254,7 +254,7 @@ static const struct net_device_ops ice_repr_netdev_ops = {
  * ice_is_port_repr_netdev - Check if a given netdevice is a port representor netdev
  * @netdev: pointer to netdev
  */
-bool ice_is_port_repr_netdev(struct net_device *netdev)
+bool ice_is_port_repr_netdev(const struct net_device *netdev)
 {
 	return netdev && (netdev->netdev_ops == &ice_repr_netdev_ops);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_repr.h b/drivers/net/ethernet/intel/ice/ice_repr.h
index 9c2a6f496b3b..e1ee2d2c1d2d 100644
--- a/drivers/net/ethernet/intel/ice/ice_repr.h
+++ b/drivers/net/ethernet/intel/ice/ice_repr.h
@@ -12,6 +12,7 @@ struct ice_repr {
 	struct ice_q_vector *q_vector;
 	struct net_device *netdev;
 	struct metadata_dst *dst;
+	struct ice_esw_br_port *br_port;
 #ifdef CONFIG_ICE_SWITCHDEV
 	/* info about slow path rule */
 	struct ice_rule_query_data sp_rule;
@@ -27,5 +28,5 @@ void ice_repr_stop_tx_queues(struct ice_repr *repr);
 void ice_repr_set_traffic_vsi(struct ice_repr *repr, struct ice_vsi *vsi);
 
 struct ice_repr *ice_netdev_to_repr(struct net_device *netdev);
-bool ice_is_port_repr_netdev(struct net_device *netdev);
+bool ice_is_port_repr_netdev(const struct net_device *netdev);
 #endif
-- 
2.39.2



* [PATCH net-next 05/12] ice: Switchdev FDB events support
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (3 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-19 15:38   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev Wojciech Drewek
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

Listen for SWITCHDEV_FDB_{ADD|DEL}_TO_DEVICE events while in switchdev
mode. Accept these events on both uplink and VF PR ports. Add HW
rules from a newly created workqueue. FDB entries are stored in an
rhashtable for lookup when removing an entry and in a list for cleanup
purposes. The direction of the HW rule depends on the type of the port
on which the FDB event was received:

ICE_ESWITCH_BR_UPLINK_PORT:
TX rule that forwards the packet to the LAN (egress).

ICE_ESWITCH_BR_VF_REPR_PORT:
RX rule that forwards the packet to the VF associated
with the port representor.

In both cases the rule matches on the dst MAC address.
All FDB entries are stored in the bridge structure.
When a port is removed, all FDB entries associated with
it are removed as well. This works because each FDB entry
holds a reference to its port.
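
The bookkeeping described above can be sketched in plain userspace C.
This is a simplified model, not the driver code: every model_* name is
hypothetical, a fixed array stands in for the rhashtable and the list,
and no HW rule is programmed. It only shows how the port type selects
the rule direction and how the per-entry port reference allows a
per-port flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN 6
#define MODEL_MAX_FDB 16

/* Hypothetical stand-ins for the port types and rule directions. */
enum model_port_type { MODEL_UPLINK_PORT, MODEL_VF_REPR_PORT };
enum model_rule_dir { MODEL_RULE_TX, MODEL_RULE_RX };

struct model_port {
	enum model_port_type type;
};

struct model_fdb_entry {
	bool used;
	uint8_t addr[MODEL_ETH_ALEN];
	uint16_t vid;
	const struct model_port *port;	/* back-reference used by the flush */
	enum model_rule_dir dir;
};

struct model_bridge {
	struct model_fdb_entry fdb[MODEL_MAX_FDB];
};

/* Uplink port -> TX rule (to the LAN), VF representor -> RX rule (to the VF). */
static enum model_rule_dir model_port_rule_dir(const struct model_port *port)
{
	return port->type == MODEL_UPLINK_PORT ? MODEL_RULE_TX : MODEL_RULE_RX;
}

/* Look an entry up by its (MAC, VID) key. */
static struct model_fdb_entry *
model_fdb_find(struct model_bridge *br, const uint8_t *addr, uint16_t vid)
{
	for (size_t i = 0; i < MODEL_MAX_FDB; i++) {
		struct model_fdb_entry *e = &br->fdb[i];

		if (e->used && e->vid == vid &&
		    !memcmp(e->addr, addr, MODEL_ETH_ALEN))
			return e;
	}
	return NULL;
}

/* Add an entry, replacing any existing one with the same key. */
static int model_fdb_add(struct model_bridge *br, const struct model_port *port,
			 const uint8_t *addr, uint16_t vid)
{
	struct model_fdb_entry *old = model_fdb_find(br, addr, vid);

	if (old)
		old->used = false;

	for (size_t i = 0; i < MODEL_MAX_FDB; i++) {
		struct model_fdb_entry *e = &br->fdb[i];

		if (e->used)
			continue;
		e->used = true;
		memcpy(e->addr, addr, MODEL_ETH_ALEN);
		e->vid = vid;
		e->port = port;
		e->dir = model_port_rule_dir(port);
		return 0;
	}
	return -1;	/* table full */
}

/* Drop every entry that references @port; returns the number removed. */
static size_t model_port_flush(struct model_bridge *br,
			       const struct model_port *port)
{
	size_t removed = 0;

	for (size_t i = 0; i < MODEL_MAX_FDB; i++) {
		if (br->fdb[i].used && br->fdb[i].port == port) {
			br->fdb[i].used = false;
			removed++;
		}
	}
	return removed;
}

/* Count the entries currently in the table. */
static size_t model_fdb_count(const struct model_bridge *br)
{
	size_t n = 0;

	for (size_t i = 0; i < MODEL_MAX_FDB; i++)
		n += br->fdb[i].used;
	return n;
}
```

In this model, adding one entry on an uplink port and one on a VF
representor port and then flushing the representor leaves only the
uplink entry behind, which mirrors what ice_eswitch_br_port_deinit()
does with bridge->fdb_list.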

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
---
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 445 ++++++++++++++++++
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |  45 ++
 2 files changed, 490 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 02406f870c83..4008665d5383 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -4,6 +4,14 @@
 #include "ice.h"
 #include "ice_eswitch_br.h"
 #include "ice_repr.h"
+#include "ice_switch.h"
+
+static const struct rhashtable_params ice_fdb_ht_params = {
+	.key_offset = offsetof(struct ice_esw_br_fdb_entry, data),
+	.key_len = sizeof(struct ice_esw_br_fdb_data),
+	.head_offset = offsetof(struct ice_esw_br_fdb_entry, ht_node),
+	.automatic_shrinking = true,
+};
 
 static bool ice_eswitch_br_is_dev_valid(const struct net_device *dev)
 {
@@ -27,12 +35,417 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
 	return NULL;
 }
 
+static void
+ice_eswitch_br_ingress_rule_setup(struct ice_adv_lkup_elem *list,
+				  struct ice_adv_rule_info *rule_info,
+				  const unsigned char *mac,
+				  u8 pf_id, u16 vf_vsi_idx)
+{
+	list[0].type = ICE_MAC_OFOS;
+	ether_addr_copy(list[0].h_u.eth_hdr.dst_addr, mac);
+	eth_broadcast_addr(list[0].m_u.eth_hdr.dst_addr);
+
+	rule_info->sw_act.vsi_handle = vf_vsi_idx;
+	rule_info->sw_act.flag |= ICE_FLTR_RX;
+	rule_info->sw_act.src = pf_id;
+	rule_info->priority = 5;
+}
+
+static void
+ice_eswitch_br_egress_rule_setup(struct ice_adv_lkup_elem *list,
+				 struct ice_adv_rule_info *rule_info,
+				 const unsigned char *mac,
+				 u16 pf_vsi_idx)
+{
+	list[0].type = ICE_MAC_OFOS;
+	ether_addr_copy(list[0].h_u.eth_hdr.dst_addr, mac);
+	eth_broadcast_addr(list[0].m_u.eth_hdr.dst_addr);
+
+	rule_info->sw_act.vsi_handle = pf_vsi_idx;
+	rule_info->sw_act.flag |= ICE_FLTR_TX;
+	rule_info->flags_info.act = ICE_SINGLE_ACT_LAN_ENABLE;
+	rule_info->flags_info.act_valid = true;
+	rule_info->priority = 5;
+}
+
+static int
+ice_eswitch_br_rule_delete(struct ice_hw *hw, struct ice_rule_query_data *rule)
+{
+	int err;
+
+	if (!rule)
+		return -EINVAL;
+
+	err = ice_rem_adv_rule_by_id(hw, rule);
+	kfree(rule);
+
+	return err;
+}
+
+static struct ice_rule_query_data *
+ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
+			       const unsigned char *mac)
+{
+	struct ice_adv_rule_info rule_info = { 0 };
+	struct ice_rule_query_data *rule;
+	struct ice_adv_lkup_elem *list;
+	u16 lkups_cnt = 1;
+	int err;
+
+	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+	if (!rule)
+		return ERR_PTR(-ENOMEM);
+
+	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
+	if (!list) {
+		err = -ENOMEM;
+		goto err_list_alloc;
+	}
+
+	switch (port_type) {
+	case ICE_ESWITCH_BR_UPLINK_PORT:
+		ice_eswitch_br_egress_rule_setup(list, &rule_info, mac,
+						 vsi_idx);
+		break;
+	case ICE_ESWITCH_BR_VF_REPR_PORT:
+		ice_eswitch_br_ingress_rule_setup(list, &rule_info, mac,
+						  hw->pf_id, vsi_idx);
+		break;
+	default:
+		err = -EINVAL;
+		goto err_add_rule;
+	}
+
+	rule_info.sw_act.fltr_act = ICE_FWD_TO_VSI;
+
+	err = ice_add_adv_rule(hw, list, lkups_cnt, &rule_info, rule);
+	if (err)
+		goto err_add_rule;
+
+	kfree(list);
+
+	return rule;
+
+err_add_rule:
+	kfree(list);
+err_list_alloc:
+	kfree(rule);
+
+	return ERR_PTR(err);
+}
+
+static struct ice_esw_br_flow *
+ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
+			   int port_type, const unsigned char *mac)
+{
+	struct ice_rule_query_data *fwd_rule;
+	struct ice_esw_br_flow *flow;
+	int err;
+
+	flow = kzalloc(sizeof(*flow), GFP_KERNEL);
+	if (!flow)
+		return ERR_PTR(-ENOMEM);
+
+	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac);
+	if (IS_ERR(fwd_rule)) {
+		err = PTR_ERR(fwd_rule);
+		dev_err(dev, "Failed to create eswitch bridge %sgress forward rule, err: %d\n",
+			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
+			err);
+		goto err_fwd_rule;
+	}
+
+	flow->fwd_rule = fwd_rule;
+
+	return flow;
+
+err_fwd_rule:
+	kfree(flow);
+
+	return ERR_PTR(err);
+}
+
+static struct ice_esw_br_fdb_entry *
+ice_eswitch_br_fdb_find(struct ice_esw_br *bridge, const unsigned char *mac,
+			u16 vid)
+{
+	struct ice_esw_br_fdb_data data = {};
+
+	ether_addr_copy(data.addr, mac);
+	data.vid = vid;
+	return rhashtable_lookup_fast(&bridge->fdb_ht, &data,
+				      ice_fdb_ht_params);
+}
+
+static void
+ice_eswitch_br_flow_delete(struct ice_pf *pf, struct ice_esw_br_flow *flow)
+{
+	struct device *dev = ice_pf_to_dev(pf);
+	int err;
+
+	err = ice_eswitch_br_rule_delete(&pf->hw, flow->fwd_rule);
+	if (err)
+		dev_err(dev, "Failed to delete FDB forward rule, err: %d\n",
+			err);
+
+	kfree(flow);
+}
+
+static void
+ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
+				struct ice_esw_br_fdb_entry *fdb_entry)
+{
+	struct ice_pf *pf = bridge->br_offloads->pf;
+
+	rhashtable_remove_fast(&bridge->fdb_ht, &fdb_entry->ht_node,
+			       ice_fdb_ht_params);
+	list_del(&fdb_entry->list);
+
+	ice_eswitch_br_flow_delete(pf, fdb_entry->flow);
+
+	kfree(fdb_entry);
+}
+
+static void
+ice_eswitch_br_fdb_offload_notify(struct net_device *dev,
+				  const unsigned char *mac, u16 vid,
+				  unsigned long val)
+{
+	struct switchdev_notifier_fdb_info fdb_info = {};
+
+	fdb_info.addr = mac;
+	fdb_info.vid = vid;
+	fdb_info.offloaded = true;
+	call_switchdev_notifiers(val, dev, &fdb_info.info, NULL);
+}
+
+static void
+ice_eswitch_br_fdb_entry_notify_and_cleanup(struct ice_esw_br *bridge,
+					    struct ice_esw_br_fdb_entry *entry)
+{
+	if (!(entry->flags & ICE_ESWITCH_BR_FDB_ADDED_BY_USER))
+		ice_eswitch_br_fdb_offload_notify(entry->dev, entry->data.addr,
+						  entry->data.vid,
+						  SWITCHDEV_FDB_DEL_TO_BRIDGE);
+	ice_eswitch_br_fdb_entry_delete(bridge, entry);
+}
+
+static void
+ice_eswitch_br_fdb_entry_find_and_delete(struct ice_esw_br *bridge,
+					 const unsigned char *mac, u16 vid)
+{
+	struct ice_pf *pf = bridge->br_offloads->pf;
+	struct ice_esw_br_fdb_entry *fdb_entry;
+	struct device *dev = ice_pf_to_dev(pf);
+
+	fdb_entry = ice_eswitch_br_fdb_find(bridge, mac, vid);
+	if (!fdb_entry) {
+		dev_err(dev, "FDB entry with mac: %pM and vid: %u not found\n",
+			mac, vid);
+		return;
+	}
+
+	ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, fdb_entry);
+}
+
+static void
+ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
+				struct ice_esw_br_port *br_port,
+				bool added_by_user,
+				const unsigned char *mac, u16 vid)
+{
+	struct ice_esw_br *bridge = br_port->bridge;
+	struct ice_pf *pf = bridge->br_offloads->pf;
+	struct device *dev = ice_pf_to_dev(pf);
+	struct ice_esw_br_fdb_entry *fdb_entry;
+	struct ice_esw_br_flow *flow;
+	struct ice_hw *hw = &pf->hw;
+	unsigned long event;
+	int err;
+
+	fdb_entry = ice_eswitch_br_fdb_find(bridge, mac, vid);
+	if (fdb_entry)
+		ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, fdb_entry);
+
+	fdb_entry = kzalloc(sizeof(*fdb_entry), GFP_KERNEL);
+	if (!fdb_entry) {
+		err = -ENOMEM;
+		goto err_exit;
+	}
+
+	flow = ice_eswitch_br_flow_create(dev, hw, br_port->vsi_idx,
+					  br_port->type, mac);
+	if (IS_ERR(flow)) {
+		err = PTR_ERR(flow);
+		goto err_add_flow;
+	}
+
+	ether_addr_copy(fdb_entry->data.addr, mac);
+	fdb_entry->data.vid = vid;
+	fdb_entry->br_port = br_port;
+	fdb_entry->flow = flow;
+	fdb_entry->dev = netdev;
+	event = SWITCHDEV_FDB_ADD_TO_BRIDGE;
+
+	if (added_by_user) {
+		fdb_entry->flags |= ICE_ESWITCH_BR_FDB_ADDED_BY_USER;
+		event = SWITCHDEV_FDB_OFFLOADED;
+	}
+
+	err = rhashtable_insert_fast(&bridge->fdb_ht, &fdb_entry->ht_node,
+				     ice_fdb_ht_params);
+	if (err)
+		goto err_fdb_insert;
+
+	list_add(&fdb_entry->list, &bridge->fdb_list);
+
+	ice_eswitch_br_fdb_offload_notify(netdev, mac, vid, event);
+
+	return;
+
+err_fdb_insert:
+	ice_eswitch_br_flow_delete(pf, flow);
+err_add_flow:
+	kfree(fdb_entry);
+err_exit:
+	dev_err(dev, "Failed to create fdb entry, err: %d\n", err);
+}
+
+static void
+ice_eswitch_br_fdb_work_dealloc(struct ice_esw_br_fdb_work *fdb_work)
+{
+	kfree(fdb_work->fdb_info.addr);
+	kfree(fdb_work);
+}
+
+static void
+ice_eswitch_br_fdb_event_work(struct work_struct *work)
+{
+	struct ice_esw_br_fdb_work *fdb_work = ice_work_to_fdb_work(work);
+	bool added_by_user = fdb_work->fdb_info.added_by_user;
+	struct ice_esw_br_port *br_port = fdb_work->br_port;
+	const unsigned char *mac = fdb_work->fdb_info.addr;
+	u16 vid = fdb_work->fdb_info.vid;
+
+	rtnl_lock();
+
+	if (!br_port || !br_port->bridge)
+		goto err_exit;
+
+	switch (fdb_work->event) {
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+		ice_eswitch_br_fdb_entry_create(fdb_work->dev, br_port,
+						added_by_user, mac, vid);
+		break;
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		ice_eswitch_br_fdb_entry_find_and_delete(br_port->bridge,
+							 mac, vid);
+		break;
+	default:
+		goto err_exit;
+	}
+
+err_exit:
+	rtnl_unlock();
+	dev_put(fdb_work->dev);
+	ice_eswitch_br_fdb_work_dealloc(fdb_work);
+}
+
+static struct ice_esw_br_fdb_work *
+ice_eswitch_br_fdb_work_alloc(struct switchdev_notifier_fdb_info *fdb_info,
+			      struct ice_esw_br_port *br_port,
+			      struct net_device *dev,
+			      unsigned long event)
+{
+	struct ice_esw_br_fdb_work *work;
+	unsigned char *mac;
+
+	work = kzalloc(sizeof(*work), GFP_ATOMIC);
+	if (!work)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_WORK(&work->work, ice_eswitch_br_fdb_event_work);
+	memcpy(&work->fdb_info, fdb_info, sizeof(work->fdb_info));
+
+	mac = kzalloc(ETH_ALEN, GFP_ATOMIC);
+	if (!mac) {
+		kfree(work);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	ether_addr_copy(mac, fdb_info->addr);
+	work->fdb_info.addr = mac;
+	work->br_port = br_port;
+	work->event = event;
+	work->dev = dev;
+
+	return work;
+}
+
+static int
+ice_eswitch_br_switchdev_event(struct notifier_block *nb,
+			       unsigned long event, void *ptr)
+{
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	struct ice_esw_br_offloads *br_offloads =
+		ice_nb_to_br_offloads(nb, switchdev_nb);
+	struct netlink_ext_ack *extack =
+		switchdev_notifier_info_to_extack(ptr);
+	struct switchdev_notifier_fdb_info *fdb_info;
+	struct switchdev_notifier_info *info = ptr;
+	struct ice_esw_br_fdb_work *work;
+	struct net_device *upper;
+	struct ice_esw_br_port *br_port;
+
+	upper = netdev_master_upper_dev_get_rcu(dev);
+	if (!upper)
+		return NOTIFY_DONE;
+
+	if (!netif_is_bridge_master(upper))
+		return NOTIFY_DONE;
+
+	if (!ice_eswitch_br_is_dev_valid(dev))
+		return NOTIFY_DONE;
+
+	br_port = ice_eswitch_br_netdev_to_port(dev);
+	if (!br_port)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		fdb_info = container_of(info,
+					struct switchdev_notifier_fdb_info,
+					info);
+
+		work = ice_eswitch_br_fdb_work_alloc(fdb_info, br_port, dev,
+						     event);
+		if (IS_ERR(work)) {
+			NL_SET_ERR_MSG_MOD(extack, "Failed to init switchdev fdb work");
+			return notifier_from_errno(PTR_ERR(work));
+		}
+		dev_hold(dev);
+
+		queue_work(br_offloads->wq, &work->work);
+		break;
+	default:
+		break;
+	}
+	return NOTIFY_DONE;
+}
+
 static void
 ice_eswitch_br_port_deinit(struct ice_esw_br *bridge,
 			   struct ice_esw_br_port *br_port)
 {
+	struct ice_esw_br_fdb_entry *fdb_entry, *tmp;
 	struct ice_vsi *vsi = br_port->vsi;
 
+	list_for_each_entry_safe(fdb_entry, tmp, &bridge->fdb_list, list) {
+		if (br_port == fdb_entry->br_port)
+			ice_eswitch_br_fdb_entry_delete(bridge, fdb_entry);
+	}
+
 	if (br_port->type == ICE_ESWITCH_BR_UPLINK_PORT && vsi->back)
 		vsi->back->br_port = NULL;
 	else if (vsi->vf)
@@ -129,6 +542,8 @@ ice_eswitch_br_deinit(struct ice_esw_br_offloads *br_offloads,
 	ice_eswitch_br_ports_flush(bridge);
 	WARN_ON(!xa_empty(&bridge->ports));
 	xa_destroy(&bridge->ports);
+	rhashtable_destroy(&bridge->fdb_ht);
+
 	br_offloads->bridge = NULL;
 	kfree(bridge);
 }
@@ -137,11 +552,19 @@ static struct ice_esw_br *
 ice_eswitch_br_init(struct ice_esw_br_offloads *br_offloads, int ifindex)
 {
 	struct ice_esw_br *bridge;
+	int err;
 
 	bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
 	if (!bridge)
 		return ERR_PTR(-ENOMEM);
 
+	err = rhashtable_init(&bridge->fdb_ht, &ice_fdb_ht_params);
+	if (err) {
+		kfree(bridge);
+		return ERR_PTR(err);
+	}
+
+	INIT_LIST_HEAD(&bridge->fdb_list);
 	bridge->br_offloads = br_offloads;
 	bridge->ifindex = ifindex;
 	xa_init(&bridge->ports);
@@ -335,6 +758,8 @@ ice_eswitch_br_offloads_deinit(struct ice_pf *pf)
 		return;
 
 	unregister_netdevice_notifier(&br_offloads->netdev_nb);
+	unregister_switchdev_notifier(&br_offloads->switchdev_nb);
+	destroy_workqueue(br_offloads->wq);
 	/* Although notifier block is unregistered just before,
 	 * so we don't get any new events, some events might be
 	 * already in progress. Hold the rtnl lock and wait for
@@ -360,6 +785,22 @@ ice_eswitch_br_offloads_init(struct ice_pf *pf)
 		return PTR_ERR(br_offloads);
 	}
 
+	br_offloads->wq = alloc_ordered_workqueue("ice_bridge_wq", 0);
+	if (!br_offloads->wq) {
+		err = -ENOMEM;
+		dev_err(dev, "Failed to allocate bridge workqueue\n");
+		goto err_alloc_wq;
+	}
+
+	br_offloads->switchdev_nb.notifier_call =
+		ice_eswitch_br_switchdev_event;
+	err = register_switchdev_notifier(&br_offloads->switchdev_nb);
+	if (err) {
+		dev_err(dev,
+			"Failed to register switchdev notifier\n");
+		goto err_reg_switchdev_nb;
+	}
+
 	br_offloads->netdev_nb.notifier_call = ice_eswitch_br_port_event;
 	err = register_netdevice_notifier(&br_offloads->netdev_nb);
 	if (err) {
@@ -371,6 +812,10 @@ ice_eswitch_br_offloads_init(struct ice_pf *pf)
 	return 0;
 
 err_reg_netdev_nb:
+	unregister_switchdev_notifier(&br_offloads->switchdev_nb);
+err_reg_switchdev_nb:
+	destroy_workqueue(br_offloads->wq);
+err_alloc_wq:
 	rtnl_lock();
 	ice_eswitch_br_offloads_dealloc(pf);
 	rtnl_unlock();
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
index 53ea29569c36..4069eb45617e 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
@@ -4,6 +4,33 @@
 #ifndef _ICE_ESWITCH_BR_H_
 #define _ICE_ESWITCH_BR_H_
 
+#include <linux/rhashtable.h>
+
+struct ice_esw_br_fdb_data {
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+};
+
+struct ice_esw_br_flow {
+	struct ice_rule_query_data *fwd_rule;
+};
+
+enum {
+	ICE_ESWITCH_BR_FDB_ADDED_BY_USER = BIT(0),
+};
+
+struct ice_esw_br_fdb_entry {
+	struct ice_esw_br_fdb_data data;
+	struct rhash_head ht_node;
+	struct list_head list;
+
+	int flags;
+
+	struct net_device *dev;
+	struct ice_esw_br_port *br_port;
+	struct ice_esw_br_flow *flow;
+};
+
 enum ice_esw_br_port_type {
 	ICE_ESWITCH_BR_UPLINK_PORT = 0,
 	ICE_ESWITCH_BR_VF_REPR_PORT = 1,
@@ -21,12 +48,25 @@ struct ice_esw_br {
 	int ifindex;
 
 	struct xarray ports;
+	struct rhashtable fdb_ht;
+	struct list_head fdb_list;
 };
 
 struct ice_esw_br_offloads {
 	struct ice_pf *pf;
 	struct ice_esw_br *bridge;
 	struct notifier_block netdev_nb;
+	struct notifier_block switchdev_nb;
+
+	struct workqueue_struct *wq;
+};
+
+struct ice_esw_br_fdb_work {
+	struct work_struct work;
+	struct switchdev_notifier_fdb_info fdb_info;
+	struct ice_esw_br_port *br_port;
+	struct net_device *dev;
+	unsigned long event;
 };
 
 #define ice_nb_to_br_offloads(nb, nb_name) \
@@ -34,6 +74,11 @@ struct ice_esw_br_offloads {
 		     struct ice_esw_br_offloads, \
 		     nb_name)
 
+#define ice_work_to_fdb_work(w) \
+	container_of(w, \
+		     struct ice_esw_br_fdb_work, \
+		     work)
+
 void
 ice_eswitch_br_offloads_deinit(struct ice_pf *pf);
 int
-- 
2.39.2



* [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (4 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 05/12] ice: Switchdev FDB events support Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-21 14:22   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads Wojciech Drewek
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Marcin Szycik <marcin.szycik@intel.com>

Introduce new "guard" rule upon FDB entry creation.

It matches on src_mac, has valid bit unset, allow_pass_l2 set
and has a nop action.

Previously introduced "forward" rule matches on dst_mac, has valid
bit set, need_pass_l2 set and has a forward action.

With these rules, a packet will be offloaded only if FDB exists in both
directions (RX and TX).

Let's assume the link partner sends a packet to VF1: src_mac = LP_MAC,
dst_mac = VF1_MAC. The bridge adds an FDB entry and two rules are created:
1. Guard rule matching on src_mac == LP_MAC
2. Forward rule matching on dst_mac == LP_MAC
Now VF1 responds with src_mac = VF1_MAC, dst_mac = LP_MAC. Before this
change, only one rule with dst_mac == LP_MAC would have existed, and the
packet would have been offloaded, meaning the bridge wouldn't add an FDB
entry in the opposite direction. Now, the forward rule matches
(dst_mac == LP_MAC), but it has need_pass_l2 set and there is no guard
rule with src_mac == VF1_MAC, so the packet goes through the slow-path
and the bridge adds an FDB entry. Two rules are created:
1. Guard rule matching on src_mac == VF1_MAC
2. Forward rule matching on dst_mac == VF1_MAC
Further packets in both directions will be offloaded.

The same applies in the opposite direction (i.e. when VF1 is the first
to send a packet out).
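
The walk-through above can be replayed with a small userspace model.
This is only a sketch of the behaviour as described in this commit
message: all sim_* names are hypothetical, and the way need_pass_l2 and
allow_pass_l2 interact in HW is assumed to reduce to "the forward rule
takes effect only if a guard rule also matches". Since a guard rule and
a forward rule are always created together for a learned MAC, one table
of learned MACs stands in for both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_ETH_ALEN 6
#define SIM_MAX_FDB 8

/* Each learned MAC implies a guard rule (src_mac) and a forward rule (dst_mac). */
struct sim_switch {
	uint8_t learned[SIM_MAX_FDB][SIM_ETH_ALEN];
	size_t count;
};

static bool sim_has_rules(const struct sim_switch *sw, const uint8_t *mac)
{
	for (size_t i = 0; i < sw->count; i++)
		if (!memcmp(sw->learned[i], mac, SIM_ETH_ALEN))
			return true;
	return false;
}

/* The bridge learned @mac: install the guard and forward rule pair. */
static void sim_learn(struct sim_switch *sw, const uint8_t *mac)
{
	if (sim_has_rules(sw, mac) || sw->count == SIM_MAX_FDB)
		return;
	memcpy(sw->learned[sw->count++], mac, SIM_ETH_ALEN);
}

/* Returns true if the packet is offloaded. Otherwise it takes the
 * slow-path, where the bridge learns the source MAC.
 */
static bool sim_rx(struct sim_switch *sw, const uint8_t *src,
		   const uint8_t *dst)
{
	bool fwd_hit = sim_has_rules(sw, dst);	 /* forward rule, need_pass_l2 */
	bool guard_hit = sim_has_rules(sw, src); /* guard rule, allow_pass_l2 */

	if (fwd_hit && guard_hit)
		return true;

	sim_learn(sw, src);
	return false;
}
```

Feeding the model LP -> VF1 and then VF1 -> LP sends both packets
through the slow-path; after that, packets in both directions are
offloaded.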

Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
---
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 66 ++++++++++++++++++-
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |  1 +
 drivers/net/ethernet/intel/ice/ice_switch.c   | 45 ++++++++++---
 drivers/net/ethernet/intel/ice/ice_switch.h   |  5 ++
 drivers/net/ethernet/intel/ice/ice_type.h     |  1 +
 5 files changed, 109 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 4008665d5383..82b5eb2020cd 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -116,6 +116,8 @@ ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
 		goto err_add_rule;
 	}
 
+	rule_info.need_pass_l2 = true;
+
 	rule_info.sw_act.fltr_act = ICE_FWD_TO_VSI;
 
 	err = ice_add_adv_rule(hw, list, lkups_cnt, &rule_info, rule);
@@ -134,11 +136,56 @@ ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
 	return ERR_PTR(err);
 }
 
+static struct ice_rule_query_data *
+ice_eswitch_br_guard_rule_create(struct ice_hw *hw, u16 vsi_idx,
+				 const unsigned char *mac)
+{
+	struct ice_adv_rule_info rule_info = { 0 };
+	struct ice_rule_query_data *rule;
+	struct ice_adv_lkup_elem *list;
+	const u16 lkups_cnt = 1;
+	int err;
+
+	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+	if (!rule) {
+		err = -ENOMEM;
+		goto err_exit;
+	}
+
+	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
+	if (!list) {
+		err = -ENOMEM;
+		goto err_list_alloc;
+	}
+
+	list[0].type = ICE_MAC_OFOS;
+	ether_addr_copy(list[0].h_u.eth_hdr.src_addr, mac);
+	eth_broadcast_addr(list[0].m_u.eth_hdr.src_addr);
+
+	rule_info.allow_pass_l2 = true;
+	rule_info.sw_act.vsi_handle = vsi_idx;
+	rule_info.sw_act.fltr_act = ICE_NOP;
+	rule_info.priority = 5;
+
+	err = ice_add_adv_rule(hw, list, lkups_cnt, &rule_info, rule);
+	if (err)
+		goto err_add_rule;
+
+	return rule;
+
+err_add_rule:
+	kfree(list);
+err_list_alloc:
+	kfree(rule);
+err_exit:
+	return ERR_PTR(err);
+}
+
 static struct ice_esw_br_flow *
 ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
 			   int port_type, const unsigned char *mac)
 {
-	struct ice_rule_query_data *fwd_rule;
+	struct ice_rule_query_data *fwd_rule, *guard_rule;
 	struct ice_esw_br_flow *flow;
 	int err;
 
@@ -155,10 +202,22 @@ ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
 		goto err_fwd_rule;
 	}
 
+	guard_rule = ice_eswitch_br_guard_rule_create(hw, vsi_idx, mac);
+	if (IS_ERR(guard_rule)) {
+		err = PTR_ERR(guard_rule);
+		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
+			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
+			err);
+		goto err_guard_rule;
+	}
+
 	flow->fwd_rule = fwd_rule;
+	flow->guard_rule = guard_rule;
 
 	return flow;
 
+err_guard_rule:
+	ice_eswitch_br_rule_delete(hw, fwd_rule);
 err_fwd_rule:
 	kfree(flow);
 
@@ -188,6 +247,11 @@ ice_eswitch_br_flow_delete(struct ice_pf *pf, struct ice_esw_br_flow *flow)
 		dev_err(dev, "Failed to delete FDB forward rule, err: %d\n",
 			err);
 
+	err = ice_eswitch_br_rule_delete(&pf->hw, flow->guard_rule);
+	if (err)
+		dev_err(dev, "Failed to delete FDB guard rule, err: %d\n",
+			err);
+
 	kfree(flow);
 }
 
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
index 4069eb45617e..73ad81bad655 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
@@ -13,6 +13,7 @@ struct ice_esw_br_fdb_data {
 
 struct ice_esw_br_flow {
 	struct ice_rule_query_data *fwd_rule;
+	struct ice_rule_query_data *guard_rule;
 };
 
 enum {
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
index 2ea9e1ae5517..6fbed7ccc5a9 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.c
+++ b/drivers/net/ethernet/intel/ice/ice_switch.c
@@ -2277,6 +2277,10 @@ ice_get_recp_frm_fw(struct ice_hw *hw, struct ice_sw_recipe *recps, u8 rid,
 		/* Propagate some data to the recipe database */
 		recps[idx].is_root = !!is_root;
 		recps[idx].priority = root_bufs.content.act_ctrl_fwd_priority;
+		recps[idx].need_pass_l2 = root_bufs.content.act_ctrl &
+					  ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
+		recps[idx].allow_pass_l2 = root_bufs.content.act_ctrl &
+					   ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;
 		bitmap_zero(recps[idx].res_idxs, ICE_MAX_FV_WORDS);
 		if (root_bufs.content.result_indx & ICE_AQ_RECIPE_RESULT_EN) {
 			recps[idx].chain_idx = root_bufs.content.result_indx &
@@ -4624,7 +4628,7 @@ static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
  */
 static u16
 ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts,
-	      enum ice_sw_tunnel_type tun_type)
+	      struct ice_adv_rule_info *rinfo)
 {
 	bool refresh_required = true;
 	struct ice_sw_recipe *recp;
@@ -4685,9 +4689,12 @@ ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts,
 			}
 			/* If for "i"th recipe the found was never set to false
 			 * then it means we found our match
-			 * Also tun type of recipe needs to be checked
+			 * Also tun type and *_pass_l2 of recipe needs to be
+			 * checked
 			 */
-			if (found && recp[i].tun_type == tun_type)
+			if (found && recp[i].tun_type == rinfo->tun_type &&
+			    recp[i].need_pass_l2 == rinfo->need_pass_l2 &&
+			    recp[i].allow_pass_l2 == rinfo->allow_pass_l2)
 				return i; /* Return the recipe ID */
 		}
 	}
@@ -5075,6 +5082,14 @@ ice_add_sw_recipe(struct ice_hw *hw, struct ice_sw_recipe *rm,
 		set_bit(buf[recps].recipe_indx,
 			(unsigned long *)buf[recps].recipe_bitmap);
 		buf[recps].content.act_ctrl_fwd_priority = rm->priority;
+
+		if (rm->need_pass_l2)
+			buf[recps].content.act_ctrl |=
+				ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
+
+		if (rm->allow_pass_l2)
+			buf[recps].content.act_ctrl |=
+				ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;
 		recps++;
 	}
 
@@ -5225,6 +5240,8 @@ ice_add_sw_recipe(struct ice_hw *hw, struct ice_sw_recipe *rm,
 		recp->priority = buf[buf_idx].content.act_ctrl_fwd_priority;
 		recp->n_grp_count = rm->n_grp_count;
 		recp->tun_type = rm->tun_type;
+		recp->need_pass_l2 = rm->need_pass_l2;
+		recp->allow_pass_l2 = rm->allow_pass_l2;
 		recp->recp_created = true;
 	}
 	rm->root_buf = buf;
@@ -5393,6 +5410,9 @@ ice_add_adv_recipe(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
 	/* set the recipe priority if specified */
 	rm->priority = (u8)rinfo->priority;
 
+	rm->need_pass_l2 = rinfo->need_pass_l2;
+	rm->allow_pass_l2 = rinfo->allow_pass_l2;
+
 	/* Find offsets from the field vector. Pick the first one for all the
 	 * recipes.
 	 */
@@ -5408,7 +5428,7 @@ ice_add_adv_recipe(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
 	}
 
 	/* Look for a recipe which matches our requested fv / mask list */
-	*rid = ice_find_recp(hw, lkup_exts, rinfo->tun_type);
+	*rid = ice_find_recp(hw, lkup_exts, rinfo);
 	if (*rid < ICE_MAX_NUM_RECIPES)
 		/* Success if found a recipe that match the existing criteria */
 		goto err_unroll;
@@ -5846,7 +5866,9 @@ static bool ice_rules_equal(const struct ice_adv_rule_info *first,
 	return first->sw_act.flag == second->sw_act.flag &&
 	       first->tun_type == second->tun_type &&
 	       first->vlan_type == second->vlan_type &&
-	       first->src_vsi == second->src_vsi;
+	       first->src_vsi == second->src_vsi &&
+	       first->need_pass_l2 == second->need_pass_l2 &&
+	       first->allow_pass_l2 == second->allow_pass_l2;
 }
 
 /**
@@ -6085,7 +6107,8 @@ ice_add_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
 	if (!(rinfo->sw_act.fltr_act == ICE_FWD_TO_VSI ||
 	      rinfo->sw_act.fltr_act == ICE_FWD_TO_Q ||
 	      rinfo->sw_act.fltr_act == ICE_FWD_TO_QGRP ||
-	      rinfo->sw_act.fltr_act == ICE_DROP_PACKET)) {
+	      rinfo->sw_act.fltr_act == ICE_DROP_PACKET ||
+	      rinfo->sw_act.fltr_act == ICE_NOP)) {
 		status = -EIO;
 		goto free_pkt_profile;
 	}
@@ -6096,7 +6119,8 @@ ice_add_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
 		goto free_pkt_profile;
 	}
 
-	if (rinfo->sw_act.fltr_act == ICE_FWD_TO_VSI)
+	if (rinfo->sw_act.fltr_act == ICE_FWD_TO_VSI ||
+	    rinfo->sw_act.fltr_act == ICE_NOP)
 		rinfo->sw_act.fwd_id.hw_vsi_id =
 			ice_get_hw_vsi_num(hw, vsi_handle);
 
@@ -6166,6 +6190,11 @@ ice_add_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
 		act |= ICE_SINGLE_ACT_VSI_FORWARDING | ICE_SINGLE_ACT_DROP |
 		       ICE_SINGLE_ACT_VALID_BIT;
 		break;
+	case ICE_NOP:
+		act |= (rinfo->sw_act.fwd_id.hw_vsi_id <<
+			ICE_SINGLE_ACT_VSI_ID_S) & ICE_SINGLE_ACT_VSI_ID_M;
+		act &= ~ICE_SINGLE_ACT_VALID_BIT;
+		break;
 	default:
 		status = -EIO;
 		goto err_ice_add_adv_rule;
@@ -6446,7 +6475,7 @@ ice_rem_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
 			return -EIO;
 	}
 
-	rid = ice_find_recp(hw, &lkup_exts, rinfo->tun_type);
+	rid = ice_find_recp(hw, &lkup_exts, rinfo);
 	/* If did not find a recipe that match the existing criteria */
 	if (rid == ICE_MAX_NUM_RECIPES)
 		return -EINVAL;
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.h b/drivers/net/ethernet/intel/ice/ice_switch.h
index c84b56fe84a5..5ecce39cf1f5 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.h
+++ b/drivers/net/ethernet/intel/ice/ice_switch.h
@@ -191,6 +191,8 @@ struct ice_adv_rule_info {
 	u16 vlan_type;
 	u16 fltr_rule_id;
 	u32 priority;
+	u8 need_pass_l2;
+	u8 allow_pass_l2;
 	u16 src_vsi;
 	struct ice_sw_act_ctrl sw_act;
 	struct ice_adv_rule_flags_info flags_info;
@@ -254,6 +256,9 @@ struct ice_sw_recipe {
 	 */
 	u8 priority;
 
+	u8 need_pass_l2;
+	u8 allow_pass_l2;
+
 	struct list_head rg_list;
 
 	/* AQ buffer associated with this recipe */
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index 5602695243a8..96977d6fc149 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -1034,6 +1034,7 @@ enum ice_sw_fwd_act_type {
 	ICE_FWD_TO_Q,
 	ICE_FWD_TO_QGRP,
 	ICE_DROP_PACKET,
+	ICE_NOP,
 	ICE_INVAL_ACT
 };
 
-- 
2.39.2



* [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (5 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-21 14:40   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode Wojciech Drewek
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

Allow LAG interfaces to be used in bridge offload by accepting netdevs
for which netif_is_lag_master() returns true. In this case, search for
the ice netdev in the list of the LAG's lower devices.

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
---
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 40 ++++++++++++++++---
 1 file changed, 35 insertions(+), 5 deletions(-)
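
Note on the lookup described in the commit message: the following is a minimal userspace sketch, not driver code. The mock_* types and helpers are hypothetical stand-ins for struct net_device, netif_is_ice(), netif_is_lag_master() and the lower-device iteration; they only model the "first ice lower device wins" resolution.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device. */
struct mock_netdev {
	const char *name;
	bool is_ice;			/* models netif_is_ice() */
	bool is_lag_master;		/* models netif_is_lag_master() */
	struct mock_netdev **lowers;	/* models the lower-device list */
	size_t n_lowers;
};

/* Return the first ice lower device of a LAG master, or NULL if none. */
static struct mock_netdev *mock_uplink_from_lag(struct mock_netdev *lag)
{
	size_t i;

	for (i = 0; i < lag->n_lowers; i++)
		if (lag->lowers[i]->is_ice)
			return lag->lowers[i];

	return NULL;
}

/* Resolve the device whose PF would own the bridge uplink port. */
static struct mock_netdev *mock_resolve_uplink(struct mock_netdev *dev)
{
	if (dev->is_lag_master)
		return mock_uplink_from_lag(dev);
	if (dev->is_ice)
		return dev;

	return NULL;
}
```

In this model a LAG without any ice lower device resolves to NULL, which corresponds to the NULL checks added in the callers. If a LAG had several ice lower devices, only the first one found would be used.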

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 82b5eb2020cd..49381e4bf62a 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -15,8 +15,21 @@ static const struct rhashtable_params ice_fdb_ht_params = {
 
 static bool ice_eswitch_br_is_dev_valid(const struct net_device *dev)
 {
-	/* Accept only PF netdev and PRs */
-	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev);
+	/* Accept only PF netdev, PRs and LAG */
+	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
+		netif_is_lag_master(dev);
+}
+
+static struct net_device *
+ice_eswitch_br_get_uplnik_from_lag(struct net_device *lag_dev)
+{
+	struct net_device *lower;
+	struct list_head *iter;
+
+	netdev_for_each_lower_dev(lag_dev, lower, iter)
+		if (netif_is_ice(lower))
+			return lower;
+	return NULL;
 }
 
 static struct ice_esw_br_port *
@@ -26,8 +39,16 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
 		struct ice_repr *repr = ice_netdev_to_repr(dev);
 
 		return repr->br_port;
-	} else if (netif_is_ice(dev)) {
-		struct ice_pf *pf = ice_netdev_to_pf(dev);
+	} else if (netif_is_ice(dev) || netif_is_lag_master(dev)) {
+		struct net_device *ice_dev = dev;
+		struct ice_pf *pf;
+
+		if (netif_is_lag_master(dev))
+			ice_dev = ice_eswitch_br_get_uplnik_from_lag(dev);
+		if (!ice_dev)
+			return NULL;
+
+		pf = ice_netdev_to_pf(ice_dev);
 
 		return pf->br_port;
 	}
@@ -719,7 +740,16 @@ ice_eswitch_br_port_link(struct ice_esw_br_offloads *br_offloads,
 
 		err = ice_eswitch_br_vf_repr_port_init(bridge, repr);
 	} else {
-		struct ice_pf *pf = ice_netdev_to_pf(dev);
+		struct net_device *ice_dev = dev;
+		struct ice_pf *pf;
+
+		if (netif_is_lag_master(dev))
+			ice_dev = ice_eswitch_br_get_uplnik_from_lag(dev);
+
+		if (!ice_dev)
+			return 0;
+
+		pf = ice_netdev_to_pf(ice_dev);
 
 		err = ice_eswitch_br_uplink_port_init(bridge, pf);
 	}
-- 
2.39.2



* [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (6 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-21 15:25   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 09/12] ice: implement bridge port vlan Wojciech Drewek
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Marcin Szycik <marcin.szycik@intel.com>

Add support for matching on VLAN tag in bridge offloads.
Currently only trunk mode is supported.

To enable VLAN filtering (existing FDB entries will be deleted):
ip link set $BR type bridge vlan_filtering 1

To add VLANs to bridge in trunk mode:
bridge vlan add dev $PF1 vid 110-111
bridge vlan add dev $VF1_PR vid 110-111

Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
---
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 319 +++++++++++++++++-
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |  12 +
 2 files changed, 317 insertions(+), 14 deletions(-)
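
Note on the trunk-mode behaviour described in the commit message: the sketch below models, in plain userspace C, how an FDB notification could be classified into "skip", "match on MAC only" or "match on MAC and VLAN". The mock_* names are hypothetical. The sketch assumes VID 1 is the default untagged VLAN and leaves out the per-port VLAN lookup that the patch performs before adding a VLAN match.

```c
#include <stdbool.h>
#include <stdint.h>

enum mock_fdb_match {
	MOCK_FDB_SKIP,		/* entry is not offloaded */
	MOCK_FDB_MAC_ONLY,	/* match on destination MAC only */
	MOCK_FDB_MAC_VLAN,	/* match on destination MAC and VLAN ID */
};

/* Classify an FDB entry from the bridge filtering state and its VID. */
static enum mock_fdb_match
mock_fdb_match_kind(bool vlan_filtering, uint16_t vid)
{
	/* Without VLAN filtering only entries with VID 0 are offloaded. */
	if (!vlan_filtering && vid)
		return MOCK_FDB_SKIP;

	/* With filtering on, VID 1 stands for untagged traffic here and
	 * falls back to a MAC-only match (assumption of this sketch).
	 */
	if (vlan_filtering && vid != 1)
		return MOCK_FDB_MAC_VLAN;

	return MOCK_FDB_MAC_ONLY;
}

/* Number of lookup elements a rule of the given kind would need. */
static uint16_t mock_lkups_cnt(enum mock_fdb_match kind)
{
	return kind == MOCK_FDB_MAC_VLAN ? 2 : 1;
}
```

With the commands from the commit message, an entry learned on VID 110 with filtering enabled would be classified as MOCK_FDB_MAC_VLAN and use two lookup elements, while an entry on VID 1 would stay a one-element MAC match.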

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 49381e4bf62a..56d36e397b12 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -59,13 +59,19 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
 static void
 ice_eswitch_br_ingress_rule_setup(struct ice_adv_lkup_elem *list,
 				  struct ice_adv_rule_info *rule_info,
-				  const unsigned char *mac,
+				  const unsigned char *mac, bool vlan, u16 vid,
 				  u8 pf_id, u16 vf_vsi_idx)
 {
 	list[0].type = ICE_MAC_OFOS;
 	ether_addr_copy(list[0].h_u.eth_hdr.dst_addr, mac);
 	eth_broadcast_addr(list[0].m_u.eth_hdr.dst_addr);
 
+	if (vlan) {
+		list[1].type = ICE_VLAN_OFOS;
+		list[1].h_u.vlan_hdr.vlan = cpu_to_be16(vid & VLAN_VID_MASK);
+		list[1].m_u.vlan_hdr.vlan = cpu_to_be16(0xFFFF);
+	}
+
 	rule_info->sw_act.vsi_handle = vf_vsi_idx;
 	rule_info->sw_act.flag |= ICE_FLTR_RX;
 	rule_info->sw_act.src = pf_id;
@@ -75,13 +81,19 @@ ice_eswitch_br_ingress_rule_setup(struct ice_adv_lkup_elem *list,
 static void
 ice_eswitch_br_egress_rule_setup(struct ice_adv_lkup_elem *list,
 				 struct ice_adv_rule_info *rule_info,
-				 const unsigned char *mac,
+				 const unsigned char *mac, bool vlan, u16 vid,
 				 u16 pf_vsi_idx)
 {
 	list[0].type = ICE_MAC_OFOS;
 	ether_addr_copy(list[0].h_u.eth_hdr.dst_addr, mac);
 	eth_broadcast_addr(list[0].m_u.eth_hdr.dst_addr);
 
+	if (vlan) {
+		list[1].type = ICE_VLAN_OFOS;
+		list[1].h_u.vlan_hdr.vlan = cpu_to_be16(vid & VLAN_VID_MASK);
+		list[1].m_u.vlan_hdr.vlan = cpu_to_be16(0xFFFF);
+	}
+
 	rule_info->sw_act.vsi_handle = pf_vsi_idx;
 	rule_info->sw_act.flag |= ICE_FLTR_TX;
 	rule_info->flags_info.act = ICE_SINGLE_ACT_LAN_ENABLE;
@@ -105,12 +117,12 @@ ice_eswitch_br_rule_delete(struct ice_hw *hw, struct ice_rule_query_data *rule)
 
 static struct ice_rule_query_data *
 ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
-			       const unsigned char *mac)
+			       const unsigned char *mac, bool vlan, u16 vid)
 {
 	struct ice_adv_rule_info rule_info = { 0 };
 	struct ice_rule_query_data *rule;
 	struct ice_adv_lkup_elem *list;
-	u16 lkups_cnt = 1;
+	u16 lkups_cnt = vlan ? 2 : 1;
 	int err;
 
 	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
@@ -125,12 +137,12 @@ ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
 
 	switch (port_type) {
 	case ICE_ESWITCH_BR_UPLINK_PORT:
-		ice_eswitch_br_egress_rule_setup(list, &rule_info, mac,
-						 vsi_idx);
+		ice_eswitch_br_egress_rule_setup(list, &rule_info, mac, vlan,
+						 vid, vsi_idx);
 		break;
 	case ICE_ESWITCH_BR_VF_REPR_PORT:
-		ice_eswitch_br_ingress_rule_setup(list, &rule_info, mac,
-						  hw->pf_id, vsi_idx);
+		ice_eswitch_br_ingress_rule_setup(list, &rule_info, mac, vlan,
+						  vid, hw->pf_id, vsi_idx);
 		break;
 	default:
 		err = -EINVAL;
@@ -159,12 +171,12 @@ ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
 
 static struct ice_rule_query_data *
 ice_eswitch_br_guard_rule_create(struct ice_hw *hw, u16 vsi_idx,
-				 const unsigned char *mac)
+				 const unsigned char *mac, bool vlan, u16 vid)
 {
 	struct ice_adv_rule_info rule_info = { 0 };
 	struct ice_rule_query_data *rule;
 	struct ice_adv_lkup_elem *list;
-	const u16 lkups_cnt = 1;
+	u16 lkups_cnt = vlan ? 2 : 1;
 	int err;
 
 	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
@@ -183,6 +195,12 @@ ice_eswitch_br_guard_rule_create(struct ice_hw *hw, u16 vsi_idx,
 	ether_addr_copy(list[0].h_u.eth_hdr.src_addr, mac);
 	eth_broadcast_addr(list[0].m_u.eth_hdr.src_addr);
 
+	if (vlan) {
+		list[1].type = ICE_VLAN_OFOS;
+		list[1].h_u.vlan_hdr.vlan = cpu_to_be16(vid & VLAN_VID_MASK);
+		list[1].m_u.vlan_hdr.vlan = cpu_to_be16(0xFFFF);
+	}
+
 	rule_info.allow_pass_l2 = true;
 	rule_info.sw_act.vsi_handle = vsi_idx;
 	rule_info.sw_act.fltr_act = ICE_NOP;
@@ -204,7 +222,8 @@ ice_eswitch_br_guard_rule_create(struct ice_hw *hw, u16 vsi_idx,
 
 static struct ice_esw_br_flow *
 ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
-			   int port_type, const unsigned char *mac)
+			   int port_type, const unsigned char *mac,
+			   bool add_vlan, u16 vid)
 {
 	struct ice_rule_query_data *fwd_rule, *guard_rule;
 	struct ice_esw_br_flow *flow;
@@ -214,7 +233,8 @@ ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
 	if (!flow)
 		return ERR_PTR(-ENOMEM);
 
-	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac);
+	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac,
+						  add_vlan, vid);
 	if (IS_ERR(fwd_rule)) {
 		err = PTR_ERR(fwd_rule);
 		dev_err(dev, "Failed to create eswitch bridge %sgress forward rule, err: %d\n",
@@ -223,7 +243,8 @@ ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
 		goto err_fwd_rule;
 	}
 
-	guard_rule = ice_eswitch_br_guard_rule_create(hw, vsi_idx, mac);
+	guard_rule = ice_eswitch_br_guard_rule_create(hw, vsi_idx, mac,
+						      add_vlan, vid);
 	if (IS_ERR(guard_rule)) {
 		err = PTR_ERR(guard_rule);
 		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
@@ -276,6 +297,30 @@ ice_eswitch_br_flow_delete(struct ice_pf *pf, struct ice_esw_br_flow *flow)
 	kfree(flow);
 }
 
+static struct ice_esw_br_vlan *
+ice_esw_br_port_vlan_lookup(struct ice_esw_br *bridge, u16 vsi_idx, u16 vid)
+{
+	struct ice_pf *pf = bridge->br_offloads->pf;
+	struct device *dev = ice_pf_to_dev(pf);
+	struct ice_esw_br_port *port;
+	struct ice_esw_br_vlan *vlan;
+
+	port = xa_load(&bridge->ports, vsi_idx);
+	if (!port) {
+		dev_info(dev, "Bridge port lookup failed (vsi=%u)\n", vsi_idx);
+		return ERR_PTR(-EINVAL);
+	}
+
+	vlan = xa_load(&port->vlans, vid);
+	if (!vlan) {
+		dev_info(dev, "Bridge port vlan metadata lookup failed (vsi=%u)\n",
+			 vsi_idx);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return vlan;
+}
+
 static void
 ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
 				struct ice_esw_br_fdb_entry *fdb_entry)
@@ -344,10 +389,33 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
 	struct device *dev = ice_pf_to_dev(pf);
 	struct ice_esw_br_fdb_entry *fdb_entry;
 	struct ice_esw_br_flow *flow;
+	struct ice_esw_br_vlan *vlan;
 	struct ice_hw *hw = &pf->hw;
+	bool add_vlan = false;
 	unsigned long event;
 	int err;
 
+	/* FIXME: untagged filtering is not yet supported
+	 */
+	if (!(bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING) && vid)
+		return;
+
+	/* In trunk VLAN mode, for untagged traffic the bridge sends requests
+	 * to offload VLAN 1 with pvid and untagged flags set. Since these
+	 * flags are not supported, add a MAC filter instead.
+	 */
+	if ((bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING) && vid != 1) {
+		vlan = ice_esw_br_port_vlan_lookup(bridge, br_port->vsi_idx,
+						   vid);
+		if (IS_ERR(vlan)) {
+			dev_err(dev, "Failed to find vlan lookup, err: %ld\n",
+				PTR_ERR(vlan));
+			return;
+		}
+
+		add_vlan = true;
+	}
+
 	fdb_entry = ice_eswitch_br_fdb_find(bridge, mac, vid);
 	if (fdb_entry)
 		ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, fdb_entry);
@@ -359,7 +427,7 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
 	}
 
 	flow = ice_eswitch_br_flow_create(dev, hw, br_port->vsi_idx,
-					  br_port->type, mac);
+					  br_port->type, mac, add_vlan, vid);
 	if (IS_ERR(flow)) {
 		err = PTR_ERR(flow);
 		goto err_add_flow;
@@ -519,6 +587,214 @@ ice_eswitch_br_switchdev_event(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+static void ice_eswitch_br_fdb_flush(struct ice_esw_br *bridge)
+{
+	struct ice_esw_br_fdb_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list)
+		ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, entry);
+}
+
+static void
+ice_eswitch_br_vlan_filtering_set(struct ice_esw_br *bridge, bool enable)
+{
+	bool filtering = bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING;
+
+	if (filtering == enable)
+		return;
+
+	ice_eswitch_br_fdb_flush(bridge);
+	if (enable)
+		bridge->flags |= ICE_ESWITCH_BR_VLAN_FILTERING;
+	else
+		bridge->flags &= ~ICE_ESWITCH_BR_VLAN_FILTERING;
+}
+
+static void
+ice_eswitch_br_vlan_cleanup(struct ice_esw_br_port *port,
+			    struct ice_esw_br_vlan *vlan)
+{
+	xa_erase(&port->vlans, vlan->vid);
+	kfree(vlan);
+}
+
+static void ice_eswitch_br_port_vlans_flush(struct ice_esw_br_port *port)
+{
+	struct ice_esw_br_vlan *vlan;
+	unsigned long index;
+
+	xa_for_each(&port->vlans, index, vlan)
+		ice_eswitch_br_vlan_cleanup(port, vlan);
+}
+
+static struct ice_esw_br_vlan *
+ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
+{
+	struct ice_esw_br_vlan *vlan;
+	int err;
+
+	vlan = kzalloc(sizeof(*vlan), GFP_KERNEL);
+	if (!vlan)
+		return ERR_PTR(-ENOMEM);
+
+	vlan->vid = vid;
+	vlan->flags = flags;
+
+	err = xa_insert(&port->vlans, vlan->vid, vlan, GFP_KERNEL);
+	if (err) {
+		kfree(vlan);
+		return ERR_PTR(err);
+	}
+
+	return vlan;
+}
+
+static int
+ice_eswitch_br_port_vlan_add(struct ice_esw_br *bridge, u16 vsi_idx, u16 vid,
+			     u16 flags, struct netlink_ext_ack *extack)
+{
+	struct ice_esw_br_port *port;
+	struct ice_esw_br_vlan *vlan;
+
+	port = xa_load(&bridge->ports, vsi_idx);
+	if (!port)
+		return -EINVAL;
+
+	vlan = xa_load(&port->vlans, vid);
+	if (vlan) {
+		if (vlan->flags == flags)
+			return 0;
+
+		ice_eswitch_br_vlan_cleanup(port, vlan);
+	}
+
+	vlan = ice_eswitch_br_vlan_create(vid, flags, port);
+	if (IS_ERR(vlan)) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");
+		return PTR_ERR(vlan);
+	}
+
+	return 0;
+}
+
+static void
+ice_eswitch_br_port_vlan_del(struct ice_esw_br *bridge, u16 vsi_idx, u16 vid)
+{
+	struct ice_esw_br_port *port;
+	struct ice_esw_br_vlan *vlan;
+
+	port = xa_load(&bridge->ports, vsi_idx);
+	if (!port)
+		return;
+
+	vlan = xa_load(&port->vlans, vid);
+	if (!vlan)
+		return;
+
+	ice_eswitch_br_vlan_cleanup(port, vlan);
+}
+
+static int
+ice_eswitch_br_port_obj_add(struct net_device *netdev, const void *ctx,
+			    const struct switchdev_obj *obj,
+			    struct netlink_ext_ack *extack)
+{
+	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
+	struct switchdev_obj_port_vlan *vlan;
+	int err;
+
+	if (!br_port)
+		return -EINVAL;
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_ID_PORT_VLAN:
+		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+		err = ice_eswitch_br_port_vlan_add(br_port->bridge,
+						   br_port->vsi_idx, vlan->vid,
+						   vlan->flags, extack);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+static int
+ice_eswitch_br_port_obj_del(struct net_device *netdev, const void *ctx,
+			    const struct switchdev_obj *obj)
+{
+	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
+	struct switchdev_obj_port_vlan *vlan;
+
+	if (!br_port)
+		return -EINVAL;
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_ID_PORT_VLAN:
+		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+		ice_eswitch_br_port_vlan_del(br_port->bridge, br_port->vsi_idx,
+					     vlan->vid);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int
+ice_eswitch_br_port_obj_attr_set(struct net_device *netdev, const void *ctx,
+				 const struct switchdev_attr *attr,
+				 struct netlink_ext_ack *extack)
+{
+	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
+
+	if (!br_port)
+		return -EINVAL;
+
+	switch (attr->id) {
+	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
+		ice_eswitch_br_vlan_filtering_set(br_port->bridge,
+						  attr->u.vlan_filtering);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int
+ice_eswitch_br_event_blocking(struct notifier_block *nb, unsigned long event,
+			      void *ptr)
+{
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	int err;
+
+	switch (event) {
+	case SWITCHDEV_PORT_OBJ_ADD:
+		err = switchdev_handle_port_obj_add(dev, ptr,
+						    ice_eswitch_br_is_dev_valid,
+						    ice_eswitch_br_port_obj_add);
+		break;
+	case SWITCHDEV_PORT_OBJ_DEL:
+		err = switchdev_handle_port_obj_del(dev, ptr,
+						    ice_eswitch_br_is_dev_valid,
+						    ice_eswitch_br_port_obj_del);
+		break;
+	case SWITCHDEV_PORT_ATTR_SET:
+		err = switchdev_handle_port_attr_set(dev, ptr,
+						     ice_eswitch_br_is_dev_valid,
+						     ice_eswitch_br_port_obj_attr_set);
+		break;
+	default:
+		err = 0;
+	}
+
+	return notifier_from_errno(err);
+}
+
 static void
 ice_eswitch_br_port_deinit(struct ice_esw_br *bridge,
 			   struct ice_esw_br_port *br_port)
@@ -537,6 +813,7 @@ ice_eswitch_br_port_deinit(struct ice_esw_br *bridge,
 		vsi->vf->repr->br_port = NULL;
 
 	xa_erase(&bridge->ports, br_port->vsi_idx);
+	ice_eswitch_br_port_vlans_flush(br_port);
 	kfree(br_port);
 }
 
@@ -549,6 +826,8 @@ ice_eswitch_br_port_init(struct ice_esw_br *bridge)
 	if (!br_port)
 		return ERR_PTR(-ENOMEM);
 
+	xa_init(&br_port->vlans);
+
 	br_port->bridge = bridge;
 
 	return br_port;
@@ -852,6 +1131,7 @@ ice_eswitch_br_offloads_deinit(struct ice_pf *pf)
 		return;
 
 	unregister_netdevice_notifier(&br_offloads->netdev_nb);
+	unregister_switchdev_blocking_notifier(&br_offloads->switchdev_blk);
 	unregister_switchdev_notifier(&br_offloads->switchdev_nb);
 	destroy_workqueue(br_offloads->wq);
 	/* Although notifier block is unregistered just before,
@@ -895,6 +1175,15 @@ ice_eswitch_br_offloads_init(struct ice_pf *pf)
 		goto err_reg_switchdev_nb;
 	}
 
+	br_offloads->switchdev_blk.notifier_call =
+		ice_eswitch_br_event_blocking;
+	err = register_switchdev_blocking_notifier(&br_offloads->switchdev_blk);
+	if (err) {
+		dev_err(dev,
+			"Failed to register bridge blocking switchdev notifier\n");
+		goto err_reg_switchdev_blk;
+	}
+
 	br_offloads->netdev_nb.notifier_call = ice_eswitch_br_port_event;
 	err = register_netdevice_notifier(&br_offloads->netdev_nb);
 	if (err) {
@@ -906,6 +1195,8 @@ ice_eswitch_br_offloads_init(struct ice_pf *pf)
 	return 0;
 
 err_reg_netdev_nb:
+	unregister_switchdev_blocking_notifier(&br_offloads->switchdev_blk);
+err_reg_switchdev_blk:
 	unregister_switchdev_notifier(&br_offloads->switchdev_nb);
 err_reg_switchdev_nb:
 	destroy_workqueue(br_offloads->wq);
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
index 73ad81bad655..cf3e2615a62a 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
@@ -42,10 +42,16 @@ struct ice_esw_br_port {
 	enum ice_esw_br_port_type type;
 	struct ice_vsi *vsi;
 	u16 vsi_idx;
+	struct xarray vlans;
+};
+
+enum {
+	ICE_ESWITCH_BR_VLAN_FILTERING = BIT(0),
 };
 
 struct ice_esw_br {
 	struct ice_esw_br_offloads *br_offloads;
+	int flags;
 	int ifindex;
 
 	struct xarray ports;
@@ -57,6 +63,7 @@ struct ice_esw_br_offloads {
 	struct ice_pf *pf;
 	struct ice_esw_br *bridge;
 	struct notifier_block netdev_nb;
+	struct notifier_block switchdev_blk;
 	struct notifier_block switchdev_nb;
 
 	struct workqueue_struct *wq;
@@ -70,6 +77,11 @@ struct ice_esw_br_fdb_work {
 	unsigned long event;
 };
 
+struct ice_esw_br_vlan {
+	u16 vid;
+	u16 flags;
+};
+
 #define ice_nb_to_br_offloads(nb, nb_name) \
 	container_of(nb, \
 		     struct ice_esw_br_offloads, \
-- 
2.39.2



* [PATCH net-next 09/12] ice: implement bridge port vlan
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (7 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-21 16:35   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 10/12] ice: implement static version of ageing Wojciech Drewek
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>

Port VLAN here means a push and pop VLAN action on a specific VID.
The hardware has a few limitations:
- push and pop can't be used separately
- if port VLAN is used there can't be any trunk VLANs, because the pop
  action is applied to all traffic received by a VSI in port VLAN mode
- port VLAN mode on the uplink port isn't supported

Reflect these limitations in the code, using dev_info to inform the user
about an unsupported configuration.

In bridge mode, port VLAN has to be configured without resetting the
VFs. To do that, implement the ice_port_vlan_on/off() functions. They
only set up the correct vlan_ops to allow setting a port VLAN.

Port VLAN also has to be cleared without resetting the VF, which is not
supported right now. Change that by implementing the clear_port_vlan op.
As the previous VLAN configuration isn't always the same, store the
current config when creating the port VLAN and restore it in the clear
function.

Configuration steps:
- configure switchdev with bridge
- #bridge vlan add dev eth0 vid 120 pvid untagged
- #bridge vlan add dev eth1 vid 120 pvid untagged
- ping from VF0 to VF1

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |   1 +
 .../net/ethernet/intel/ice/ice_eswitch_br.c   |  88 +++++++-
 .../net/ethernet/intel/ice/ice_eswitch_br.h   |   1 +
 .../ethernet/intel/ice/ice_vf_vsi_vlan_ops.c  | 195 ++++++++++--------
 .../ethernet/intel/ice/ice_vf_vsi_vlan_ops.h  |   3 +
 .../net/ethernet/intel/ice/ice_vsi_vlan_lib.c |  84 +++++++-
 .../net/ethernet/intel/ice/ice_vsi_vlan_lib.h |   8 +
 .../net/ethernet/intel/ice/ice_vsi_vlan_ops.h |   1 +
 8 files changed, 291 insertions(+), 90 deletions(-)
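
Note on the hardware limitations listed in the commit message: the sketch below expresses them as a single userspace validation function. It is an illustration only. The mock_* names are hypothetical, the flag values are placeholders standing in for BRIDGE_VLAN_INFO_PVID and BRIDGE_VLAN_INFO_UNTAGGED, and VID 1 is assumed to be the default VLAN that is accepted without any port VLAN setup.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch. */
#define MOCK_VLAN_INFO_PVID	(1u << 1)
#define MOCK_VLAN_INFO_UNTAGGED	(1u << 2)

enum mock_port_type { MOCK_UPLINK_PORT, MOCK_VF_REPR_PORT };

struct mock_port {
	enum mock_port_type type;
	uint16_t pvid;		/* 0 means no port VLAN is configured */
};

/* Return 0 if the VLAN request is acceptable, a negative errno if not. */
static int mock_vlan_add_check(const struct mock_port *port, uint16_t vid,
			       uint16_t flags)
{
	bool pvid = flags & MOCK_VLAN_INFO_PVID;
	bool untagged = flags & MOCK_VLAN_INFO_UNTAGGED;

	/* An existing port VLAN excludes any further VLAN on the port. */
	if (port->pvid)
		return -EEXIST;

	/* Push (pvid) and pop (untagged) must be requested together. */
	if (pvid != untagged)
		return -EOPNOTSUPP;

	/* Neither flag set: a plain trunk VLAN. */
	if (!pvid)
		return 0;

	/* Default VLAN: accepted without configuring a port VLAN. */
	if (vid == 1)
		return 0;

	/* Port VLAN on the uplink port is not supported. */
	if (port->type == MOCK_UPLINK_PORT)
		return -EOPNOTSUPP;

	return 0;
}
```

For the configuration steps in the commit message, a request for VID 120 with both flags on a VF representor port would pass this check, while the same request on the uplink port, or a request carrying only one of the two flags, would be rejected with -EOPNOTSUPP.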

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 5b2ade5908e8..489934ddfbb8 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -369,6 +369,7 @@ struct ice_vsi {
 	u16 rx_buf_len;
 
 	struct ice_aqc_vsi_props info;	 /* VSI properties */
+	struct ice_vsi_vlan_info vlan_info;	/* vlan config to be restored */
 
 	/* VSI stats */
 	struct rtnl_link_stats64 net_stats;
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 56d36e397b12..a21eca5088f7 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -5,6 +5,8 @@
 #include "ice_eswitch_br.h"
 #include "ice_repr.h"
 #include "ice_switch.h"
+#include "ice_vlan.h"
+#include "ice_vf_vsi_vlan_ops.h"
 
 static const struct rhashtable_params ice_fdb_ht_params = {
 	.key_offset = offsetof(struct ice_esw_br_fdb_entry, data),
@@ -610,11 +612,26 @@ ice_eswitch_br_vlan_filtering_set(struct ice_esw_br *bridge, bool enable)
 		bridge->flags &= ~ICE_ESWITCH_BR_VLAN_FILTERING;
 }
 
+static void
+ice_eswitch_br_clear_pvid(struct ice_esw_br_port *port)
+{
+	struct ice_vsi_vlan_ops *vlan_ops =
+		ice_get_compat_vsi_vlan_ops(port->vsi);
+
+	vlan_ops->clear_port_vlan(port->vsi);
+
+	ice_vf_vsi_disable_port_vlan(port->vsi);
+
+	port->pvid = 0;
+}
+
 static void
 ice_eswitch_br_vlan_cleanup(struct ice_esw_br_port *port,
 			    struct ice_esw_br_vlan *vlan)
 {
 	xa_erase(&port->vlans, vlan->vid);
+	if (port->pvid == vlan->vid)
+		ice_eswitch_br_clear_pvid(port);
 	kfree(vlan);
 }
 
@@ -627,9 +644,50 @@ static void ice_eswitch_br_port_vlans_flush(struct ice_esw_br_port *port)
 		ice_eswitch_br_vlan_cleanup(port, vlan);
 }
 
+static int
+ice_eswitch_br_set_pvid(struct ice_esw_br_port *port,
+			struct ice_esw_br_vlan *vlan)
+{
+	struct ice_vlan port_vlan = ICE_VLAN(ETH_P_8021Q, vlan->vid, 0);
+	struct device *dev = ice_pf_to_dev(port->vsi->back);
+	struct ice_vsi_vlan_ops *vlan_ops;
+	int err;
+
+	if (port->pvid == vlan->vid || vlan->vid == 1)
+		return 0;
+
+	/* Setting port vlan on uplink isn't supported by hw */
+	if (port->type == ICE_ESWITCH_BR_UPLINK_PORT)
+		return -EOPNOTSUPP;
+
+	if (port->pvid) {
+		dev_info(dev,
+			 "Port VLAN (vsi=%u, vid=%u) already exists on the port, remove it before adding new one\n",
+			 port->vsi_idx, port->pvid);
+		return -EEXIST;
+	}
+
+	ice_vf_vsi_enable_port_vlan(port->vsi);
+
+	vlan_ops = ice_get_compat_vsi_vlan_ops(port->vsi);
+	err = vlan_ops->set_port_vlan(port->vsi, &port_vlan);
+	if (err)
+		return err;
+
+	err = vlan_ops->add_vlan(port->vsi, &port_vlan);
+	if (err)
+		return err;
+
+	ice_eswitch_br_port_vlans_flush(port);
+	port->pvid = vlan->vid;
+
+	return 0;
+}
+
 static struct ice_esw_br_vlan *
 ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
 {
+	struct device *dev = ice_pf_to_dev(port->vsi->back);
 	struct ice_esw_br_vlan *vlan;
 	int err;
 
@@ -639,14 +697,29 @@ ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
 
 	vlan->vid = vid;
 	vlan->flags = flags;
+	if ((flags & BRIDGE_VLAN_INFO_PVID) &&
+	    (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
+		err = ice_eswitch_br_set_pvid(port, vlan);
+		if (err)
+			goto err_set_pvid;
+	} else if ((flags & BRIDGE_VLAN_INFO_PVID) ||
+		   (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
+		dev_info(dev, "VLAN push and pop are supported only simultaneously\n");
+		return ERR_PTR(-EOPNOTSUPP);
+	}
 
 	err = xa_insert(&port->vlans, vlan->vid, vlan, GFP_KERNEL);
-	if (err) {
-		kfree(vlan);
-		return ERR_PTR(err);
-	}
+	if (err)
+		goto err_insert;
 
 	return vlan;
+
+err_insert:
+	if (port->pvid)
+		ice_eswitch_br_clear_pvid(port);
+err_set_pvid:
+	kfree(vlan);
+	return ERR_PTR(err);
 }
 
 static int
@@ -660,6 +733,13 @@ ice_eswitch_br_port_vlan_add(struct ice_esw_br *bridge, u16 vsi_idx, u16 vid,
 	if (!port)
 		return -EINVAL;
 
+	if (port->pvid) {
+		dev_info(ice_pf_to_dev(port->vsi->back),
+			 "Port VLAN (vsi=%u, vid=%d) exists on the port, remove it to add trunk VLANs\n",
+			 port->vsi_idx, port->pvid);
+		return -EEXIST;
+	}
+
 	vlan = xa_load(&port->vlans, vid);
 	if (vlan) {
 		if (vlan->flags == flags)
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
index cf3e2615a62a..b6eef068ea81 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
@@ -43,6 +43,7 @@ struct ice_esw_br_port {
 	struct ice_vsi *vsi;
 	u16 vsi_idx;
 	struct xarray vlans;
+	u16 pvid;
 };
 
 enum {
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
index b1ffb81893d4..447b4e6ef7e4 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
@@ -21,6 +21,108 @@ noop_vlan(struct ice_vsi __always_unused *vsi)
 	return 0;
 }
 
+static void ice_port_vlan_on(struct ice_vsi *vsi)
+{
+	struct ice_vsi_vlan_ops *vlan_ops;
+	struct ice_pf *pf = vsi->back;
+
+	if (ice_is_dvm_ena(&pf->hw)) {
+		vlan_ops = &vsi->outer_vlan_ops;
+
+		/* setup outer VLAN ops */
+		vlan_ops->set_port_vlan = ice_vsi_set_outer_port_vlan;
+		vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
+		vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
+		vlan_ops->ena_rx_filtering =
+			ice_vsi_ena_rx_vlan_filtering;
+
+		/* setup inner VLAN ops */
+		vlan_ops = &vsi->inner_vlan_ops;
+		vlan_ops->add_vlan = noop_vlan_arg;
+		vlan_ops->del_vlan = noop_vlan_arg;
+		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
+		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
+		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
+		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
+	} else {
+		vlan_ops = &vsi->inner_vlan_ops;
+
+		vlan_ops->set_port_vlan = ice_vsi_set_inner_port_vlan;
+		vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
+		vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
+		vlan_ops->ena_rx_filtering =
+			ice_vsi_ena_rx_vlan_filtering;
+	}
+}
+
+static void ice_port_vlan_off(struct ice_vsi *vsi)
+{
+	struct ice_vsi_vlan_ops *vlan_ops;
+	struct ice_pf *pf = vsi->back;
+
+	if (ice_is_dvm_ena(&pf->hw)) {
+		/* setup inner VLAN ops */
+		vlan_ops = &vsi->inner_vlan_ops;
+
+		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
+		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
+		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
+		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
+
+		vlan_ops = &vsi->outer_vlan_ops;
+
+		vlan_ops->del_vlan = ice_vsi_del_vlan;
+		vlan_ops->ena_stripping = ice_vsi_ena_outer_stripping;
+		vlan_ops->dis_stripping = ice_vsi_dis_outer_stripping;
+		vlan_ops->ena_insertion = ice_vsi_ena_outer_insertion;
+		vlan_ops->dis_insertion = ice_vsi_dis_outer_insertion;
+	} else {
+		vlan_ops = &vsi->inner_vlan_ops;
+
+		vlan_ops->del_vlan = ice_vsi_del_vlan;
+		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
+		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
+		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
+		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
+	}
+
+	if (!test_bit(ICE_FLAG_VF_VLAN_PRUNING, pf->flags))
+		vlan_ops->ena_rx_filtering = noop_vlan;
+	else
+		vlan_ops->ena_rx_filtering =
+			ice_vsi_ena_rx_vlan_filtering;
+}
+
+/**
+ * ice_vf_vsi_enable_port_vlan - Set VSI VLAN ops to support port VLAN
+ * @vsi: VF's VSI being configured
+ *
+ * This function does not create a port VLAN; it only sets up the VLAN ops
+ * through which a port VLAN can then be created on the VF VSI.
+ */
+void ice_vf_vsi_enable_port_vlan(struct ice_vsi *vsi)
+{
+	if (WARN_ON(!vsi->vf))
+		return;
+
+	ice_port_vlan_on(vsi);
+}
+
+/**
+ * ice_vf_vsi_disable_port_vlan - Clear VSI support for creating port VLAN
+ * @vsi: VF's VSI being configured
+ *
+ * The function should be called after removing port VLAN on VSI
+ * (using VLAN ops)
+ */
+void ice_vf_vsi_disable_port_vlan(struct ice_vsi *vsi)
+{
+	if (WARN_ON(!vsi->vf))
+		return;
+
+	ice_port_vlan_off(vsi);
+}
+
 /**
  * ice_vf_vsi_init_vlan_ops - Initialize default VSI VLAN ops for VF VSI
  * @vsi: VF's VSI being configured
@@ -39,91 +141,18 @@ void ice_vf_vsi_init_vlan_ops(struct ice_vsi *vsi)
 	if (WARN_ON(!vf))
 		return;
 
-	if (ice_is_dvm_ena(&pf->hw)) {
-		vlan_ops = &vsi->outer_vlan_ops;
+	if (ice_vf_is_port_vlan_ena(vf))
+		ice_port_vlan_on(vsi);
+	else
+		ice_port_vlan_off(vsi);
 
-		/* outer VLAN ops regardless of port VLAN config */
-		vlan_ops->add_vlan = ice_vsi_add_vlan;
-		vlan_ops->ena_tx_filtering = ice_vsi_ena_tx_vlan_filtering;
-		vlan_ops->dis_tx_filtering = ice_vsi_dis_tx_vlan_filtering;
-
-		if (ice_vf_is_port_vlan_ena(vf)) {
-			/* setup outer VLAN ops */
-			vlan_ops->set_port_vlan = ice_vsi_set_outer_port_vlan;
-			/* all Rx traffic should be in the domain of the
-			 * assigned port VLAN, so prevent disabling Rx VLAN
-			 * filtering
-			 */
-			vlan_ops->dis_rx_filtering = noop_vlan;
-			vlan_ops->ena_rx_filtering =
-				ice_vsi_ena_rx_vlan_filtering;
-
-			/* setup inner VLAN ops */
-			vlan_ops = &vsi->inner_vlan_ops;
-			vlan_ops->add_vlan = noop_vlan_arg;
-			vlan_ops->del_vlan = noop_vlan_arg;
-			vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
-			vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
-			vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
-			vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
-		} else {
-			vlan_ops->dis_rx_filtering =
-				ice_vsi_dis_rx_vlan_filtering;
-
-			if (!test_bit(ICE_FLAG_VF_VLAN_PRUNING, pf->flags))
-				vlan_ops->ena_rx_filtering = noop_vlan;
-			else
-				vlan_ops->ena_rx_filtering =
-					ice_vsi_ena_rx_vlan_filtering;
-
-			vlan_ops->del_vlan = ice_vsi_del_vlan;
-			vlan_ops->ena_stripping = ice_vsi_ena_outer_stripping;
-			vlan_ops->dis_stripping = ice_vsi_dis_outer_stripping;
-			vlan_ops->ena_insertion = ice_vsi_ena_outer_insertion;
-			vlan_ops->dis_insertion = ice_vsi_dis_outer_insertion;
-
-			/* setup inner VLAN ops */
-			vlan_ops = &vsi->inner_vlan_ops;
-
-			vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
-			vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
-			vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
-			vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
-		}
-	} else {
-		vlan_ops = &vsi->inner_vlan_ops;
+	vlan_ops = ice_is_dvm_ena(&pf->hw) ?
+		&vsi->outer_vlan_ops : &vsi->inner_vlan_ops;
 
-		/* inner VLAN ops regardless of port VLAN config */
-		vlan_ops->add_vlan = ice_vsi_add_vlan;
-		vlan_ops->dis_rx_filtering = ice_vsi_dis_rx_vlan_filtering;
-		vlan_ops->ena_tx_filtering = ice_vsi_ena_tx_vlan_filtering;
-		vlan_ops->dis_tx_filtering = ice_vsi_dis_tx_vlan_filtering;
-
-		if (ice_vf_is_port_vlan_ena(vf)) {
-			vlan_ops->set_port_vlan = ice_vsi_set_inner_port_vlan;
-			vlan_ops->ena_rx_filtering =
-				ice_vsi_ena_rx_vlan_filtering;
-			/* all Rx traffic should be in the domain of the
-			 * assigned port VLAN, so prevent disabling Rx VLAN
-			 * filtering
-			 */
-			vlan_ops->dis_rx_filtering = noop_vlan;
-		} else {
-			vlan_ops->dis_rx_filtering =
-				ice_vsi_dis_rx_vlan_filtering;
-			if (!test_bit(ICE_FLAG_VF_VLAN_PRUNING, pf->flags))
-				vlan_ops->ena_rx_filtering = noop_vlan;
-			else
-				vlan_ops->ena_rx_filtering =
-					ice_vsi_ena_rx_vlan_filtering;
-
-			vlan_ops->del_vlan = ice_vsi_del_vlan;
-			vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
-			vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
-			vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
-			vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
-		}
-	}
+	vlan_ops->add_vlan = ice_vsi_add_vlan;
+	vlan_ops->dis_rx_filtering = ice_vsi_dis_rx_vlan_filtering;
+	vlan_ops->ena_tx_filtering = ice_vsi_ena_tx_vlan_filtering;
+	vlan_ops->dis_tx_filtering = ice_vsi_dis_tx_vlan_filtering;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.h b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.h
index 875a4e615f39..845330b49608 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.h
+++ b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.h
@@ -11,6 +11,9 @@ struct ice_vsi;
 void ice_vf_vsi_cfg_dvm_legacy_vlan_mode(struct ice_vsi *vsi);
 void ice_vf_vsi_cfg_svm_legacy_vlan_mode(struct ice_vsi *vsi);
 
+void ice_vf_vsi_enable_port_vlan(struct ice_vsi *vsi);
+void ice_vf_vsi_disable_port_vlan(struct ice_vsi *vsi);
+
 #ifdef CONFIG_PCI_IOV
 void ice_vf_vsi_init_vlan_ops(struct ice_vsi *vsi);
 #else
diff --git a/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.c b/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.c
index 5b4a0abb4607..d4ce3c50672f 100644
--- a/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.c
@@ -202,6 +202,24 @@ int ice_vsi_dis_inner_insertion(struct ice_vsi *vsi)
 	return ice_vsi_manage_vlan_insertion(vsi);
 }
 
+static void
+ice_save_vlan_info(struct ice_aqc_vsi_props *info,
+		   struct ice_vsi_vlan_info *vlan)
+{
+	vlan->sw_flags2 = info->sw_flags2;
+	vlan->inner_vlan_flags = info->inner_vlan_flags;
+	vlan->outer_vlan_flags = info->outer_vlan_flags;
+}
+
+static void
+ice_restore_vlan_info(struct ice_aqc_vsi_props *info,
+		      struct ice_vsi_vlan_info *vlan)
+{
+	info->sw_flags2 = vlan->sw_flags2;
+	info->inner_vlan_flags = vlan->inner_vlan_flags;
+	info->outer_vlan_flags = vlan->outer_vlan_flags;
+}
+
 /**
  * __ice_vsi_set_inner_port_vlan - set port VLAN VSI context settings to enable a port VLAN
  * @vsi: the VSI to update
@@ -218,6 +236,7 @@ static int __ice_vsi_set_inner_port_vlan(struct ice_vsi *vsi, u16 pvid_info)
 	if (!ctxt)
 		return -ENOMEM;
 
+	ice_save_vlan_info(&vsi->info, &vsi->vlan_info);
 	ctxt->info = vsi->info;
 	info = &ctxt->info;
 	info->inner_vlan_flags = ICE_AQ_VSI_INNER_VLAN_TX_MODE_ACCEPTUNTAGGED |
@@ -259,6 +278,33 @@ int ice_vsi_set_inner_port_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan)
 	return __ice_vsi_set_inner_port_vlan(vsi, port_vlan_info);
 }
 
+int ice_vsi_clear_inner_port_vlan(struct ice_vsi *vsi)
+{
+	struct ice_hw *hw = &vsi->back->hw;
+	struct ice_aqc_vsi_props *info;
+	struct ice_vsi_ctx *ctxt;
+	int ret;
+
+	ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
+	if (!ctxt)
+		return -ENOMEM;
+
+	ice_restore_vlan_info(&vsi->info, &vsi->vlan_info);
+	vsi->info.port_based_inner_vlan = 0;
+	ctxt->info = vsi->info;
+	info = &ctxt->info;
+	info->valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_VLAN_VALID |
+					   ICE_AQ_VSI_PROP_SW_VALID);
+
+	ret = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
+	if (ret)
+		dev_info(ice_hw_to_dev(hw), "update VSI for port VLAN failed, err %d aq_err %s\n",
+			 ret, ice_aq_str(hw->adminq.sq_last_status));
+
+	kfree(ctxt);
+	return ret;
+}
+
 /**
  * ice_cfg_vlan_pruning - enable or disable VLAN pruning on the VSI
  * @vsi: VSI to enable or disable VLAN pruning on
@@ -647,6 +693,7 @@ __ice_vsi_set_outer_port_vlan(struct ice_vsi *vsi, u16 vlan_info, u16 tpid)
 	if (!ctxt)
 		return -ENOMEM;
 
+	ice_save_vlan_info(&vsi->info, &vsi->vlan_info);
 	ctxt->info = vsi->info;
 
 	ctxt->info.sw_flags2 |= ICE_AQ_VSI_SW_FLAG_RX_VLAN_PRUNE_ENA;
@@ -689,9 +736,6 @@ __ice_vsi_set_outer_port_vlan(struct ice_vsi *vsi, u16 vlan_info, u16 tpid)
  * used if DVM is supported. Also, this function should never be called directly
  * as it should be part of ice_vsi_vlan_ops if it's needed.
  *
- * This function does not support clearing the port VLAN as there is currently
- * no use case for this.
- *
  * Use the ice_vlan structure passed in to set this VSI in a port VLAN.
  */
 int ice_vsi_set_outer_port_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan)
@@ -705,3 +749,37 @@ int ice_vsi_set_outer_port_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan)
 
 	return __ice_vsi_set_outer_port_vlan(vsi, port_vlan_info, vlan->tpid);
 }
+
+/**
+ * ice_vsi_clear_outer_port_vlan - clear outer port vlan
+ * @vsi: VSI to configure
+ *
+ * Restore the VLAN configuration that was saved in vsi->vlan_info when the
+ * port VLAN was set, and clear the outer port VLAN.
+ */
+int ice_vsi_clear_outer_port_vlan(struct ice_vsi *vsi)
+{
+	struct ice_hw *hw = &vsi->back->hw;
+	struct ice_vsi_ctx *ctxt;
+	int err;
+
+	ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
+	if (!ctxt)
+		return -ENOMEM;
+
+	ice_restore_vlan_info(&vsi->info, &vsi->vlan_info);
+	vsi->info.port_based_outer_vlan = 0;
+	ctxt->info = vsi->info;
+
+	ctxt->info.valid_sections =
+		cpu_to_le16(ICE_AQ_VSI_PROP_OUTER_TAG_VALID |
+			    ICE_AQ_VSI_PROP_SW_VALID);
+
+	err = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
+	if (err)
+		dev_err(ice_pf_to_dev(vsi->back), "update VSI for clearing outer port based VLAN failed, err %d aq_err %s\n",
+			err, ice_aq_str(hw->adminq.sq_last_status));
+
+	kfree(ctxt);
+	return err;
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.h b/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.h
index f459909490ec..f0d84d11bd5b 100644
--- a/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_vsi_vlan_lib.h
@@ -7,6 +7,12 @@
 #include <linux/types.h>
 #include "ice_vlan.h"
 
+struct ice_vsi_vlan_info {
+	u8 sw_flags2;
+	u8 inner_vlan_flags;
+	u8 outer_vlan_flags;
+};
+
 struct ice_vsi;
 
 int ice_vsi_add_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan);
@@ -17,6 +23,7 @@ int ice_vsi_dis_inner_stripping(struct ice_vsi *vsi);
 int ice_vsi_ena_inner_insertion(struct ice_vsi *vsi, u16 tpid);
 int ice_vsi_dis_inner_insertion(struct ice_vsi *vsi);
 int ice_vsi_set_inner_port_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan);
+int ice_vsi_clear_inner_port_vlan(struct ice_vsi *vsi);
 
 int ice_vsi_ena_rx_vlan_filtering(struct ice_vsi *vsi);
 int ice_vsi_dis_rx_vlan_filtering(struct ice_vsi *vsi);
@@ -28,5 +35,6 @@ int ice_vsi_dis_outer_stripping(struct ice_vsi *vsi);
 int ice_vsi_ena_outer_insertion(struct ice_vsi *vsi, u16 tpid);
 int ice_vsi_dis_outer_insertion(struct ice_vsi *vsi);
 int ice_vsi_set_outer_port_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan);
+int ice_vsi_clear_outer_port_vlan(struct ice_vsi *vsi);
 
 #endif /* _ICE_VSI_VLAN_LIB_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_vsi_vlan_ops.h b/drivers/net/ethernet/intel/ice/ice_vsi_vlan_ops.h
index 5b47568f6256..b2d2330dedcb 100644
--- a/drivers/net/ethernet/intel/ice/ice_vsi_vlan_ops.h
+++ b/drivers/net/ethernet/intel/ice/ice_vsi_vlan_ops.h
@@ -21,6 +21,7 @@ struct ice_vsi_vlan_ops {
 	int (*ena_tx_filtering)(struct ice_vsi *vsi);
 	int (*dis_tx_filtering)(struct ice_vsi *vsi);
 	int (*set_port_vlan)(struct ice_vsi *vsi, struct ice_vlan *vlan);
+	int (*clear_port_vlan)(struct ice_vsi *vsi);
 };
 
 void ice_vsi_init_vlan_ops(struct ice_vsi *vsi);
-- 
2.39.2


* [PATCH net-next 10/12] ice: implement static version of ageing
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (8 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 09/12] ice: implement bridge port vlan Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-21 16:22   ` Alexander Lobakin
  2023-04-17  9:34 ` [PATCH net-next 11/12] ice: add tracepoints for the switchdev bridge Wojciech Drewek
  2023-04-17  9:34 ` [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats Wojciech Drewek
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>

Always remove FDB entries once the ageing time has expired.

Allow the user to set the ageing time using the port object attribute.
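
For readers following the ageing logic, here is a minimal userspace sketch in C of the expiry test, under stated assumptions: it models the kernel's wraparound-safe time_is_before_jiffies() comparison with a plain `now` argument in place of jiffies, and the demo_* names are hypothetical helpers for illustration, not functions from the driver.

```c
#include <stdbool.h>

/* Wraparound-safe "has the deadline passed?" check, modelled on the
 * kernel's time_is_before_jiffies(); 'now' stands in for jiffies.
 * Assumption: two's-complement conversion from unsigned long to long.
 */
static bool demo_deadline_passed(unsigned long deadline, unsigned long now)
{
	return (long)(deadline - now) < 0;
}

/* An entry is aged out when last_use + ageing_time lies strictly before
 * 'now'. Entries added by the user are never aged out.
 */
static bool demo_fdb_should_age_out(unsigned long last_use,
				    unsigned long ageing_time,
				    unsigned long now, bool added_by_user)
{
	if (added_by_user)
		return false;

	return demo_deadline_passed(last_use + ageing_time, now);
}
```

With a 300-tick ageing time, an entry created at tick 100 is kept at tick 400 and removed at tick 401; an entry flagged as added by the user is always kept.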

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 .../net/ethernet/intel/ice/ice_eswitch_br.c   | 46 +++++++++++++++++++
 .../net/ethernet/intel/ice/ice_eswitch_br.h   | 11 +++++
 2 files changed, 57 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index a21eca5088f7..6c3144f98100 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -8,6 +8,8 @@
 #include "ice_vlan.h"
 #include "ice_vf_vsi_vlan_ops.h"
 
+#define ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS 1000
+
 static const struct rhashtable_params ice_fdb_ht_params = {
 	.key_offset = offsetof(struct ice_esw_br_fdb_entry, data),
 	.key_len = sizeof(struct ice_esw_br_fdb_data),
@@ -440,6 +442,7 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
 	fdb_entry->br_port = br_port;
 	fdb_entry->flow = flow;
 	fdb_entry->dev = netdev;
+	fdb_entry->last_use = jiffies;
 	event = SWITCHDEV_FDB_ADD_TO_BRIDGE;
 
 	if (added_by_user) {
@@ -838,6 +841,10 @@ ice_eswitch_br_port_obj_attr_set(struct net_device *netdev, const void *ctx,
 		ice_eswitch_br_vlan_filtering_set(br_port->bridge,
 						  attr->u.vlan_filtering);
 		break;
+	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
+		br_port->bridge->ageing_time =
+			clock_t_to_jiffies(attr->u.ageing_time);
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -1011,6 +1018,7 @@ ice_eswitch_br_init(struct ice_esw_br_offloads *br_offloads, int ifindex)
 	INIT_LIST_HEAD(&bridge->fdb_list);
 	bridge->br_offloads = br_offloads;
 	bridge->ifindex = ifindex;
+	bridge->ageing_time = clock_t_to_jiffies(BR_DEFAULT_AGEING_TIME);
 	xa_init(&bridge->ports);
 	br_offloads->bridge = bridge;
 
@@ -1210,6 +1218,7 @@ ice_eswitch_br_offloads_deinit(struct ice_pf *pf)
 	if (!br_offloads)
 		return;
 
+	cancel_delayed_work_sync(&br_offloads->update_work);
 	unregister_netdevice_notifier(&br_offloads->netdev_nb);
 	unregister_switchdev_blocking_notifier(&br_offloads->switchdev_blk);
 	unregister_switchdev_notifier(&br_offloads->switchdev_nb);
@@ -1224,6 +1233,38 @@ ice_eswitch_br_offloads_deinit(struct ice_pf *pf)
 	rtnl_unlock();
 }
 
+static void ice_eswitch_br_update(struct ice_esw_br_offloads *br_offloads)
+{
+	struct ice_esw_br *bridge = br_offloads->bridge;
+	struct ice_esw_br_fdb_entry *entry, *tmp;
+
+	if (!bridge)
+		return;
+
+	rtnl_lock();
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
+		if (entry->flags & ICE_ESWITCH_BR_FDB_ADDED_BY_USER)
+			continue;
+
+		if (time_is_before_jiffies(entry->last_use +
+					   bridge->ageing_time))
+			ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge,
+								    entry);
+	}
+	rtnl_unlock();
+}
+
+static void ice_eswitch_br_update_work(struct work_struct *work)
+{
+	struct ice_esw_br_offloads *br_offloads =
+		ice_work_to_br_offloads(work);
+
+	ice_eswitch_br_update(br_offloads);
+
+	queue_delayed_work(br_offloads->wq, &br_offloads->update_work,
+			   msecs_to_jiffies(ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS));
+}
+
 int
 ice_eswitch_br_offloads_init(struct ice_pf *pf)
 {
@@ -1272,6 +1313,11 @@ ice_eswitch_br_offloads_init(struct ice_pf *pf)
 		goto err_reg_netdev_nb;
 	}
 
+	INIT_DELAYED_WORK(&br_offloads->update_work,
+			  ice_eswitch_br_update_work);
+	queue_delayed_work(br_offloads->wq, &br_offloads->update_work,
+			   msecs_to_jiffies(ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS));
+
 	return 0;
 
 err_reg_netdev_nb:
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
index b6eef068ea81..42fff681fb71 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
@@ -5,6 +5,7 @@
 #define _ICE_ESWITCH_BR_H_
 
 #include <linux/rhashtable.h>
+#include <linux/workqueue.h>
 
 struct ice_esw_br_fdb_data {
 	unsigned char addr[ETH_ALEN];
@@ -30,6 +31,8 @@ struct ice_esw_br_fdb_entry {
 	struct net_device *dev;
 	struct ice_esw_br_port *br_port;
 	struct ice_esw_br_flow *flow;
+
+	unsigned long last_use;
 };
 
 enum ice_esw_br_port_type {
@@ -58,6 +61,8 @@ struct ice_esw_br {
 	struct xarray ports;
 	struct rhashtable fdb_ht;
 	struct list_head fdb_list;
+
+	unsigned long ageing_time;
 };
 
 struct ice_esw_br_offloads {
@@ -68,6 +73,7 @@ struct ice_esw_br_offloads {
 	struct notifier_block switchdev_nb;
 
 	struct workqueue_struct *wq;
+	struct delayed_work update_work;
 };
 
 struct ice_esw_br_fdb_work {
@@ -88,6 +94,11 @@ struct ice_esw_br_vlan {
 		     struct ice_esw_br_offloads, \
 		     nb_name)
 
+#define ice_work_to_br_offloads(w) \
+	container_of(w, \
+		     struct ice_esw_br_offloads, \
+		     update_work.work)
+
 #define ice_work_to_fdb_work(w) \
 	container_of(w, \
 		     struct ice_esw_br_fdb_work, \
-- 
2.39.2


* [PATCH net-next 11/12] ice: add tracepoints for the switchdev bridge
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (9 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 10/12] ice: implement static version of ageing Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-17  9:34 ` [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats Wojciech Drewek
  11 siblings, 0 replies; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Pawel Chmielewski <pawel.chmielewski@intel.com>

Add tracepoints for the following events:
- Add FDB entry
- Delete FDB entry
- Create bridge VLAN
- Cleanup bridge VLAN
- Link port to the bridge
- Unlink port from the bridge

Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
---
 .../net/ethernet/intel/ice/ice_eswitch_br.c   |  9 ++
 drivers/net/ethernet/intel/ice/ice_trace.h    | 90 +++++++++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 6c3144f98100..4a69b3a67914 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -7,6 +7,7 @@
 #include "ice_switch.h"
 #include "ice_vlan.h"
 #include "ice_vf_vsi_vlan_ops.h"
+#include "ice_trace.h"
 
 #define ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS 1000
 
@@ -379,6 +380,7 @@ ice_eswitch_br_fdb_entry_find_and_delete(struct ice_esw_br *bridge,
 		return;
 	}
 
+	trace_ice_eswitch_br_fdb_entry_find_and_delete(fdb_entry);
 	ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, fdb_entry);
 }
 
@@ -456,6 +458,7 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
 		goto err_fdb_insert;
 
 	list_add(&fdb_entry->list, &bridge->fdb_list);
+	trace_ice_eswitch_br_fdb_entry_create(fdb_entry);
 
 	ice_eswitch_br_fdb_offload_notify(netdev, mac, vid, event);
 
@@ -632,6 +635,7 @@ static void
 ice_eswitch_br_vlan_cleanup(struct ice_esw_br_port *port,
 			    struct ice_esw_br_vlan *vlan)
 {
+	trace_ice_eswitch_br_vlan_cleanup(vlan);
 	xa_erase(&port->vlans, vlan->vid);
 	if (port->pvid == vlan->vid)
 		ice_eswitch_br_clear_pvid(port);
@@ -715,6 +719,8 @@ ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
 	if (err)
 		goto err_insert;
 
+	trace_ice_eswitch_br_vlan_create(vlan);
+
 	return vlan;
 
 err_insert:
@@ -1078,6 +1084,7 @@ ice_eswitch_br_port_unlink(struct ice_esw_br_offloads *br_offloads,
 		return -EINVAL;
 	}
 
+	trace_ice_eswitch_br_port_unlink(br_port);
 	ice_eswitch_br_port_deinit(br_port->bridge, br_port);
 	ice_eswitch_br_verify_deinit(br_offloads, br_port->bridge);
 
@@ -1106,6 +1113,7 @@ ice_eswitch_br_port_link(struct ice_esw_br_offloads *br_offloads,
 		struct ice_repr *repr = ice_netdev_to_repr(dev);
 
 		err = ice_eswitch_br_vf_repr_port_init(bridge, repr);
+		trace_ice_eswitch_br_port_link(repr->br_port);
 	} else {
 		struct net_device *ice_dev = dev;
 		struct ice_pf *pf;
@@ -1119,6 +1127,7 @@ ice_eswitch_br_port_link(struct ice_esw_br_offloads *br_offloads,
 		pf = ice_netdev_to_pf(ice_dev);
 
 		err = ice_eswitch_br_uplink_port_init(bridge, pf);
+		trace_ice_eswitch_br_port_link(pf->br_port);
 	}
 	if (err) {
 		NL_SET_ERR_MSG_MOD(extack, "Failed to init bridge port");
diff --git a/drivers/net/ethernet/intel/ice/ice_trace.h b/drivers/net/ethernet/intel/ice/ice_trace.h
index ae98d5a8ff60..b2f5c9fe0149 100644
--- a/drivers/net/ethernet/intel/ice/ice_trace.h
+++ b/drivers/net/ethernet/intel/ice/ice_trace.h
@@ -21,6 +21,7 @@
 #define _ICE_TRACE_H_
 
 #include <linux/tracepoint.h>
+#include "ice_eswitch_br.h"
 
 /* ice_trace() macro enables shared code to refer to trace points
  * like:
@@ -240,6 +241,95 @@ DEFINE_TX_TSTAMP_OP_EVENT(ice_tx_tstamp_fw_req);
 DEFINE_TX_TSTAMP_OP_EVENT(ice_tx_tstamp_fw_done);
 DEFINE_TX_TSTAMP_OP_EVENT(ice_tx_tstamp_complete);
 
+DECLARE_EVENT_CLASS(ice_esw_br_fdb_template,
+		    TP_PROTO(struct ice_esw_br_fdb_entry *fdb),
+		    TP_ARGS(fdb),
+		    TP_STRUCT__entry(__array(char, dev_name, IFNAMSIZ)
+				     __array(unsigned char, addr, ETH_ALEN)
+				     __field(u16, vid)
+				     __field(int, flags)),
+		    TP_fast_assign(strscpy(__entry->dev_name,
+					   netdev_name(fdb->dev),
+					   IFNAMSIZ);
+				   memcpy(__entry->addr, fdb->data.addr, ETH_ALEN);
+				   __entry->vid = fdb->data.vid;
+				   __entry->flags = fdb->flags;),
+		    TP_printk("net_device=%s addr=%pM vid=%u flags=%x",
+			      __entry->dev_name,
+			      __entry->addr,
+			      __entry->vid,
+			      __entry->flags)
+);
+
+DEFINE_EVENT(ice_esw_br_fdb_template,
+	     ice_eswitch_br_fdb_entry_create,
+	     TP_PROTO(struct ice_esw_br_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+);
+
+DEFINE_EVENT(ice_esw_br_fdb_template,
+	     ice_eswitch_br_fdb_entry_find_and_delete,
+	     TP_PROTO(struct ice_esw_br_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+);
+
+DECLARE_EVENT_CLASS(ice_esw_br_vlan_template,
+		    TP_PROTO(struct ice_esw_br_vlan *vlan),
+		    TP_ARGS(vlan),
+		    TP_STRUCT__entry(__field(u16, vid)
+				     __field(u16, flags)),
+		    TP_fast_assign(__entry->vid = vlan->vid;
+				   __entry->flags = vlan->flags;),
+		    TP_printk("vid=%u flags=%x",
+			      __entry->vid,
+			      __entry->flags)
+);
+
+DEFINE_EVENT(ice_esw_br_vlan_template,
+	     ice_eswitch_br_vlan_create,
+	     TP_PROTO(struct ice_esw_br_vlan *vlan),
+	     TP_ARGS(vlan)
+);
+
+DEFINE_EVENT(ice_esw_br_vlan_template,
+	     ice_eswitch_br_vlan_cleanup,
+	     TP_PROTO(struct ice_esw_br_vlan *vlan),
+	     TP_ARGS(vlan)
+);
+
+#define ICE_ESW_BR_PORT_NAME_L 16
+
+DECLARE_EVENT_CLASS(ice_esw_br_port_template,
+		    TP_PROTO(struct ice_esw_br_port *port),
+		    TP_ARGS(port),
+		    TP_STRUCT__entry(__field(u16, vport_num)
+				     __array(char, port_type, ICE_ESW_BR_PORT_NAME_L)),
+		    TP_fast_assign(__entry->vport_num = port->vsi_idx;
+					if (port->type == ICE_ESWITCH_BR_UPLINK_PORT)
+						strscpy(__entry->port_type,
+							"Uplink",
+							ICE_ESW_BR_PORT_NAME_L);
+					else
+						strscpy(__entry->port_type,
+							"VF Representor",
+							ICE_ESW_BR_PORT_NAME_L);),
+		    TP_printk("vport_num=%u port type=%s",
+			      __entry->vport_num,
+			      __entry->port_type)
+);
+
+DEFINE_EVENT(ice_esw_br_port_template,
+	     ice_eswitch_br_port_link,
+	     TP_PROTO(struct ice_esw_br_port *port),
+	     TP_ARGS(port)
+);
+
+DEFINE_EVENT(ice_esw_br_port_template,
+	     ice_eswitch_br_port_unlink,
+	     TP_PROTO(struct ice_esw_br_port *port),
+	     TP_ARGS(port)
+);
+
 /* End tracepoints */
 
 #endif /* _ICE_TRACE_H_ */
-- 
2.39.2


* [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats
  2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
                   ` (10 preceding siblings ...)
  2023-04-17  9:34 ` [PATCH net-next 11/12] ice: add tracepoints for the switchdev bridge Wojciech Drewek
@ 2023-04-17  9:34 ` Wojciech Drewek
  2023-04-21 16:32   ` Alexander Lobakin
  11 siblings, 1 reply; 46+ messages in thread
From: Wojciech Drewek @ 2023-04-17  9:34 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, alexandr.lobakin, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

Introduce a new ethtool statistic, 'fdb_cnt'. It reports how many
bridge FDB entries have been created on a given netdev.

Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h            | 2 ++
 drivers/net/ethernet/intel/ice/ice_eswitch_br.c | 6 ++++++
 drivers/net/ethernet/intel/ice/ice_ethtool.c    | 1 +
 3 files changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 489934ddfbb8..90e007942af6 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -350,6 +350,8 @@ struct ice_vsi {
 	u16 num_gfltr;
 	u16 num_bfltr;
 
+	u32 fdb_cnt;
+
 	/* RSS config */
 	u16 rss_table_size;	/* HW RSS table size */
 	u16 rss_size;		/* Allocated RSS queues */
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
index 4a69b3a67914..cfa4324bf1a2 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
@@ -330,6 +330,7 @@ static void
 ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
 				struct ice_esw_br_fdb_entry *fdb_entry)
 {
+	struct ice_vsi *vsi = fdb_entry->br_port->vsi;
 	struct ice_pf *pf = bridge->br_offloads->pf;
 
 	rhashtable_remove_fast(&bridge->fdb_ht, &fdb_entry->ht_node,
@@ -339,6 +340,7 @@ ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
 	ice_eswitch_br_flow_delete(pf, fdb_entry->flow);
 
 	kfree(fdb_entry);
+	vsi->fdb_cnt--;
 }
 
 static void
@@ -462,6 +464,8 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
 
 	ice_eswitch_br_fdb_offload_notify(netdev, mac, vid, event);
 
+	br_port->vsi->fdb_cnt++;
+
 	return;
 
 err_fdb_insert:
@@ -941,6 +945,7 @@ ice_eswitch_br_vf_repr_port_init(struct ice_esw_br *bridge,
 	br_port->vsi_idx = br_port->vsi->idx;
 	br_port->type = ICE_ESWITCH_BR_VF_REPR_PORT;
 	repr->br_port = br_port;
+	repr->src_vsi->fdb_cnt = 0;
 
 	err = xa_insert(&bridge->ports, br_port->vsi_idx, br_port, GFP_KERNEL);
 	if (err) {
@@ -966,6 +971,7 @@ ice_eswitch_br_uplink_port_init(struct ice_esw_br *bridge, struct ice_pf *pf)
 	br_port->vsi_idx = br_port->vsi->idx;
 	br_port->type = ICE_ESWITCH_BR_UPLINK_PORT;
 	pf->br_port = br_port;
+	vsi->fdb_cnt = 0;
 
 	err = xa_insert(&bridge->ports, br_port->vsi_idx, br_port, GFP_KERNEL);
 	if (err) {
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 8407c7175cf6..d06b2a688323 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -64,6 +64,7 @@ static const struct ice_stats ice_gstrings_vsi_stats[] = {
 	ICE_VSI_STAT("tx_linearize", tx_linearize),
 	ICE_VSI_STAT("tx_busy", tx_busy),
 	ICE_VSI_STAT("tx_restart", tx_restart),
+	ICE_VSI_STAT("fdb_cnt", fdb_cnt),
 };
 
 enum ice_ethtool_test_id {
-- 
2.39.2


* Re: [PATCH net-next 01/12] ice: Minor switchdev fixes
  2023-04-17  9:34 ` [PATCH net-next 01/12] ice: Minor switchdev fixes Wojciech Drewek
@ 2023-04-19 14:35   ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-19 14:35 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, alexandr.lobakin, david.m.ertman,
	michal.swiatkowski, marcin.szycik, pawel.chmielewski,
	sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:01 +0200

> Introduce a few fixes that are needed for bridge offload
> to work properly.
> 
> - Skip adv rule removal in ice_eswitch_disable_switchdev.
>   Advanced rules for ctrl VSI will be removed anyway when the
>   VSI will cleaned up, no need to do it explicitly.

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 7c04057c524c..f198c845631f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -385,7 +385,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
>  	}
>  	err = 0;
>  	/* check for changes in promiscuous modes */
> -	if (changed_flags & IFF_ALLMULTI) {
> +	if (changed_flags & IFF_ALLMULTI && !ice_is_switchdev_running(pf)) {

Nit: pls enclose bitops in a separate set of parentheses, i.e. in this case:

	if ((changed_flags & IFF_ALLMULTI) && ...

It's safer and also more readable I'd say (clearly states that `&`
is intended, not a typoed `&&`).
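
To make the precedence point concrete, here is a minimal userspace sketch.
The function names and the mask value are made up for illustration and are
not taken from the driver. It shows that the unparenthesized condition
already parses the way the patch intends, and what the `&&` typo would do
instead:

```c
#include <assert.h>
#include <stdbool.h>

/* Patch form: parsed as (flags & mask) && !running. */
static bool sync_implicit(unsigned int flags, unsigned int mask, bool running)
{
	return flags & mask && !running;
}

/* Suggested form: same result, intent spelled out. */
static bool sync_explicit(unsigned int flags, unsigned int mask, bool running)
{
	return (flags & mask) && !running;
}

/* The typo the parentheses guard against: `&&` where `&` was meant. */
static bool sync_typo(unsigned int flags, unsigned int mask, bool running)
{
	return flags && mask && !running;
}

static void precedence_demo(void)
{
	const unsigned int mask = 0x200u; /* illustrative bit value */

	/* Both spellings agree for the inputs tried here. */
	assert(sync_implicit(0x200u, mask, false) == sync_explicit(0x200u, mask, false));
	assert(sync_implicit(0x001u, mask, false) == sync_explicit(0x001u, mask, false));
	/* The typo fires even though the masked bit is clear. */
	assert(!sync_explicit(0x001u, mask, false));
	assert(sync_typo(0x001u, mask, false));
}
```

So the parentheses do not change behaviour here; they only make it obvious
to the reader (and to a later editor) which operator was meant.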

>  		if (vsi->current_netdev_flags & IFF_ALLMULTI) {
>  			err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS);
>  			if (err) {

Thanks,
Olek

* Re: [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV
  2023-04-17  9:34 ` [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV Wojciech Drewek
@ 2023-04-19 14:38   ` Alexander Lobakin
  2023-04-25 15:26   ` [Intel-wired-lan] " Michal Schmidt
  1 sibling, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-19 14:38 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:02 +0200

> From: Dave Ertman <david.m.ertman@intel.com>
> 
> There was a change previously to stop SR-IOV and LAG from existing on the
> same interface.  This was to prevent the violation of LACP (Link
> Aggregation Control Protocol).  The method to achieve this was to add a
> no-op Rx handler onto the netdev when SR-IOV VFs were present, thus
> blocking bonding, bridging, etc from claiming the interface by adding
> its own Rx handler.  Also, when an interface was added into an aggregate,
> then the SR-IOV capability was set to false.

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.h b/drivers/net/ethernet/intel/ice/ice_lag.h
> index 51b5cf467ce2..0bd6b96d7e01 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lag.h
> +++ b/drivers/net/ethernet/intel/ice/ice_lag.h
> @@ -29,59 +29,9 @@ struct ice_lag {
>  	/* each thing blocking bonding will increment this value by one.
>  	 * If this value is zero, then bonding is allowed.
>  	 */

^ this comment block actually belongs to @dis_lag, so it also needs to
be removed.

> -	u16 dis_lag;
>  	u8 role;
>  };
>  
>  int ice_init_lag(struct ice_pf *pf);
>  void ice_deinit_lag(struct ice_pf *pf);
> -rx_handler_result_t ice_lag_nop_handler(struct sk_buff **pskb);
[...]

Thanks,
Olek

* Re: [PATCH net-next 03/12] ice: Unset src prune on uplink VSI
  2023-04-17  9:34 ` [PATCH net-next 03/12] ice: Unset src prune on uplink VSI Wojciech Drewek
@ 2023-04-19 14:49   ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-19 14:49 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, alexandr.lobakin, david.m.ertman,
	michal.swiatkowski, marcin.szycik, pawel.chmielewski,
	sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:03 +0200

> In switchdev mode uplink VSI is supposed to receive all packets that
> were not matched by existing filters. If ICE_AQ_VSI_SW_FLAG_LOCAL_LB
> bit is unset and we have a filter associated with uplink VSI
> which matches on dst mac equal to MAC1, then packets with src mac equal
> to MAC1 will be pruned from reaching uplink VSI.

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> index 3de9556b89ac..60b123d3c9cf 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> @@ -4112,3 +4112,27 @@ void ice_vsi_ctx_clear_allow_override(struct ice_vsi_ctx *ctx)
>  {
>  	ctx->info.sec_flags &= ~ICE_AQ_VSI_SEC_FLAG_ALLOW_DEST_OVRD;
>  }
> +
> +/**
> + * ice_vsi_update_local_lb - update sw block in VSI with local loopback bit
> + * @vsi: pointer to VSI structure
> + * @set: set or unset the bit
> + */
> +int
> +ice_vsi_update_local_lb(struct ice_vsi *vsi, bool set)
> +{
> +	struct ice_vsi_ctx ctx = { 0 };

Nit: prefer `= { }` over `= { 0 }`, the latter may sometimes trigger
-Wmissing-field-initializers (I might be wrong here, but anyway).

> +
> +	ctx.info = vsi->info;

Can't it be combined with the initializer at declaration (you zero-initialize
@ctx either way)?

	struct ice_vsi_ctx ctx = {
		.info	= vsi->info,
	};
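
A minimal userspace sketch of why the combined form is equivalent: members
not named in a designated initializer are still zero-initialized. The
structures below are hypothetical stand-ins, not the real ice_vsi_ctx
layout:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures. */
struct demo_info {
	uint16_t valid_sections;
	uint8_t sw_flags;
};

struct demo_ctx {
	uint16_t vsi_num;
	uint16_t vsis_allocd;
	struct demo_info info;
};

/* Two-step form, as in the patch: zero everything, then copy info. */
static struct demo_ctx ctx_two_step(const struct demo_info *src)
{
	struct demo_ctx ctx = { 0 };

	ctx.info = *src;
	return ctx;
}

/* Combined form: unnamed members are zeroed by the initializer. */
static struct demo_ctx ctx_combined(const struct demo_info *src)
{
	struct demo_ctx ctx = {
		.info	= *src,
	};

	return ctx;
}

static void init_demo(void)
{
	struct demo_info src = { .valid_sections = 0x2, .sw_flags = 0x40 };
	struct demo_ctx a = ctx_two_step(&src);
	struct demo_ctx b = ctx_combined(&src);

	assert(a.vsi_num == b.vsi_num && b.vsi_num == 0);
	assert(a.vsis_allocd == b.vsis_allocd && b.vsis_allocd == 0);
	assert(b.info.valid_sections == 0x2 && b.info.sw_flags == 0x40);
}
```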

> +	ctx.info.valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_SW_VALID);
> +	if (set)
> +		ctx.info.sw_flags |= ICE_AQ_VSI_SW_FLAG_LOCAL_LB;
> +	else
> +		ctx.info.sw_flags &= ~ICE_AQ_VSI_SW_FLAG_LOCAL_LB;
> +
> +	if (ice_update_vsi(&vsi->back->hw, vsi->idx, &ctx, NULL))
> +		return -ENODEV;
> +
> +	vsi->info = ctx.info;
> +	return 0;
> +}
[...]

Thanks,
Olek

* Re: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
  2023-04-17  9:34 ` [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup Wojciech Drewek
@ 2023-04-19 15:23   ` Alexander Lobakin
  2023-04-20  9:54     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-19 15:23 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:04 +0200

> With this patch, ice driver is able to track if the port
> representors or uplink port were added to the linux bridge in
> switchdev mode. Listen for NETDEV_CHANGEUPPER events in order to
> detect this. ice_esw_br data structure reflects the linux bridge
> and stores all the ports of the bridge (ice_esw_br_port) in
> xarray, it's created when the first port is added to the bridge and
> freed once the last port is removed. Note that only one bridge is
> supported per eswitch.

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> index ac2971073fdd..5b2ade5908e8 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -511,6 +511,7 @@ struct ice_switchdev_info {
>  	struct ice_vsi *control_vsi;
>  	struct ice_vsi *uplink_vsi;
>  	bool is_running;
> +	struct ice_esw_br_offloads *br_offloads;

7-byte hole here unfortunately =\ After ::is_running. You can place
::br_offloads *before* ::is_running to avoid this (well, you'll still
have it, but as padding at the end of the structure).
...or change ::is_running to "unsigned long flags" to not waste 1 byte
for 1 bit and have 63 free flags more :D
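
For reference, the hole can be measured with offsetof(). This is a
userspace sketch with void pointers standing in for the driver types; the
exact numbers depend on the ABI (7 bytes is what a typical 64-bit target
gives), so the code only checks relationships that hold when pointers are
naturally aligned:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Layout as in the patch: the bool sits between pointers. */
struct demo_info_hole {
	void *control_vsi;
	void *uplink_vsi;
	bool is_running;
	void *br_offloads;
};

/* Suggested layout: the bool goes last, padding moves to the tail. */
struct demo_info_tail {
	void *control_vsi;
	void *uplink_vsi;
	void *br_offloads;
	bool is_running;
};

/* Padding bytes between is_running and the member following it. */
static size_t hole_after_flag(void)
{
	return offsetof(struct demo_info_hole, br_offloads) -
	       offsetof(struct demo_info_hole, is_running) - sizeof(bool);
}

/* Padding bytes after is_running at the end of the reordered struct. */
static size_t tail_padding(void)
{
	return sizeof(struct demo_info_tail) -
	       offsetof(struct demo_info_tail, is_running) - sizeof(bool);
}

static void layout_demo(void)
{
	/* Same amount of padding, only its position changes. */
	assert(hole_after_flag() == tail_padding());
	assert(sizeof(struct demo_info_hole) == sizeof(struct demo_info_tail));
}
```

As the comment above says, reordering does not shrink the structure by
itself; it moves the padding to the end, where a later small field can
reuse it. In the kernel tree, pahole reports the same information for the
real structure.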

>  };
>  
>  struct ice_agg_node {

[...]

> +static struct ice_esw_br_port *
> +ice_eswitch_br_netdev_to_port(struct net_device *dev)

Also const?

> +{
> +	if (ice_is_port_repr_netdev(dev)) {
> +		struct ice_repr *repr = ice_netdev_to_repr(dev);
> +
> +		return repr->br_port;
> +	} else if (netif_is_ice(dev)) {
> +		struct ice_pf *pf = ice_netdev_to_pf(dev);

Both @repr and @pf can also be const :p

> +
> +		return pf->br_port;
> +	}
> +
> +	return NULL;
> +}

[...]

> +static struct ice_esw_br_port *
> +ice_eswitch_br_port_init(struct ice_esw_br *bridge)
> +{
> +	struct ice_esw_br_port *br_port;
> +
> +	br_port = kzalloc(sizeof(*br_port), GFP_KERNEL);
> +	if (!br_port)
> +		return ERR_PTR(-ENOMEM);
> +
> +	br_port->bridge = bridge;

Since you always pass @bridge from the call site either way, does it
make sense to do that, or could you just assign ->bridge on the call
sites after a successful allocation?

> +
> +	return br_port;
> +}

[...]

> +static int
> +ice_eswitch_br_port_changeupper(struct notifier_block *nb, void *ptr)
> +{
> +	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> +	struct netdev_notifier_changeupper_info *info = ptr;
> +	struct ice_esw_br_offloads *br_offloads =
> +		ice_nb_to_br_offloads(nb, netdev_nb);

Maybe assign it outside the declaration block to avoid line wrap?

> +	struct netlink_ext_ack *extack;
> +	struct net_device *upper;
> +
> +	if (!ice_eswitch_br_is_dev_valid(dev))
> +		return 0;
> +
> +	upper = info->upper_dev;
> +	if (!netif_is_bridge_master(upper))
> +		return 0;
> +
> +	extack = netdev_notifier_info_to_extack(&info->info);
> +
> +	return info->linking ?
> +		ice_eswitch_br_port_link(br_offloads, dev, upper->ifindex,
> +					 extack) :
> +		ice_eswitch_br_port_unlink(br_offloads, dev, upper->ifindex,
> +					   extack);

And here do that via `if return else return` to avoid multi-line ternary?

> +}
> +
> +static int
> +ice_eswitch_br_port_event(struct notifier_block *nb,
> +			  unsigned long event, void *ptr)

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> new file mode 100644
> index 000000000000..53ea29569c36
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> @@ -0,0 +1,42 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (C) 2023, Intel Corporation. */
> +
> +#ifndef _ICE_ESWITCH_BR_H_
> +#define _ICE_ESWITCH_BR_H_
> +
> +enum ice_esw_br_port_type {
> +	ICE_ESWITCH_BR_UPLINK_PORT = 0,
> +	ICE_ESWITCH_BR_VF_REPR_PORT = 1,
> +};
> +
> +struct ice_esw_br_port {
> +	struct ice_esw_br *bridge;
> +	enum ice_esw_br_port_type type;

Also hole :s I'd move it one line below.

> +	struct ice_vsi *vsi;
> +	u16 vsi_idx;
> +};
> +
> +struct ice_esw_br {
> +	struct ice_esw_br_offloads *br_offloads;
> +	int ifindex;
> +
> +	struct xarray ports;

(not sure about this one, but potentially there can be a hole between
 those two)

> +};
> +
> +struct ice_esw_br_offloads {
> +	struct ice_pf *pf;
> +	struct ice_esw_br *bridge;
> +	struct notifier_block netdev_nb;
> +};
> +
> +#define ice_nb_to_br_offloads(nb, nb_name) \
> +	container_of(nb, \
> +		     struct ice_esw_br_offloads, \
> +		     nb_name)

Hmm, you use it only once and only with `netdev_nb` field. Do you plan
to add more call sites of this macro? Otherwise you could embed the
second argument into the macro itself (mentioned `netdev_nb`) or even
just open-code the whole macro in the sole call site.
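
A userspace sketch of the two shapes being discussed. demo_container_of()
is a simplified stand-in for the kernel's container_of() (no type
checking), and all demo_* names are hypothetical. The second notifier
member is modelled on the switchdev_nb field that appears later in the
series:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_nb {
	int priority;
};

struct demo_br_offloads {
	void *pf;
	struct demo_nb netdev_nb;
	struct demo_nb switchdev_nb;
};

/* Generic form: the caller names the embedded notifier member. */
#define demo_nb_to_br_offloads(nb, nb_name) \
	demo_container_of(nb, struct demo_br_offloads, nb_name)

/* Fixed-member form: only sensible if there is a single notifier. */
static struct demo_br_offloads *demo_netdev_nb_to_offloads(struct demo_nb *nb)
{
	return demo_container_of(nb, struct demo_br_offloads, netdev_nb);
}

static void container_demo(void)
{
	struct demo_br_offloads offloads = { 0 };

	assert(demo_nb_to_br_offloads(&offloads.netdev_nb, netdev_nb) == &offloads);
	assert(demo_nb_to_br_offloads(&offloads.switchdev_nb, switchdev_nb) == &offloads);
	assert(demo_netdev_nb_to_offloads(&offloads.netdev_nb) == &offloads);
}
```

With one embedded notifier the fixed-member helper is enough; once a second
member exists, the two-argument macro avoids duplicating the helper.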

> +
> +void
> +ice_eswitch_br_offloads_deinit(struct ice_pf *pf);
> +int
> +ice_eswitch_br_offloads_init(struct ice_pf *pf);
> +
> +#endif /* _ICE_ESWITCH_BR_H_ */
[...]

Thanks,
Olek

* Re: [PATCH net-next 05/12] ice: Switchdev FDB events support
  2023-04-17  9:34 ` [PATCH net-next 05/12] ice: Switchdev FDB events support Wojciech Drewek
@ 2023-04-19 15:38   ` Alexander Lobakin
  2023-04-20 11:27     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-19 15:38 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, alexandr.lobakin, david.m.ertman,
	michal.swiatkowski, marcin.szycik, pawel.chmielewski,
	sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:05 +0200

> Listen for SWITCHDEV_FDB_{ADD|DEL}_TO_DEVICE events while in switchdev
> mode. Accept these events on both uplink and VF PR ports. Add HW
> rules in newly created workqueue. FDB entries are stored in rhashtable
> for lookup when removing the entry and in the list for cleanup
> purpose. Direction of the HW rule depends on the type of the ports
> on which the FDB event was received:

[...]

> +static int
> +ice_eswitch_br_rule_delete(struct ice_hw *hw, struct ice_rule_query_data *rule)
> +{
> +	int err;
> +
> +	if (!rule)
> +		return -EINVAL;
> +
> +	err = ice_rem_adv_rule_by_id(hw, rule);
> +	kfree(rule);
> +
> +	return err;
> +}
> +
> +static struct ice_rule_query_data *
> +ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,

(no types shorter than u32 on the stack reminder)

> +			       const unsigned char *mac)
> +{
> +	struct ice_adv_rule_info rule_info = { 0 };
> +	struct ice_rule_query_data *rule;
> +	struct ice_adv_lkup_elem *list;
> +	u16 lkups_cnt = 1;

Why have it as variable if it doesn't change? Just embed it into the
ice_add_adv_rule() call and replace kcalloc() with kzalloc().

> +	int err;
> +
> +	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> +	if (!rule)
> +		return ERR_PTR(-ENOMEM);
> +
> +	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);

[...]

> +	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac);
> +	if (IS_ERR(fwd_rule)) {
> +		err = PTR_ERR(fwd_rule);
> +		dev_err(dev, "Failed to create eswitch bridge %sgress forward rule, err: %d\n",
> +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> +			err);
> +		goto err_fwd_rule;

A bit suboptimal. To print errno pointer, you have %pe modifier, so you
can just print err as:

		... forward rule, err: %pe\n", ... : "in", fwd_rule);

Then you don't need @err at all and then below...

> +	}
> +
> +	flow->fwd_rule = fwd_rule;
> +
> +	return flow;
> +
> +err_fwd_rule:
> +	kfree(flow);
> +
> +	return ERR_PTR(err);

...you can return @fwd_rule directly.
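
A userspace sketch of the error-pointer pattern, for illustration only.
The demo_* helpers imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(), and
demo_err_cast() imitates ERR_CAST(), which is the usual way to pass an
error pointer on when the function's return type differs from the type of
the failed pointer (here the function returns a flow, not a rule). The
structures are hypothetical stand-ins:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno in the top of the address range. */
static void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}
/* Carry the encoded errno across pointer types. */
static void *demo_err_cast(const void *p) { return (void *)(uintptr_t)p; }

struct demo_rule { int id; };
struct demo_flow { struct demo_rule *fwd_rule; };

static struct demo_rule *demo_rule_create(int fail)
{
	struct demo_rule *rule;

	if (fail)
		return demo_err_ptr(-ENOMEM);

	rule = calloc(1, sizeof(*rule));
	return rule ? rule : demo_err_ptr(-ENOMEM);
}

static struct demo_flow *demo_flow_create(int fail)
{
	struct demo_flow *flow = calloc(1, sizeof(*flow));
	struct demo_rule *rule;

	if (!flow)
		return demo_err_ptr(-ENOMEM);

	rule = demo_rule_create(fail);
	if (demo_is_err(rule)) {
		free(flow);
		/* No separate err variable: forward the encoded errno. */
		return demo_err_cast(rule);
	}

	flow->fwd_rule = rule;
	return flow;
}

static void errptr_demo(void)
{
	struct demo_flow *ok = demo_flow_create(0);
	struct demo_flow *bad = demo_flow_create(1);

	assert(!demo_is_err(ok) && ok->fwd_rule);
	assert(demo_is_err(bad) && demo_ptr_err(bad) == -ENOMEM);
	free(ok->fwd_rule);
	free(ok);
}
```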

> +}
> +
> +static struct ice_esw_br_fdb_entry *
> +ice_eswitch_br_fdb_find(struct ice_esw_br *bridge, const unsigned char *mac,
> +			u16 vid)
> +{
> +	struct ice_esw_br_fdb_data data = {};

(nit: assign `vid` here)

> +
> +	ether_addr_copy(data.addr, mac);
> +	data.vid = vid;
> +	return rhashtable_lookup_fast(&bridge->fdb_ht, &data,
> +				      ice_fdb_ht_params);
> +}

[...]

> +static void
> +ice_eswitch_br_fdb_offload_notify(struct net_device *dev,
> +				  const unsigned char *mac, u16 vid,
> +				  unsigned long val)
> +{
> +	struct switchdev_notifier_fdb_info fdb_info;
> +
> +	fdb_info.addr = mac;
> +	fdb_info.vid = vid;
> +	fdb_info.offloaded = true;

(same for all of them. Declare-time initializer is faster BTW)

> +	call_switchdev_notifiers(val, dev, &fdb_info.info, NULL);
> +}

[...]

> +static int
> +ice_eswitch_br_switchdev_event(struct notifier_block *nb,
> +			       unsigned long event, void *ptr)
> +{
> +	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
> +	struct ice_esw_br_offloads *br_offloads =
> +		ice_nb_to_br_offloads(nb, switchdev_nb);
> +	struct netlink_ext_ack *extack =
> +		switchdev_notifier_info_to_extack(ptr);

(initialize-later-to-avoid-line-breaks?)

> +	struct switchdev_notifier_fdb_info *fdb_info;
> +	struct switchdev_notifier_info *info = ptr;
> +	struct ice_esw_br_fdb_work *work;
> +	struct net_device *upper;
> +	struct ice_esw_br_port *br_port;

RCT :s

> +
> +	upper = netdev_master_upper_dev_get_rcu(dev);
> +	if (!upper)
> +		return NOTIFY_DONE;
> +
> +	if (!netif_is_bridge_master(upper))
> +		return NOTIFY_DONE;
> +
> +	if (!ice_eswitch_br_is_dev_valid(dev))
> +		return NOTIFY_DONE;
> +
> +	br_port = ice_eswitch_br_netdev_to_port(dev);
> +	if (!br_port)
> +		return NOTIFY_DONE;
> +
> +	switch (event) {
> +	case SWITCHDEV_FDB_ADD_TO_DEVICE:
> +	case SWITCHDEV_FDB_DEL_TO_DEVICE:
> +		fdb_info = container_of(info,
> +					struct switchdev_notifier_fdb_info,

Nit: `typeof(*fdb_info)` is shorter and would probably fit into the prev
line.
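
A userspace sketch of the shorter spelling, with hypothetical demo_*
structures in place of the switchdev ones. It uses `__typeof__`, the GNU
spelling accepted by gcc and clang in all language modes; the kernel
itself writes plain `typeof`:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_notifier_info {
	int dev_id;
};

struct demo_notifier_fdb_info {
	struct demo_notifier_info info;
	unsigned short vid;
};

static unsigned short demo_vid_from_info(struct demo_notifier_info *info)
{
	struct demo_notifier_fdb_info *fdb_info;

	/* The operand of __typeof__ is not evaluated, only its type is used. */
	fdb_info = demo_container_of(info, __typeof__(*fdb_info), info);

	return fdb_info->vid;
}

static void typeof_demo(void)
{
	struct demo_notifier_fdb_info fdb = { .info = { .dev_id = 3 }, .vid = 42 };

	assert(demo_vid_from_info(&fdb.info) == 42);
}
```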

> +					info);
> +
> +		work = ice_eswitch_br_fdb_work_alloc(fdb_info, br_port, dev,
> +						     event);

[...]

> +enum {
> +	ICE_ESWITCH_BR_FDB_ADDED_BY_USER = BIT(0),
> +};
> +
> +struct ice_esw_br_fdb_entry {
> +	struct ice_esw_br_fdb_data data;
> +	struct rhash_head ht_node;
> +	struct list_head list;
> +
> +	int flags;

They can't be negative I believe? u32 then? Also I think here's a 4-byte
hole :s But since all of the members here except this one are 8-byte
aligned, you can't avoid it (can be filled anytime later with some other
<= 4-byte field)

> +
> +	struct net_device *dev;
> +	struct ice_esw_br_port *br_port;
> +	struct ice_esw_br_flow *flow;
> +};
[...]

Thanks,
Olek

* RE: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
  2023-04-19 15:23   ` Alexander Lobakin
@ 2023-04-20  9:54     ` Drewek, Wojciech
  2023-04-20 10:46       ` Drewek, Wojciech
  2023-04-20 16:51       ` Alexander Lobakin
  0 siblings, 2 replies; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-20  9:54 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

Thanks for the review, Olek!

Most of the comments sound reasonable to me (and I will include them) with some exceptions.


> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Wednesday, April 19, 2023 17:24
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:04 +0200
> 
> > With this patch, ice driver is able to track if the port
> > representors or uplink port were added to the linux bridge in
> > switchdev mode. Listen for NETDEV_CHANGEUPPER events in order to
> > detect this. ice_esw_br data structure reflects the linux bridge
> > and stores all the ports of the bridge (ice_esw_br_port) in
> > xarray, it's created when the first port is added to the bridge and
> > freed once the last port is removed. Note that only one bridge is
> > supported per eswitch.
> 
> [...]
> 
> > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > index ac2971073fdd..5b2ade5908e8 100644
> > --- a/drivers/net/ethernet/intel/ice/ice.h
> > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > @@ -511,6 +511,7 @@ struct ice_switchdev_info {
> >  	struct ice_vsi *control_vsi;
> >  	struct ice_vsi *uplink_vsi;
> >  	bool is_running;
> > +	struct ice_esw_br_offloads *br_offloads;
> 
> 7-byte hole here unfortunately =\ After ::is_running. You can place
> ::br_offloads *before* ::is_running to avoid this (well, you'll still
> have it, but as padding at the end of the structure).
> ...or change ::is_running to "unsigned long flags" to not waste 1 byte
> for 1 bit and have 63 free flags more :D
> 
> >  };
> >
> >  struct ice_agg_node {
> 
> [...]
> 
> > +static struct ice_esw_br_port *
> > +ice_eswitch_br_netdev_to_port(struct net_device *dev)
> 
> Also const?
> 
> > +{
> > +	if (ice_is_port_repr_netdev(dev)) {
> > +		struct ice_repr *repr = ice_netdev_to_repr(dev);
> > +
> > +		return repr->br_port;
> > +	} else if (netif_is_ice(dev)) {
> > +		struct ice_pf *pf = ice_netdev_to_pf(dev);
> 
> Both @repr and @pf can also be const :p
> 
> > +
> > +		return pf->br_port;
> > +	}
> > +
> > +	return NULL;
> > +}
> 
> [...]
> 
> > +static struct ice_esw_br_port *
> > +ice_eswitch_br_port_init(struct ice_esw_br *bridge)
> > +{
> > +	struct ice_esw_br_port *br_port;
> > +
> > +	br_port = kzalloc(sizeof(*br_port), GFP_KERNEL);
> > +	if (!br_port)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	br_port->bridge = bridge;
> 
> Since you always pass @bridge from the call site either way, does it
> make sense to do that or you could just assign -> bridge on the call
> sites after a successful allocation?

I could do that but I prefer to keep it this way.
We have two types of ports and this function is generic: it sets up
things common to both types, including the bridge ref.
Are you ok with it?

> 
> > +
> > +	return br_port;
> > +}
> 
> [...]
> 
> > +static int
> > +ice_eswitch_br_port_changeupper(struct notifier_block *nb, void *ptr)
> > +{
> > +	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> > +	struct netdev_notifier_changeupper_info *info = ptr;
> > +	struct ice_esw_br_offloads *br_offloads =
> > +		ice_nb_to_br_offloads(nb, netdev_nb);
> 
> Maybe assign it outside the declaration block to avoid line wrap?
> 
> > +	struct netlink_ext_ack *extack;
> > +	struct net_device *upper;
> > +
> > +	if (!ice_eswitch_br_is_dev_valid(dev))
> > +		return 0;
> > +
> > +	upper = info->upper_dev;
> > +	if (!netif_is_bridge_master(upper))
> > +		return 0;
> > +
> > +	extack = netdev_notifier_info_to_extack(&info->info);
> > +
> > +	return info->linking ?
> > +		ice_eswitch_br_port_link(br_offloads, dev, upper->ifindex,
> > +					 extack) :
> > +		ice_eswitch_br_port_unlink(br_offloads, dev, upper->ifindex,
> > +					   extack);
> 
> And here do that via `if return else return` to avoid multi-line ternary?
> 
> > +}
> > +
> > +static int
> > +ice_eswitch_br_port_event(struct notifier_block *nb,
> > +			  unsigned long event, void *ptr)
> 
> [...]
> 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > new file mode 100644
> > index 000000000000..53ea29569c36
> > --- /dev/null
> > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > @@ -0,0 +1,42 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/* Copyright (C) 2023, Intel Corporation. */
> > +
> > +#ifndef _ICE_ESWITCH_BR_H_
> > +#define _ICE_ESWITCH_BR_H_
> > +
> > +enum ice_esw_br_port_type {
> > +	ICE_ESWITCH_BR_UPLINK_PORT = 0,
> > +	ICE_ESWITCH_BR_VF_REPR_PORT = 1,
> > +};
> > +
> > +struct ice_esw_br_port {
> > +	struct ice_esw_br *bridge;
> > +	enum ice_esw_br_port_type type;
> 
> Also hole :s I'd move it one line below.
> 
> > +	struct ice_vsi *vsi;
> > +	u16 vsi_idx;
> > +};
> > +
> > +struct ice_esw_br {
> > +	struct ice_esw_br_offloads *br_offloads;
> > +	int ifindex;
> > +
> > +	struct xarray ports;
> 
> (not sure about this one, but potentially there can be a hole between
>  those two)

Move ifindex to the end?

> 
> > +};
> > +
> > +struct ice_esw_br_offloads {
> > +	struct ice_pf *pf;
> > +	struct ice_esw_br *bridge;
> > +	struct notifier_block netdev_nb;
> > +};
> > +
> > +#define ice_nb_to_br_offloads(nb, nb_name) \
> > +	container_of(nb, \
> > +		     struct ice_esw_br_offloads, \
> > +		     nb_name)
> 
> Hmm, you use it only once and only with `netdev_nb` field. Do you plan
> to add more call sites of this macro? Otherwise you could embed the
> second argument into the macro itself (mentioned `netdev_nb`) or even
> just open-code the whole macro in the sole call site.

In the next patch it is used with a different nb_name (switchdev_nb).

> 
> > +
> > +void
> > +ice_eswitch_br_offloads_deinit(struct ice_pf *pf);
> > +int
> > +ice_eswitch_br_offloads_init(struct ice_pf *pf);
> > +
> > +#endif /* _ICE_ESWITCH_BR_H_ */
> [...]
> 
> Thanks,
> Olek

* RE: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
  2023-04-20  9:54     ` Drewek, Wojciech
@ 2023-04-20 10:46       ` Drewek, Wojciech
  2023-04-20 16:53         ` Alexander Lobakin
  2023-04-20 16:51       ` Alexander Lobakin
  1 sibling, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-20 10:46 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Drewek, Wojciech
> Sent: Thursday, April 20, 2023 11:54
> To: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: RE: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
> 
> Thanks for review Olek!
> 
> Most of the comments sound reasonable to me (and I will include them) with some exceptions.
> 
> 
> > -----Original Message-----
> > From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> > Sent: Wednesday, April 19, 2023 17:24
> > To: Drewek, Wojciech <wojciech.drewek@intel.com>
> > Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> > michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> > Samudrala, Sridhar <sridhar.samudrala@intel.com>
> > Subject: Re: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
> >
> > From: Wojciech Drewek <wojciech.drewek@intel.com>
> > Date: Mon, 17 Apr 2023 11:34:04 +0200
> >
> > > With this patch, ice driver is able to track if the port
> > > representors or uplink port were added to the linux bridge in
> > > switchdev mode. Listen for NETDEV_CHANGEUPPER events in order to
> > > detect this. ice_esw_br data structure reflects the linux bridge
> > > and stores all the ports of the bridge (ice_esw_br_port) in
> > > xarray, it's created when the first port is added to the bridge and
> > > freed once the last port is removed. Note that only one bridge is
> > > supported per eswitch.
> >
> > [...]
> >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > > index ac2971073fdd..5b2ade5908e8 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > > @@ -511,6 +511,7 @@ struct ice_switchdev_info {
> > >  	struct ice_vsi *control_vsi;
> > >  	struct ice_vsi *uplink_vsi;
> > >  	bool is_running;
> > > +	struct ice_esw_br_offloads *br_offloads;
> >
> > 7-byte hole here unfortunately =\ After ::is_running. You can place
> > ::br_offloads *before* ::is_running to avoid this (well, you'll still
> > have it, but as padding at the end of the structure).
> > ...or change ::is_running to "unsigned long flags" to not waste 1 byte
> > for 1 bit and have 63 free flags more :D
> >
> > >  };
> > >
> > >  struct ice_agg_node {
> >
> > [...]
> >
> > > +static struct ice_esw_br_port *
> > > +ice_eswitch_br_netdev_to_port(struct net_device *dev)
> >
> > Also const?

This function changes a bit in "ice: Accept LAG netdevs in bridge offloads".
With the changes introduced in that commit, I don't think making @dev const is a good option.

> >
> > > +{
> > > +	if (ice_is_port_repr_netdev(dev)) {
> > > +		struct ice_repr *repr = ice_netdev_to_repr(dev);
> > > +
> > > +		return repr->br_port;
> > > +	} else if (netif_is_ice(dev)) {
> > > +		struct ice_pf *pf = ice_netdev_to_pf(dev);
> >
> > Both @repr and @pf can also be const :p

Repr makes sense to me. The second part will change later, so I think
there is no point in making it const.

> >
> > > +
> > > +		return pf->br_port;
> > > +	}
> > > +
> > > +	return NULL;
> > > +}
> >
> > [...]
> >
> > > +static struct ice_esw_br_port *
> > > +ice_eswitch_br_port_init(struct ice_esw_br *bridge)
> > > +{
> > > +	struct ice_esw_br_port *br_port;
> > > +
> > > +	br_port = kzalloc(sizeof(*br_port), GFP_KERNEL);
> > > +	if (!br_port)
> > > +		return ERR_PTR(-ENOMEM);
> > > +
> > > +	br_port->bridge = bridge;
> >
> > Since you always pass @bridge from the call site either way, does it
> > make sense to do that or you could just assign -> bridge on the call
> > sites after a successful allocation?
> 
> I could do that but I prefer to keep it this way.
> We have two types of ports and this function is generic: it sets up
> things common to both types, including the bridge ref.
> Are you ok with it?
> 
> >
> > > +
> > > +	return br_port;
> > > +}
> >
> > [...]
> >
> > > +static int
> > > +ice_eswitch_br_port_changeupper(struct notifier_block *nb, void *ptr)
> > > +{
> > > +	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> > > +	struct netdev_notifier_changeupper_info *info = ptr;
> > > +	struct ice_esw_br_offloads *br_offloads =
> > > +		ice_nb_to_br_offloads(nb, netdev_nb);
> >
> > Maybe assign it outside the declaration block to avoid line wrap?
> >
> > > +	struct netlink_ext_ack *extack;
> > > +	struct net_device *upper;
> > > +
> > > +	if (!ice_eswitch_br_is_dev_valid(dev))
> > > +		return 0;
> > > +
> > > +	upper = info->upper_dev;
> > > +	if (!netif_is_bridge_master(upper))
> > > +		return 0;
> > > +
> > > +	extack = netdev_notifier_info_to_extack(&info->info);
> > > +
> > > +	return info->linking ?
> > > +		ice_eswitch_br_port_link(br_offloads, dev, upper->ifindex,
> > > +					 extack) :
> > > +		ice_eswitch_br_port_unlink(br_offloads, dev, upper->ifindex,
> > > +					   extack);
> >
> > And here do that via `if return else return` to avoid multi-line ternary?
> >
> > > +}
> > > +
> > > +static int
> > > +ice_eswitch_br_port_event(struct notifier_block *nb,
> > > +			  unsigned long event, void *ptr)
> >
> > [...]
> >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > > new file mode 100644
> > > index 000000000000..53ea29569c36
> > > --- /dev/null
> > > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > > @@ -0,0 +1,42 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +/* Copyright (C) 2023, Intel Corporation. */
> > > +
> > > +#ifndef _ICE_ESWITCH_BR_H_
> > > +#define _ICE_ESWITCH_BR_H_
> > > +
> > > +enum ice_esw_br_port_type {
> > > +	ICE_ESWITCH_BR_UPLINK_PORT = 0,
> > > +	ICE_ESWITCH_BR_VF_REPR_PORT = 1,
> > > +};
> > > +
> > > +struct ice_esw_br_port {
> > > +	struct ice_esw_br *bridge;
> > > +	enum ice_esw_br_port_type type;
> >
> > Also hole :s I'd move it one line below.
> >
> > > +	struct ice_vsi *vsi;
> > > +	u16 vsi_idx;
> > > +};
> > > +
> > > +struct ice_esw_br {
> > > +	struct ice_esw_br_offloads *br_offloads;
> > > +	int ifindex;
> > > +
> > > +	struct xarray ports;
> >
> > (not sure about this one, but potentially there can be a hole between
> >  those two)
> 
> Move ifindex to the end?
> 
> >
> > > +};
> > > +
> > > +struct ice_esw_br_offloads {
> > > +	struct ice_pf *pf;
> > > +	struct ice_esw_br *bridge;
> > > +	struct notifier_block netdev_nb;
> > > +};
> > > +
> > > +#define ice_nb_to_br_offloads(nb, nb_name) \
> > > +	container_of(nb, \
> > > +		     struct ice_esw_br_offloads, \
> > > +		     nb_name)
> >
> > Hmm, you use it only once and only with `netdev_nb` field. Do you plan
> > to add more call sites of this macro? Otherwise you could embed the
> > second argument into the macro itself (mentioned `netdev_nb`) or even
> > just open-code the whole macro in the sole call site.
> 
> In the next patch it is used with a different nb_name (switchdev_nb).
> 
> >
> > > +
> > > +void
> > > +ice_eswitch_br_offloads_deinit(struct ice_pf *pf);
> > > +int
> > > +ice_eswitch_br_offloads_init(struct ice_pf *pf);
> > > +
> > > +#endif /* _ICE_ESWITCH_BR_H_ */
> > [...]
> >
> > Thanks,
> > Olek

* RE: [PATCH net-next 05/12] ice: Switchdev FDB events support
  2023-04-19 15:38   ` Alexander Lobakin
@ 2023-04-20 11:27     ` Drewek, Wojciech
  2023-04-20 16:59       ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-20 11:27 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Wednesday, April 19, 2023 17:39
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
> <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
> <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 05/12] ice: Switchdev FDB events support
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:05 +0200
> 
> > Listen for SWITCHDEV_FDB_{ADD|DEL}_TO_DEVICE events while in switchdev
> > mode. Accept these events on both uplink and VF PR ports. Add HW
> > rules in newly created workqueue. FDB entries are stored in rhashtable
> > for lookup when removing the entry and in the list for cleanup
> > purpose. Direction of the HW rule depends on the type of the ports
> > on which the FDB event was received:
> 
> [...]
> 
> > +static int
> > +ice_eswitch_br_rule_delete(struct ice_hw *hw, struct ice_rule_query_data *rule)
> > +{
> > +	int err;
> > +
> > +	if (!rule)
> > +		return -EINVAL;
> > +
> > +	err = ice_rem_adv_rule_by_id(hw, rule);
> > +	kfree(rule);
> > +
> > +	return err;
> > +}
> > +
> > +static struct ice_rule_query_data *
> > +ice_eswitch_br_fwd_rule_create(struct ice_hw *hw, u16 vsi_idx, int port_type,
> 
> (no types shorter than u32 on the stack reminder)
> 
> > +			       const unsigned char *mac)
> > +{
> > +	struct ice_adv_rule_info rule_info = { 0 };
> > +	struct ice_rule_query_data *rule;
> > +	struct ice_adv_lkup_elem *list;
> > +	u16 lkups_cnt = 1;
> 
> Why have it as variable if it doesn't change? Just embed it into the
> ice_add_adv_rule() call and replace kcalloc() with kzalloc().

It will be useful later: with VLAN support, lkups_cnt will be equal to 1 or 2.
Can we keep it as it is?

> 
> > +	int err;
> > +
> > +	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> > +	if (!rule)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
> 
> [...]
> 
> > +	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac);
> > +	if (IS_ERR(fwd_rule)) {
> > +		err = PTR_ERR(fwd_rule);
> > +		dev_err(dev, "Failed to create eswitch bridge %sgress forward rule, err: %d\n",
> > +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> > +			err);
> > +		goto err_fwd_rule;
> 
> A bit suboptimal. To print errno pointer, you have %pe modifier, so you
> can just print err as:
> 
> 		... forward rule, err: %pe\n", ... : "in", fwd_rule);
> 
> Then you don't need @err at all and then below...

This is really cool, but I think it won't work here. I need to keep err in order to
return it in the error flow. I can't use fwd_rule for this purpose because the
return type is ice_esw_br_flow, not ice_rule_query_data.

> 
> > +	}
> > +
> > +	flow->fwd_rule = fwd_rule;
> > +
> > +	return flow;
> > +
> > +err_fwd_rule:
> > +	kfree(flow);
> > +
> > +	return ERR_PTR(err);
> 
> ...you can return @fwd_rule directly.
> 

I can't return @fwd_rule here because the return type is different.
This function is meant to return @flow.


[...]

> > +static int
> > +ice_eswitch_br_switchdev_event(struct notifier_block *nb,
> > +			       unsigned long event, void *ptr)
> > +{
> > +	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
> > +	struct ice_esw_br_offloads *br_offloads =
> > +		ice_nb_to_br_offloads(nb, switchdev_nb);
> > +	struct netlink_ext_ack *extack =
> > +		switchdev_notifier_info_to_extack(ptr);
> 
> (initialize-later-to-avoid-line-breaks?)
> 
> > +	struct switchdev_notifier_fdb_info *fdb_info;
> > +	struct switchdev_notifier_info *info = ptr;
> > +	struct ice_esw_br_fdb_work *work;
> > +	struct net_device *upper;
> > +	struct ice_esw_br_port *br_port;
> 
> RCT :s
> 
> > +
> > +	upper = netdev_master_upper_dev_get_rcu(dev);
> > +	if (!upper)
> > +		return NOTIFY_DONE;
> > +
> > +	if (!netif_is_bridge_master(upper))
> > +		return NOTIFY_DONE;
> > +
> > +	if (!ice_eswitch_br_is_dev_valid(dev))
> > +		return NOTIFY_DONE;
> > +
> > +	br_port = ice_eswitch_br_netdev_to_port(dev);
> > +	if (!br_port)
> > +		return NOTIFY_DONE;
> > +
> > +	switch (event) {
> > +	case SWITCHDEV_FDB_ADD_TO_DEVICE:
> > +	case SWITCHDEV_FDB_DEL_TO_DEVICE:
> > +		fdb_info = container_of(info,
> > +					struct switchdev_notifier_fdb_info,
> 
> Nit: `typeof(*fdb_info)` is shorter and would probably fit into the prev
> line.

I can fit it into one line now, thanks.

> 
> > +					info);
> > +
> > +		work = ice_eswitch_br_fdb_work_alloc(fdb_info, br_port, dev,
> > +						     event);
> 
> [...]
> 
> > +enum {
> > +	ICE_ESWITCH_BR_FDB_ADDED_BY_USER = BIT(0),
> > +};
> > +
> > +struct ice_esw_br_fdb_entry {
> > +	struct ice_esw_br_fdb_data data;
> > +	struct rhash_head ht_node;
> > +	struct list_head list;
> > +
> > +	int flags;
> 
> They can't be negative I believe? u32 then? Also I think here's a 4-byte
> hole :s But since all of the members here except this one are 8-byte
> aligned, you can't avoid it (can be filled anytime later with some other
> <= 4-byte field)
> 
> > +
> > +	struct net_device *dev;
> > +	struct ice_esw_br_port *br_port;
> > +	struct ice_esw_br_flow *flow;
> > +};
> [...]
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
  2023-04-20  9:54     ` Drewek, Wojciech
  2023-04-20 10:46       ` Drewek, Wojciech
@ 2023-04-20 16:51       ` Alexander Lobakin
  1 sibling, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-20 16:51 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Thu, 20 Apr 2023 11:54:15 +0200

> Thanks for review Olek!
> 
> Most of the comments sound reasonable to me (and I will include them) with some exceptions.

Anytime, it's always a pleasure to review your team's code :p

>>> +static struct ice_esw_br_port *
>>> +ice_eswitch_br_port_init(struct ice_esw_br *bridge)
>>> +{
>>> +	struct ice_esw_br_port *br_port;
>>> +
>>> +	br_port = kzalloc(sizeof(*br_port), GFP_KERNEL);
>>> +	if (!br_port)
>>> +		return ERR_PTR(-ENOMEM);
>>> +
>>> +	br_port->bridge = bridge;
>>
>> Since you always pass @bridge from the call site either way, does it
>> make sense to do that or you could just assign -> bridge on the call
>> sites after a successful allocation?
> 
> I could do that but I prefer to keep it this way.
> We have two types of ports and this function is generic, It setups
> things common for both types, including bridge ref.
> Are you ok with it? 

Yes, sure. I noticed after sending that keeping this function as it is
will be more consistent with another one, which is pretty similar. So
I'm taking my words back :D

[...]

>>> +struct ice_esw_br {
>>> +	struct ice_esw_br_offloads *br_offloads;
>>> +	int ifindex;
>>> +
>>> +	struct xarray ports;
>>
>> (not sure about this one, but potentially there can be a hole between
>>  those two)
> 
> Move ifindex at the end?

I think the compilers will align this struct to 8 bytes. I'd try moving
it to the end, but I think it will just convert the hole into padding at
the end. Then there's no difference and it can stay where it is now.
Holes can be filled any time when we're adding new fields, so not a big
problem.
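
A small userspace sketch of the hole-versus-padding point. The struct
names and the `fake_xarray` stand-in are hypothetical (it only mimics a
pointer-aligned member, not the real `struct xarray` layout), so the
exact byte counts are an assumption about a typical ABI:

```c
#include <stddef.h>

/* Hypothetical stand-in for a pointer-aligned member such as struct xarray. */
struct fake_xarray {
	void *head;
	unsigned long flags;
};

/* Layout as in the patch: the int sits between two pointer-aligned members. */
struct br_int_in_middle {
	void *br_offloads;
	int ifindex;
	struct fake_xarray ports;
};

/* Alternative: the int is moved to the end of the struct. */
struct br_int_at_end {
	void *br_offloads;
	struct fake_xarray ports;
	int ifindex;
};

/* Unused bytes right after ifindex when it is in the middle (a hole). */
static size_t unused_bytes_middle(void)
{
	return offsetof(struct br_int_in_middle, ports) -
	       (offsetof(struct br_int_in_middle, ifindex) + sizeof(int));
}

/* Unused bytes right after ifindex when it is last (tail padding). */
static size_t unused_bytes_end(void)
{
	return sizeof(struct br_int_at_end) -
	       (offsetof(struct br_int_at_end, ifindex) + sizeof(int));
}
```

Both helpers should report the same number of unused bytes and both
structs should have the same size, which is why moving the field is not
expected to buy anything here.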

> 
>>
>>> +};
>>> +
>>> +struct ice_esw_br_offloads {
>>> +	struct ice_pf *pf;
>>> +	struct ice_esw_br *bridge;
>>> +	struct notifier_block netdev_nb;
>>> +};
>>> +
>>> +#define ice_nb_to_br_offloads(nb, nb_name) \
>>> +	container_of(nb, \
>>> +		     struct ice_esw_br_offloads, \
>>> +		     nb_name)
>>
>> Hmm, you use it only once and only with `netdev_nb` field. Do you plan
>> to add more call sites of this macro? Otherwise you could embed the
>> second argument into the macro itself (mentioned `netdev_nb`) or even
>> just open-code the whole macro in the sole call site.
> 
> In the next patch it is used with a different nb_name (switchdev_nb)

I noticed that, but only after reviewing the next patch, so sorry, this
one is closed :D

> 
>>
>>> +
>>> +void
>>> +ice_eswitch_br_offloads_deinit(struct ice_pf *pf);
>>> +int
>>> +ice_eswitch_br_offloads_init(struct ice_pf *pf);
>>> +
>>> +#endif /* _ICE_ESWITCH_BR_H_ */
>> [...]
>>
>> Thanks,
>> Olek

Thanks,
Olek

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
  2023-04-20 10:46       ` Drewek, Wojciech
@ 2023-04-20 16:53         ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-20 16:53 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Thu, 20 Apr 2023 12:46:31 +0200

> 
> 
>> -----Original Message-----
>> From: Drewek, Wojciech
>> Sent: Thursday, 20 April 2023 11:54
>> To: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
>> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
>> Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: RE: [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup
>>
>> Thanks for review Olek!
>>
>> Most of the comments sound reasonable to me (and I will include them) with some exceptions.

[...]

>>>> +static struct ice_esw_br_port *
>>>> +ice_eswitch_br_netdev_to_port(struct net_device *dev)
>>>
>>> Also const?
> 
> This function changes a bit in "ice: Accept LAG netdevs in bridge offloads"
> With the changes introduced in this commit, I think that @dev as constant is not a good option.
> 
>>>
>>>> +{
>>>> +	if (ice_is_port_repr_netdev(dev)) {
>>>> +		struct ice_repr *repr = ice_netdev_to_repr(dev);
>>>> +
>>>> +		return repr->br_port;
>>>> +	} else if (netif_is_ice(dev)) {
>>>> +		struct ice_pf *pf = ice_netdev_to_pf(dev);
>>>
>>> Both @repr and @pf can also be const :p
> 
> Repr makes sense to me, the second part will change later and I think that
> there is no point in making it const

+ for both, not a problem, esp. given that the subsequent patches
require them to be non-constant.

> 
>>>
>>>> +
>>>> +		return pf->br_port;
>>>> +	}
>>>> +
>>>> +	return NULL;
>>>> +}
[...]

Thanks,
Olek

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH net-next 05/12] ice: Switchdev FDB events support
  2023-04-20 11:27     ` Drewek, Wojciech
@ 2023-04-20 16:59       ` Alexander Lobakin
  2023-04-21  8:45         ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-20 16:59 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Thu, 20 Apr 2023 13:27:11 +0200

> 
> 
>> -----Original Message-----
>> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Sent: Wednesday, 19 April 2023 17:39
>> To: Drewek, Wojciech <wojciech.drewek@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
>> <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
>> <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: Re: [PATCH net-next 05/12] ice: Switchdev FDB events support

[...]

>> (no types shorter than u32 on the stack reminder)
>>
>>> +			       const unsigned char *mac)
>>> +{
>>> +	struct ice_adv_rule_info rule_info = { 0 };
>>> +	struct ice_rule_query_data *rule;
>>> +	struct ice_adv_lkup_elem *list;
>>> +	u16 lkups_cnt = 1;
>>
>> Why have it as variable if it doesn't change? Just embed it into the
>> ice_add_adv_rule() call and replace kcalloc() with kzalloc().
> 
> It will be useful later, with vlans support lkups_cnt will be equal to 1 or 2.
> Can we keep it as it is?

Ah, okay, then it's surely better to keep as-is. Maybe I'd only mention
then in the commit message that this variable will be expanded to have
several values later. So that other reviewers won't trigger on the same
stuff.

> 
>>
>>> +	int err;
>>> +
>>> +	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
>>> +	if (!rule)
>>> +		return ERR_PTR(-ENOMEM);
>>> +
>>> +	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
>>
>> [...]
>>
>>> +	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac);
>>> +	if (IS_ERR(fwd_rule)) {
>>> +		err = PTR_ERR(fwd_rule);
>>> +		dev_err(dev, "Failed to create eswitch bridge %sgress forward rule, err: %d\n",
>>> +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
>>> +			err);
>>> +		goto err_fwd_rule;
>>
>> A bit suboptimal. To print errno pointer, you have %pe modifier, so you
>> can just print err as:
>>
>> 		... forward rule, err: %pe\n", ... : "in", fwd_rule);
>>
>> Then you don't need @err at all and then below...
> 
> This is really cool, but I think it won't work here. I need to keep err in order to
> return it in the err flow. I can't use fwd_rule for this purpose because
> return type is ice_esw_br_flow not ice_rule_query_data.

My bad, forgot to mention. If you want to return an error pointer of a
type different from the function's return type, there's ERR_CAST(). It
casts the error pointer to `void *`, so that there'll be no warnings then.
Here's a nice example: [0]
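
To make the idea concrete, here is a minimal userspace sketch. The
ERR_PTR()/PTR_ERR()/IS_ERR()/ERR_CAST() definitions below are simplified
stand-ins for the helpers in include/linux/err.h, and `struct rule`,
`struct flow`, rule_create() and flow_create() are hypothetical names
used only for illustration, not the driver's code:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
/* Re-types an error pointer as void * so it converts to any pointer type. */
static inline void *ERR_CAST(const void *ptr) { return (void *)(uintptr_t)ptr; }

struct rule { int id; };
struct flow { struct rule *fwd_rule; };

/* Hypothetical rule constructor; fails with -ENOMEM when asked to. */
static struct rule *rule_create(int fail)
{
	struct rule *rule;

	if (fail)
		return ERR_PTR(-ENOMEM);

	rule = calloc(1, sizeof(*rule));
	if (!rule)
		return ERR_PTR(-ENOMEM);

	return rule;
}

/* Returns a flow, or the rule's error pointer passed on via ERR_CAST(). */
static struct flow *flow_create(int fail)
{
	struct rule *fwd_rule;
	struct flow *flow;

	flow = calloc(1, sizeof(*flow));
	if (!flow)
		return ERR_PTR(-ENOMEM);

	fwd_rule = rule_create(fail);
	if (IS_ERR(fwd_rule)) {
		free(flow);
		return ERR_CAST(fwd_rule);
	}

	flow->fwd_rule = fwd_rule;
	return flow;
}
```

No separate `err` variable is needed in this shape, since the errno
travels inside the pointer itself.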

> 
>>
>>> +	}
>>> +
>>> +	flow->fwd_rule = fwd_rule;
>>> +
>>> +	return flow;
>>> +
>>> +err_fwd_rule:
>>> +	kfree(flow);
>>> +
>>> +	return ERR_PTR(err);
>>
>> ...you can return @fwd_rule directly.
>>
> 
> I can't return @fwd_rule here because return type is different
> This function is meant to return @flow.

[...]

>>> +	struct net_device *dev;
>>> +	struct ice_esw_br_port *br_port;
>>> +	struct ice_esw_br_flow *flow;
>>> +};
>> [...]
>>
>> Thanks,
>> Olek

[0]
https://elixir.bootlin.com/linux/latest/source/drivers/clk/clk-fractional-divider.c#L293

Thanks,
Olek

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH net-next 05/12] ice: Switchdev FDB events support
  2023-04-20 16:59       ` Alexander Lobakin
@ 2023-04-21  8:45         ` Drewek, Wojciech
  0 siblings, 0 replies; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-21  8:45 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Thursday, 20 April 2023 19:00
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 05/12] ice: Switchdev FDB events support
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Thu, 20 Apr 2023 13:27:11 +0200
> 
> >
> >
> >> -----Original Message-----
> >> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> >> Sent: Wednesday, 19 April 2023 17:39
> >> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> >> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
> >> <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
> >> <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
> >> Subject: Re: [PATCH net-next 05/12] ice: Switchdev FDB events support
> 
> [...]
> 
> >> (no types shorter than u32 on the stack reminder)
> >>
> >>> +			       const unsigned char *mac)
> >>> +{
> >>> +	struct ice_adv_rule_info rule_info = { 0 };
> >>> +	struct ice_rule_query_data *rule;
> >>> +	struct ice_adv_lkup_elem *list;
> >>> +	u16 lkups_cnt = 1;
> >>
> >> Why have it as variable if it doesn't change? Just embed it into the
> >> ice_add_adv_rule() call and replace kcalloc() with kzalloc().
> >
> > It will be useful later, with vlans support lkups_cnt will be equal to 1 or 2.
> > Can we keep it as it is?
> 
> Ah, okay, then it's surely better to keep as-is. Maybe I'd only mention
> then in the commit message that this variable will be expanded to have
> several values later. So that other reviewers won't trigger on the same
> stuff.

Sure thing

> 
> >
> >>
> >>> +	int err;
> >>> +
> >>> +	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> >>> +	if (!rule)
> >>> +		return ERR_PTR(-ENOMEM);
> >>> +
> >>> +	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
> >>
> >> [...]
> >>
> >>> +	fwd_rule = ice_eswitch_br_fwd_rule_create(hw, vsi_idx, port_type, mac);
> >>> +	if (IS_ERR(fwd_rule)) {
> >>> +		err = PTR_ERR(fwd_rule);
> >>> +		dev_err(dev, "Failed to create eswitch bridge %sgress forward rule, err: %d\n",
> >>> +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> >>> +			err);
> >>> +		goto err_fwd_rule;
> >>
> >> A bit suboptimal. To print errno pointer, you have %pe modifier, so you
> >> can just print err as:
> >>
> >> 		... forward rule, err: %pe\n", ... : "in", fwd_rule);
> >>
> >> Then you don't need @err at all and then below...
> >
> > This is really cool, but I think it won't work here. I need to keep err in order to
> > return it in the err flow. I can't use fwd_rule for this purpose because
> > return type is ice_esw_br_flow not ice_rule_query_data.
> 
> My bad, forgot to mention. If you want to return error pointer of a type
> different from the return value's one, there's ERR_CAST(). It casts
> error pointer to `void *`, so that there'll be no warnings then.
> Here's nice example: [0]

Another cool thing I've learned; still, I don't think we can use it here.
In the next patch, another rule, called the guard rule, is created in this
function. Its creation can also fail, and we have a second pointer for it
(guard_rule).

> 
> >
> >>
> >>> +	}
> >>> +
> >>> +	flow->fwd_rule = fwd_rule;
> >>> +
> >>> +	return flow;
> >>> +
> >>> +err_fwd_rule:
> >>> +	kfree(flow);
> >>> +
> >>> +	return ERR_PTR(err);
> >>
> >> ...you can return @fwd_rule directly.
> >>
> >
> > I can't return @fwd_rule here because return type is different
> > This function is meant to return @flow.
> 
> [...]
> 
> >>> +	struct net_device *dev;
> >>> +	struct ice_esw_br_port *br_port;
> >>> +	struct ice_esw_br_flow *flow;
> >>> +};
> >> [...]
> >>
> >> Thanks,
> >> Olek
> 
> [0]
> https://elixir.bootlin.com/linux/latest/source/drivers/clk/clk-fractional-divider.c#L293
> 
> Thanks,
> Olek

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
  2023-04-17  9:34 ` [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev Wojciech Drewek
@ 2023-04-21 14:22   ` Alexander Lobakin
  2023-04-25  9:17     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-21 14:22 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, alexandr.lobakin, david.m.ertman,
	michal.swiatkowski, marcin.szycik, pawel.chmielewski,
	sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:06 +0200

> From: Marcin Szycik <marcin.szycik@intel.com>
> 
> Introduce new "guard" rule upon FDB entry creation.
> 
> It matches on src_mac, has valid bit unset, allow_pass_l2 set
> and has a nop action.

[...]

> +static struct ice_rule_query_data *
> +ice_eswitch_br_guard_rule_create(struct ice_hw *hw, u16 vsi_idx,
> +				 const unsigned char *mac)
> +{
> +	struct ice_adv_rule_info rule_info = { 0 };
> +	struct ice_rule_query_data *rule;
> +	struct ice_adv_lkup_elem *list;
> +	const u16 lkups_cnt = 1;
> +	int err;

You can initialize it with -%ENOMEM right here in order to...

> +
> +	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> +	if (!rule) {
> +		err = -ENOMEM;
> +		goto err_exit;
> +	}
> +
> +	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
> +	if (!list) {
> +		err = -ENOMEM;
> +		goto err_list_alloc;
> +	}

...make those 2 ifs goto-oneliners :3 As...

> +
> +	list[0].type = ICE_MAC_OFOS;
> +	ether_addr_copy(list[0].h_u.eth_hdr.src_addr, mac);
> +	eth_broadcast_addr(list[0].m_u.eth_hdr.src_addr);
> +
> +	rule_info.allow_pass_l2 = true;
> +	rule_info.sw_act.vsi_handle = vsi_idx;
> +	rule_info.sw_act.fltr_act = ICE_NOP;
> +	rule_info.priority = 5;
> +
> +	err = ice_add_adv_rule(hw, list, lkups_cnt, &rule_info, rule);

...it's overwritten here anyway, so it is safe to init it with an error
value.
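
Roughly what I mean, as a userspace sketch. calloc()/free() stand in for
the kernel allocators, and `struct rule`, `struct lkup`, add_rule_stub()
and rule_create() are hypothetical names, so take it as an illustration
of the pattern only:

```c
#include <errno.h>
#include <stdlib.h>

struct rule { int id; };
struct lkup { int type; };

/* Hypothetical stand-in for the HW call; returns 0 or a negative errno. */
static int add_rule_stub(const struct lkup *list, int fail)
{
	(void)list;
	return fail ? -EIO : 0;
}

static int rule_create(int fail_add, struct rule **out)
{
	struct lkup *list;
	struct rule *rule;
	int err = -ENOMEM;	/* covers both allocation failures below */

	rule = calloc(1, sizeof(*rule));
	if (!rule)
		goto err_exit;

	list = calloc(1, sizeof(*list));
	if (!list)
		goto err_list_alloc;

	/* err is overwritten here, so the initial value is only a default */
	err = add_rule_stub(list, fail_add);
	if (err)
		goto err_add_rule;

	free(list);	/* this sketch does not keep the lookup list */
	*out = rule;
	return 0;

err_add_rule:
	free(list);
err_list_alloc:
	free(rule);
err_exit:
	return err;
}
```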

> +	if (err)
> +		goto err_add_rule;
> +
> +	return rule;
> +
> +err_add_rule:
> +	kfree(list);
> +err_list_alloc:
> +	kfree(rule);
> +err_exit:
> +	return ERR_PTR(err);
> +}
> +
>  static struct ice_esw_br_flow *
>  ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
>  			   int port_type, const unsigned char *mac)
>  {
> -	struct ice_rule_query_data *fwd_rule;
> +	struct ice_rule_query_data *fwd_rule, *guard_rule;
>  	struct ice_esw_br_flow *flow;
>  	int err;
>  
> @@ -155,10 +202,22 @@ ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
>  		goto err_fwd_rule;
>  	}
>  
> +	guard_rule = ice_eswitch_br_guard_rule_create(hw, vsi_idx, mac);
> +	if (IS_ERR(guard_rule)) {
> +		err = PTR_ERR(guard_rule);

Aaah ok, that's what you meant in the previous mails. I see now.
You can either leave it like that or there's an alternative -- pick the
one that you like the most:

	guard_rule = ice_eswitch_...
	err = PTR_ERR(guard_rule);
	if (err) {
		...

> +		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
> +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> +			err);

You still can print it via "%pe" + @guard_rule instead of @err :p (same
with @fwd_rule above)

> +		goto err_guard_rule;
> +	}
> +
>  	flow->fwd_rule = fwd_rule;
> +	flow->guard_rule = guard_rule;
>  
>  	return flow;

[...]

> @@ -4624,7 +4628,7 @@ static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
>   */
>  static u16
>  ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts,
> -	      enum ice_sw_tunnel_type tun_type)
> +	      struct ice_adv_rule_info *rinfo)

Can be const I think?

>  {
>  	bool refresh_required = true;
>  	struct ice_sw_recipe *recp;

[...]

> @@ -5075,6 +5082,14 @@ ice_add_sw_recipe(struct ice_hw *hw, struct ice_sw_recipe *rm,
>  		set_bit(buf[recps].recipe_indx,
>  			(unsigned long *)buf[recps].recipe_bitmap);
>  		buf[recps].content.act_ctrl_fwd_priority = rm->priority;
> +
> +		if (rm->need_pass_l2)
> +			buf[recps].content.act_ctrl |=
> +				ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
> +
> +		if (rm->allow_pass_l2)
> +			buf[recps].content.act_ctrl |=
> +				ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;

I don't like these line breaks :s

		type_of_content *cont;
		...

		/* As far as I can see, it can be used above as well */
		cont = &buf[recps].content;

		if (rm->need_pass_l2)
			cont->act_ctrl |= ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
		if (rm->allow_pass_l2)
			cont->act_ctrl |= ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;

>  		recps++;
>  	}
>  

[...]

> @@ -6166,6 +6190,11 @@ ice_add_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
>  		act |= ICE_SINGLE_ACT_VSI_FORWARDING | ICE_SINGLE_ACT_DROP |
>  		       ICE_SINGLE_ACT_VALID_BIT;
>  		break;
> +	case ICE_NOP:
> +		act |= (rinfo->sw_act.fwd_id.hw_vsi_id <<
> +			ICE_SINGLE_ACT_VSI_ID_S) & ICE_SINGLE_ACT_VSI_ID_M;

`FIELD_PREP(ICE_SINGLE_ACT_VSI_ID_M, rinfo->sw_act.fwd_id.hw_vsi_id)`?
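
For reference, a simplified userspace analogue of what FIELD_PREP()
computes. The shift and mask values below are assumptions picked for
illustration (please check the ice headers for the real ones), and
field_prep() here is a hypothetical runtime helper, whereas the kernel
macro also does compile-time checks on a constant mask:

```c
#include <stdint.h>

/* Assumed values for illustration only. */
#define VSI_ID_S	4
#define VSI_ID_M	(0x3FFu << VSI_ID_S)

/* Position of the lowest set bit; the mask must be non-zero. */
static unsigned int mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}

	return shift;
}

/* Shift the value to the field position and clamp it to the mask. */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	return (val << mask_shift(mask)) & mask;
}

/* Open-coded form, in the style used by the patch. */
static uint32_t open_coded(uint32_t val)
{
	return (val << VSI_ID_S) & VSI_ID_M;
}
```

The point is that the shift is derived from the mask, so only one
definition has to be passed around.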

> +		act &= ~ICE_SINGLE_ACT_VALID_BIT;
> +		break;
>  	default:
>  		status = -EIO;
>  		goto err_ice_add_adv_rule;
> @@ -6446,7 +6475,7 @@ ice_rem_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
>  			return -EIO;
>  	}
>  
> -	rid = ice_find_recp(hw, &lkup_exts, rinfo->tun_type);
> +	rid = ice_find_recp(hw, &lkup_exts, rinfo);
>  	/* If did not find a recipe that match the existing criteria */
>  	if (rid == ICE_MAX_NUM_RECIPES)
>  		return -EINVAL;
> diff --git a/drivers/net/ethernet/intel/ice/ice_switch.h b/drivers/net/ethernet/intel/ice/ice_switch.h
> index c84b56fe84a5..5ecce39cf1f5 100644
> --- a/drivers/net/ethernet/intel/ice/ice_switch.h
> +++ b/drivers/net/ethernet/intel/ice/ice_switch.h
> @@ -191,6 +191,8 @@ struct ice_adv_rule_info {
>  	u16 vlan_type;
>  	u16 fltr_rule_id;
>  	u32 priority;
> +	u8 need_pass_l2;
> +	u8 allow_pass_l2;

They can be either true or false, nothing else, right? I'd make them
occupy 1 bit per var then:

	u16 need_pass_l2:1;
	u16 allow_pass_l2:1;
	u16 src_vsi;

+14 free bits for more flags, no holes (stacked with ::src_vsi).
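
A trimmed-down userspace sketch of both layouts. Only the neighbouring
fields from the hunk are reproduced, the struct names are hypothetical,
and bit-fields on a 16-bit type rely on compiler behaviour (GCC and
Clang accept them), so the size comparison is an expectation for common
ABIs rather than a guarantee:

```c
#include <stdint.h>

/* Flags as whole bytes, as in the patch. */
struct info_bytes {
	uint16_t vlan_type;
	uint16_t fltr_rule_id;
	uint32_t priority;
	uint8_t need_pass_l2;
	uint8_t allow_pass_l2;
	uint16_t src_vsi;
};

/* Flags as 1-bit fields sharing one 16-bit unit next to src_vsi. */
struct info_bits {
	uint16_t vlan_type;
	uint16_t fltr_rule_id;
	uint32_t priority;
	uint16_t need_pass_l2:1;
	uint16_t allow_pass_l2:1;
	uint16_t src_vsi;
};

/* Setting one flag must not disturb the other. */
static int flags_roundtrip(void)
{
	struct info_bits info = { 0 };

	info.allow_pass_l2 = 1;

	return info.allow_pass_l2 == 1 && info.need_pass_l2 == 0;
}
```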

>  	u16 src_vsi;
>  	struct ice_sw_act_ctrl sw_act;
>  	struct ice_adv_rule_flags_info flags_info;
> @@ -254,6 +256,9 @@ struct ice_sw_recipe {
>  	 */
>  	u8 priority;
>  
> +	u8 need_pass_l2;
> +	u8 allow_pass_l2;

(same with bitfields here, just use u8 :1 instead of u16 here to stack
 with ::priority)

> +
>  	struct list_head rg_list;

[...]

Thanks,
Olek


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads
  2023-04-17  9:34 ` [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads Wojciech Drewek
@ 2023-04-21 14:40   ` Alexander Lobakin
  2023-04-26 11:31     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-21 14:40 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:07 +0200

> Allow LAG interfaces to be used in bridge offload using
> netif_is_lag_master. In this case, search for ice netdev in
> the list of LAG's lower devices.
> 
> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
> ---
>  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 40 ++++++++++++++++---
>  1 file changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> index 82b5eb2020cd..49381e4bf62a 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> @@ -15,8 +15,21 @@ static const struct rhashtable_params ice_fdb_ht_params = {
>  
>  static bool ice_eswitch_br_is_dev_valid(const struct net_device *dev)
>  {
> -	/* Accept only PF netdev and PRs */
> -	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev);
> +	/* Accept only PF netdev, PRs and LAG */
> +	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
> +		netif_is_lag_master(dev);

Nit: usually we align to `return` (7 spaces), not with one tab:

	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
	       netif_is_lag_master(dev);

> +}
> +
> +static struct net_device *
> +ice_eswitch_br_get_uplnik_from_lag(struct net_device *lag_dev)
> +{
> +	struct net_device *lower;
> +	struct list_head *iter;
> +
> +	netdev_for_each_lower_dev(lag_dev, lower, iter)
> +		if (netif_is_ice(lower))
> +			return lower;

Here I think the kernel guidelines would require a set of braces
(each multi-line code block must be enclosed, even if it works without).
I mean, I wasn't doing it myself, following the rule "as few braces as
needed to work", but then my colleague showed me the doc :D

	for_each_lower(...) {
		if (is_ice(lower))
			return lower;
	}

By contrast, this:

	for_each_something()
		/* Some useful comment */
		do_something();

is not mentioned in the rules as requiring braces :s

> +	return NULL;
>  }
>  
>  static struct ice_esw_br_port *
> @@ -26,8 +39,16 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
>  		struct ice_repr *repr = ice_netdev_to_repr(dev);
>  
>  		return repr->br_port;
> -	} else if (netif_is_ice(dev)) {
> -		struct ice_pf *pf = ice_netdev_to_pf(dev);
> +	} else if (netif_is_ice(dev) || netif_is_lag_master(dev)) {
> +		struct net_device *ice_dev = dev;
> +		struct ice_pf *pf;
> +
> +		if (netif_is_lag_master(dev))
> +			ice_dev = ice_eswitch_br_get_uplnik_from_lag(dev);

Maybe just reuse @dev instead of one more var?
Or do it this way:

		struct net_device *ice_dev;

		...

		if (netif_is_lag_master(dev))
			ice_dev = ice_eswitch ...
		else
			ice_dev = dev;
		if (!ice_dev)
			return NULL;

Otherwise it's a bit confusing to have `if` in one place and `else`
(implicit) in another one, at least it took some time for me ._.

> +		if (!ice_dev)
> +			return NULL;
> +
> +		pf = ice_netdev_to_pf(ice_dev);
>  
>  		return pf->br_port;
>  	}
> @@ -719,7 +740,16 @@ ice_eswitch_br_port_link(struct ice_esw_br_offloads *br_offloads,
>  
>  		err = ice_eswitch_br_vf_repr_port_init(bridge, repr);
>  	} else {
> -		struct ice_pf *pf = ice_netdev_to_pf(dev);
> +		struct net_device *ice_dev = dev;
> +		struct ice_pf *pf;
> +
> +		if (netif_is_lag_master(dev))
> +			ice_dev = ice_eswitch_br_get_uplnik_from_lag(dev);

(same)

> +
> +		if (!ice_dev)
> +			return 0;
> +
> +		pf = ice_netdev_to_pf(ice_dev);
>  
>  		err = ice_eswitch_br_uplink_port_init(bridge, pf);
>  	}

Thanks,
Olek

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode
  2023-04-17  9:34 ` [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode Wojciech Drewek
@ 2023-04-21 15:25   ` Alexander Lobakin
  2023-04-27 10:28     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-21 15:25 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, alexandr.lobakin, david.m.ertman,
	michal.swiatkowski, marcin.szycik, pawel.chmielewski,
	sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:08 +0200

> From: Marcin Szycik <marcin.szycik@intel.com>
> 
> Add support for matching on VLAN tag in bridge offloads.
> Currently only trunk mode is supported.
> 
> To enable VLAN filtering (existing FDB entries will be deleted):
> ip link set $BR type bridge vlan_filtering 1
> 
> To add VLANs to bridge in trunk mode:
> bridge vlan add dev $PF1 vid 110-111
> bridge vlan add dev $VF1_PR vid 110-111
> 
> Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
> ---
>  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 319 +++++++++++++++++-
>  .../net/ethernet/intel/ice/ice_eswitch_br.h   |  12 +
>  2 files changed, 317 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> index 49381e4bf62a..56d36e397b12 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> @@ -59,13 +59,19 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
>  static void
>  ice_eswitch_br_ingress_rule_setup(struct ice_adv_lkup_elem *list,
>  				  struct ice_adv_rule_info *rule_info,
> -				  const unsigned char *mac,
> +				  const unsigned char *mac, bool vlan, u16 vid,

Could we use one combined argument? Doesn't `!!vid == !!vlan`? VID 0 is
reserved IIRC...

(same in all the places below)
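
Something along these lines, as a sketch only. It assumes VID 0 never
has to be matched as a real VLAN in this code path, and
lkups_cnt_for() is a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Number of lookup elements for a rule: MAC only when no VID is given,
 * MAC plus VLAN otherwise. A zero VID doubles as the "no VLAN" marker.
 */
static unsigned int lkups_cnt_for(uint16_t vid)
{
	return vid ? 2 : 1;
}
```

Callers would then pass only the VID and drop the separate bool.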

>  				  u8 pf_id, u16 vf_vsi_idx)
>  {
>  	list[0].type = ICE_MAC_OFOS;
>  	ether_addr_copy(list[0].h_u.eth_hdr.dst_addr, mac);
>  	eth_broadcast_addr(list[0].m_u.eth_hdr.dst_addr);

[...]

> @@ -344,10 +389,33 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
>  	struct device *dev = ice_pf_to_dev(pf);
>  	struct ice_esw_br_fdb_entry *fdb_entry;
>  	struct ice_esw_br_flow *flow;
> +	struct ice_esw_br_vlan *vlan;
>  	struct ice_hw *hw = &pf->hw;
> +	bool add_vlan = false;
>  	unsigned long event;
>  	int err;
>  
> +	/* FIXME: untagged filtering is not yet supported
> +	 */

Shouldn't be present in release code I believe. I mean, the sentence is
fine (just don't forget the dot at the end), but without "FIXME:". And it
can be a one-liner.

> +	if (!(bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING) && vid)
> +		return;

[...]

> +static void
> +ice_eswitch_br_vlan_filtering_set(struct ice_esw_br *bridge, bool enable)
> +{
> +	bool filtering = bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING;
> +
> +	if (filtering == enable)
> +		return;

	if (enable == !!(bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING))

?
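
The `!!` matters because the masked value is not necessarily 0 or 1. A
userspace sketch, where the flag value is hypothetical and deliberately
not bit 0 to show the pitfall:

```c
#include <stdbool.h>

/* Hypothetical flag value for illustration. */
#define BR_VLAN_FILTERING	(1u << 1)

/* Normalised comparison: !! collapses any non-zero mask result to 1. */
static bool state_unchanged(unsigned int flags, bool enable)
{
	return enable == !!(flags & BR_VLAN_FILTERING);
}

/* Without !!, true (1) is compared against the raw masked value (2 here). */
static bool state_unchanged_buggy(unsigned int flags, bool enable)
{
	return enable == (flags & BR_VLAN_FILTERING);
}
```

The version in the patch is fine as well, since assigning the masked
value to a `bool` normalises it the same way; this is only about saving
the local variable.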

> +
> +	ice_eswitch_br_fdb_flush(bridge);
> +	if (enable)
> +		bridge->flags |= ICE_ESWITCH_BR_VLAN_FILTERING;
> +	else
> +		bridge->flags &= ~ICE_ESWITCH_BR_VLAN_FILTERING;
> +}

[...]

> +	port = xa_load(&bridge->ports, vsi_idx);
> +	if (!port)
> +		return -EINVAL;
> +
> +	vlan = xa_load(&port->vlans, vid);
> +	if (vlan) {
> +		if (vlan->flags == flags)
> +			return 0;
> +
> +		ice_eswitch_br_vlan_cleanup(port, vlan);
> +	}
> +
> +	vlan = ice_eswitch_br_vlan_create(vid, flags, port);
> +	if (IS_ERR(vlan)) {
> +		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");

FYI, NL_SET_ERR_MSG_FMT_MOD() landed recently (a couple of releases
back) and supports format strings. E.g. you could pass the VID, VSI ID,
flags etc. there to get more meaningful output (right in userspace).

> +		return PTR_ERR(vlan);
> +	}
> +
> +	return 0;
> +}

[...]

> +static int
> +ice_eswitch_br_port_obj_add(struct net_device *netdev, const void *ctx,
> +			    const struct switchdev_obj *obj,
> +			    struct netlink_ext_ack *extack)
> +{
> +	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
> +	struct switchdev_obj_port_vlan *vlan;
> +	int err;
> +
> +	if (!br_port)
> +		return -EINVAL;
> +
> +	switch (obj->id) {
> +	case SWITCHDEV_OBJ_ID_PORT_VLAN:
> +		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
> +		err = ice_eswitch_br_port_vlan_add(br_port->bridge,
> +						   br_port->vsi_idx, vlan->vid,
> +						   vlan->flags, extack);

return right here? You have `default` in the switch block, so the
compiler shouldn't complain if you remove it from the end of the func.

> +		break;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return err;
> +}
> +
> +static int
> +ice_eswitch_br_port_obj_del(struct net_device *netdev, const void *ctx,
> +			    const struct switchdev_obj *obj)
> +{
> +	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
> +	struct switchdev_obj_port_vlan *vlan;
> +
> +	if (!br_port)
> +		return -EINVAL;
> +
> +	switch (obj->id) {
> +	case SWITCHDEV_OBJ_ID_PORT_VLAN:
> +		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
> +		ice_eswitch_br_port_vlan_del(br_port->bridge, br_port->vsi_idx,
> +					     vlan->vid);

(same)

> +		break;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +ice_eswitch_br_port_obj_attr_set(struct net_device *netdev, const void *ctx,
> +				 const struct switchdev_attr *attr,
> +				 struct netlink_ext_ack *extack)
> +{
> +	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
> +
> +	if (!br_port)
> +		return -EINVAL;
> +
> +	switch (attr->id) {
> +	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
> +		ice_eswitch_br_vlan_filtering_set(br_port->bridge,
> +						  attr->u.vlan_filtering);

(and here)

> +		break;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}

[...]

> +	br_offloads->switchdev_blk.notifier_call =
> +		ice_eswitch_br_event_blocking;

Oh, you have two usages of ->switchdev_blk here, so you can add an
intermediate variable to avoid line breaking, which would also shorten
the line below :D

	nb = &br_offloads->switchdev_blk;
	nb->notifier_call = ice_eswitch_br_event_blocking;
	...

> +	err = register_switchdev_blocking_notifier(&br_offloads->switchdev_blk);
> +	if (err) {
> +		dev_err(dev,
> +			"Failed to register bridge blocking switchdev notifier\n");
> +		goto err_reg_switchdev_blk;
> +	}
> +
>  	br_offloads->netdev_nb.notifier_call = ice_eswitch_br_port_event;
>  	err = register_netdevice_notifier(&br_offloads->netdev_nb);

(here the same, but no line breaks, so up to you. You could reuse the
 same variable or leave it as it is)

>  	if (err) {

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> index 73ad81bad655..cf3e2615a62a 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> @@ -42,10 +42,16 @@ struct ice_esw_br_port {
>  	enum ice_esw_br_port_type type;
>  	struct ice_vsi *vsi;
>  	u16 vsi_idx;
> +	struct xarray vlans;

Hmm, I feel like you can make ::type u16 and then stack it with
::vsi_idx, so that you avoid a hole here.
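
A rough userspace sketch of the layout difference, assuming a
hypothetical `struct demo_xarray` in place of the real xarray and a
reordering in which ::type becomes u16 and sits next to ::vsi_idx. The
exact sizes depend on the ABI, so the sketch only checks that the
stacked variant is smaller (C11 `static_assert`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xarray; only its alignment matters here. */
struct demo_xarray {
	void *head;
	unsigned int flags;
};

/* Field order as in the patch: enum-sized type, pointer, u16, xarray. */
struct demo_port_holes {
	int type;		/* followed by padding before the pointer on LP64 */
	void *vsi;
	uint16_t vsi_idx;	/* followed by padding before the xarray */
	struct demo_xarray vlans;
};

/* Assumed reordering: ::type shrunk to u16 and stacked with ::vsi_idx. */
struct demo_port_stacked {
	void *vsi;
	uint16_t type;
	uint16_t vsi_idx;
	struct demo_xarray vlans;
};

static_assert(sizeof(struct demo_port_stacked) < sizeof(struct demo_port_holes),
	      "stacking the two u16 fields should shrink the struct");
```

On the real struct, a tool such as pahole would show the actual holes.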

> +};
> +
> +enum {
> +	ICE_ESWITCH_BR_VLAN_FILTERING = BIT(0),
>  };
>  
>  struct ice_esw_br {
>  	struct ice_esw_br_offloads *br_offloads;
> +	int flags;

Unsigned types fit flags better I think?

>  	int ifindex;

(BTW, ifindex is also usually unsigned unless it's not an error)

>  
>  	struct xarray ports;
[...]

Thanks,
Olek


* Re: [PATCH net-next 10/12] ice: implement static version of ageing
  2023-04-17  9:34 ` [PATCH net-next 10/12] ice: implement static version of ageing Wojciech Drewek
@ 2023-04-21 16:22   ` Alexander Lobakin
  2023-05-09 10:55     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-21 16:22 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:10 +0200

> From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> 
> Remove fdb entries always when ageing time expired.
> 
> Allow user to set ageing time using port object attribute.
> 
> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 46 +++++++++++++++++++
>  .../net/ethernet/intel/ice/ice_eswitch_br.h   | 11 +++++
>  2 files changed, 57 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> index a21eca5088f7..6c3144f98100 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> @@ -8,6 +8,8 @@
>  #include "ice_vlan.h"
>  #include "ice_vf_vsi_vlan_ops.h"
>  
> +#define ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS 1000

I think you can define it without '_MS' and as msecs_to_jiffies(1000)
right here, so that you wouldn't need to convert it at use sites (it's
more expensive to do there in terms of chars vs line width).

> +
>  static const struct rhashtable_params ice_fdb_ht_params = {
>  	.key_offset = offsetof(struct ice_esw_br_fdb_entry, data),
>  	.key_len = sizeof(struct ice_esw_br_fdb_data),
> @@ -440,6 +442,7 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
>  	fdb_entry->br_port = br_port;
>  	fdb_entry->flow = flow;
>  	fdb_entry->dev = netdev;
> +	fdb_entry->last_use = jiffies;
>  	event = SWITCHDEV_FDB_ADD_TO_BRIDGE;
>  
>  	if (added_by_user) {
> @@ -838,6 +841,10 @@ ice_eswitch_br_port_obj_attr_set(struct net_device *netdev, const void *ctx,
>  		ice_eswitch_br_vlan_filtering_set(br_port->bridge,
>  						  attr->u.vlan_filtering);
>  		break;
> +	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
> +		br_port->bridge->ageing_time =
> +			clock_t_to_jiffies(attr->u.ageing_time);

Reviews teach the reviewer too: I never knew of clock_t, nor that
userspace has its own ticks, which we have to convert O_.

(sounds like a joke BTW, why not just use ms/us/ns everywhere, "tick" is
 something very intimate/internal)
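
For reference, a simplified userspace model of the conversion, assuming
USER_HZ is 100 (true on most architectures) and a hypothetical HZ of
1000. It only covers the case where HZ is a multiple of USER_HZ; the
`DEMO_`/`demo_` names are made up for this sketch:

```c
#include <assert.h>
#include <limits.h>

/* Assumed values: USER_HZ is 100 on most architectures, HZ is a build-time choice. */
#define DEMO_USER_HZ	100UL
#define DEMO_HZ		1000UL

static_assert(DEMO_HZ % DEMO_USER_HZ == 0,
	      "sketch handles only HZ that is a multiple of USER_HZ");

/* Simplified model of clock_t_to_jiffies() for the case HZ % USER_HZ == 0. */
static unsigned long demo_clock_t_to_jiffies(unsigned long x)
{
	const unsigned long ratio = DEMO_HZ / DEMO_USER_HZ;

	/* Saturate instead of overflowing. */
	if (x >= ULONG_MAX / ratio)
		return ULONG_MAX;

	return x * ratio;
}
```

With these assumed values, an ageing time of 300 s arrives as 30000
clock_t ticks and becomes 300000 jiffies.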

> +		break;
>  	default:
>  		return -EOPNOTSUPP;
>  	}

[...]

> +	if (!bridge)
> +		return;
> +
> +	rtnl_lock();
> +	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
> +		if (entry->flags & ICE_ESWITCH_BR_FDB_ADDED_BY_USER)
> +			continue;
> +
> +		if (time_is_before_jiffies(entry->last_use +
> +					   bridge->ageing_time))
> +			ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge,
> +								    entry);

Maybe invert the condition to give a bit more space for arguments?

		if (time_is_after_eq_jiffies(entry->last_use +
					     bridge->ageing_time))
			continue;

		ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, entry);
	}
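
Both spellings rely on the wrap-safe comparisons from
include/linux/jiffies.h. A minimal userspace model, with `now` passed
explicitly instead of reading jiffies and with hypothetical `demo_`
names, suggests the two conditions are exact complements, even when the
counter wraps:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Models time_after(a, b): true if a is later than b, modulo wraparound. */
static bool demo_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Models time_after_eq(a, b): true if a is later than or equal to b. */
static bool demo_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Models time_is_before_jiffies(t) with jiffies == now. */
static bool demo_is_before(unsigned long t, unsigned long now)
{
	return demo_time_after(now, t);
}

/* Models time_is_after_eq_jiffies(t) with jiffies == now. */
static bool demo_is_after_eq(unsigned long t, unsigned long now)
{
	return demo_time_after_eq(t, now);
}
```

An entry whose deadline equals `now` is still kept by either form; it
is removed one tick later.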


> +	}
> +	rtnl_unlock();
> +}
> +
> +static void ice_eswitch_br_update_work(struct work_struct *work)
> +{
> +	struct ice_esw_br_offloads *br_offloads =
> +		ice_work_to_br_offloads(work);

Assign it in a separate line pls :s

> +
> +	ice_eswitch_br_update(br_offloads);
> +
> +	queue_delayed_work(br_offloads->wq, &br_offloads->update_work,
> +			   msecs_to_jiffies(ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS));
> +}
[...]

Thanks,
Olek


* Re: [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats
  2023-04-17  9:34 ` [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats Wojciech Drewek
@ 2023-04-21 16:32   ` Alexander Lobakin
  2023-05-09 12:52     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-21 16:32 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:12 +0200

> Introduce new ethtool statistic which is 'fdb_cnt'. It
> provides information about how many bridge fdbs are created on
> a given netdev.

[...]

> @@ -339,6 +340,7 @@ ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
>  	ice_eswitch_br_flow_delete(pf, fdb_entry->flow);
>  
>  	kfree(fdb_entry);
> +	vsi->fdb_cnt--;

Are FDB operations always serialized within one netdev? Because if they're
not, this probably needs to be atomic_t.

>  }
>  
>  static void

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> index 8407c7175cf6..d06b2a688323 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> @@ -64,6 +64,7 @@ static const struct ice_stats ice_gstrings_vsi_stats[] = {
>  	ICE_VSI_STAT("tx_linearize", tx_linearize),
>  	ICE_VSI_STAT("tx_busy", tx_busy),
>  	ICE_VSI_STAT("tx_restart", tx_restart),
> +	ICE_VSI_STAT("fdb_cnt", fdb_cnt),

It's confusing to me to see it in the Ethtool stats. They're usually
counters, and ice is no exception. But this one is not a counter, so it
might give the wrong impression.
Have you considered alternatives? rtnl (iproute) or maybe even Devlink
(but I believe the former fits better)? This might be a good candidate
to become a generic stat, who knows.

>  };
>  
>  enum ice_ethtool_test_id {

Thanks,
Olek


* Re: [PATCH net-next 09/12] ice: implement bridge port vlan
  2023-04-17  9:34 ` [PATCH net-next 09/12] ice: implement bridge port vlan Wojciech Drewek
@ 2023-04-21 16:35   ` Alexander Lobakin
  2023-05-09 11:25     ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-21 16:35 UTC (permalink / raw)
  To: Wojciech Drewek
  Cc: intel-wired-lan, netdev, david.m.ertman, michal.swiatkowski,
	marcin.szycik, pawel.chmielewski, sridhar.samudrala

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Mon, 17 Apr 2023 11:34:09 +0200

> From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> 
> Port VLAN in this case means push and pop VLAN action on specific vid.
> There are a few limitations in hardware:
> - push and pop can't be used separately
> - if port VLAN is used there can't be any trunk VLANs, because pop
>   action is done on all traffic received by VSI in port VLAN mode
> - port VLAN mode on uplink port isn't supported

[...]

> @@ -610,11 +612,26 @@ ice_eswitch_br_vlan_filtering_set(struct ice_esw_br *bridge, bool enable)
>  		bridge->flags &= ~ICE_ESWITCH_BR_VLAN_FILTERING;
>  }
>  
> +static void
> +ice_eswitch_br_clear_pvid(struct ice_esw_br_port *port)
> +{
> +	struct ice_vsi_vlan_ops *vlan_ops =
> +		ice_get_compat_vsi_vlan_ops(port->vsi);
> +

Deref in a separate line to avoid breaking?

> +	vlan_ops->clear_port_vlan(port->vsi);
> +
> +	ice_vf_vsi_disable_port_vlan(port->vsi);
> +
> +	port->pvid = 0;
> +}
> +
>  static void
>  ice_eswitch_br_vlan_cleanup(struct ice_esw_br_port *port,
>  			    struct ice_esw_br_vlan *vlan)
>  {
>  	xa_erase(&port->vlans, vlan->vid);
> +	if (port->pvid == vlan->vid)
> +		ice_eswitch_br_clear_pvid(port);
>  	kfree(vlan);
>  }
>  
> @@ -627,9 +644,50 @@ static void ice_eswitch_br_port_vlans_flush(struct ice_esw_br_port *port)
>  		ice_eswitch_br_vlan_cleanup(port, vlan);
>  }
>  
> +static int
> +ice_eswitch_br_set_pvid(struct ice_esw_br_port *port,
> +			struct ice_esw_br_vlan *vlan)
> +{
> +	struct ice_vlan port_vlan = ICE_VLAN(ETH_P_8021Q, vlan->vid, 0);
> +	struct device *dev = ice_pf_to_dev(port->vsi->back);
> +	struct ice_vsi_vlan_ops *vlan_ops;
> +	int err;
> +
> +	if (port->pvid == vlan->vid || vlan->vid == 1)
> +		return 0;
> +
> +	/* Setting port vlan on uplink isn't supported by hw */
> +	if (port->type == ICE_ESWITCH_BR_UPLINK_PORT)
> +		return -EOPNOTSUPP;
> +
> +	if (port->pvid) {
> +		dev_info(dev,

dev_err()?

> +			 "Port VLAN (vsi=%u, vid=%u) already exists on the port, remove it before adding new one\n",
> +			 port->vsi_idx, port->pvid);
> +		return -EEXIST;

Hmm, isn't -EBUSY more common for such cases?

(below as well)

> +	}
> +
> +	ice_vf_vsi_enable_port_vlan(port->vsi);

[...]

> @@ -639,14 +697,29 @@ ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
>  
>  	vlan->vid = vid;
>  	vlan->flags = flags;
> +	if ((flags & BRIDGE_VLAN_INFO_PVID) &&
> +	    (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
> +		err = ice_eswitch_br_set_pvid(port, vlan);
> +		if (err)
> +			goto err_set_pvid;
> +	} else if ((flags & BRIDGE_VLAN_INFO_PVID) ||
> +		   (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
> +		dev_info(dev, "VLAN push and pop are supported only simultaneously\n");

(same for dev_err(), as well as below)

> +		return ERR_PTR(-EOPNOTSUPP);
> +	}

[...]

> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> index cf3e2615a62a..b6eef068ea81 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> @@ -43,6 +43,7 @@ struct ice_esw_br_port {
>  	struct ice_vsi *vsi;
>  	u16 vsi_idx;
>  	struct xarray vlans;
> +	u16 pvid;

Oh, or you can just stack ::vsi_idx with ::pvid here to avoid spawning
holes.

>  };
>  
>  enum {
> diff --git a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
> index b1ffb81893d4..447b4e6ef7e4 100644
> --- a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
> +++ b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
> @@ -21,6 +21,108 @@ noop_vlan(struct ice_vsi __always_unused *vsi)
>  	return 0;
>  }
>  
> +static void ice_port_vlan_on(struct ice_vsi *vsi)
> +{
> +	struct ice_vsi_vlan_ops *vlan_ops;
> +	struct ice_pf *pf = vsi->back;
> +
> +	if (ice_is_dvm_ena(&pf->hw)) {
> +		vlan_ops = &vsi->outer_vlan_ops;
> +
> +		/* setup outer VLAN ops */
> +		vlan_ops->set_port_vlan = ice_vsi_set_outer_port_vlan;
> +		vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
> +		vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
> +		vlan_ops->ena_rx_filtering =
> +			ice_vsi_ena_rx_vlan_filtering;
> +
> +		/* setup inner VLAN ops */
> +		vlan_ops = &vsi->inner_vlan_ops;
> +		vlan_ops->add_vlan = noop_vlan_arg;
> +		vlan_ops->del_vlan = noop_vlan_arg;
> +		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
> +		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
> +		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
> +		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
> +	} else {
> +		vlan_ops = &vsi->inner_vlan_ops;
> +
> +		vlan_ops->set_port_vlan = ice_vsi_set_inner_port_vlan;
> +		vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
> +		vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
> +		vlan_ops->ena_rx_filtering =
> +			ice_vsi_ena_rx_vlan_filtering;
> +	}

->ena_rx_filtering is filled with just one possible value, so it could
be done outside ifs.

> +}
> +
> +static void ice_port_vlan_off(struct ice_vsi *vsi)
> +{
> +	struct ice_vsi_vlan_ops *vlan_ops;
> +	struct ice_pf *pf = vsi->back;
> +
> +	if (ice_is_dvm_ena(&pf->hw)) {
> +		/* setup inner VLAN ops */
> +		vlan_ops = &vsi->inner_vlan_ops;
> +
> +		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
> +		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
> +		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
> +		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
> +
> +		vlan_ops = &vsi->outer_vlan_ops;
> +
> +		vlan_ops->del_vlan = ice_vsi_del_vlan;
> +		vlan_ops->ena_stripping = ice_vsi_ena_outer_stripping;
> +		vlan_ops->dis_stripping = ice_vsi_dis_outer_stripping;
> +		vlan_ops->ena_insertion = ice_vsi_ena_outer_insertion;
> +		vlan_ops->dis_insertion = ice_vsi_dis_outer_insertion;
> +	} else {
> +		vlan_ops = &vsi->inner_vlan_ops;
> +
> +		vlan_ops->del_vlan = ice_vsi_del_vlan;
> +		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
> +		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
> +		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
> +		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
> +	}

The whole ->inner_vlan_ops is filled with the same values, the only
difference is ->del_vlan, which can be left in `else`, the rest can be
set up unconditionally.

> +
> +	if (!test_bit(ICE_FLAG_VF_VLAN_PRUNING, pf->flags))
> +		vlan_ops->ena_rx_filtering = noop_vlan;
> +	else
> +		vlan_ops->ena_rx_filtering =
> +			ice_vsi_ena_rx_vlan_filtering;
> +}
> +
> +/**
> + * ice_vf_vsi_enable_port_vlan - Set VSI VLAN ops to support port VLAN
> + * @vsi: VF's VSI being configured
> + *
> + * The function won't create port VLAN, it only allows to create port VLAN
> + * using VLAN ops on the VF VSI.
> + */
> +void ice_vf_vsi_enable_port_vlan(struct ice_vsi *vsi)
> +{
> +	if (WARN_ON(!vsi->vf))

I'd use WARN_ON_ONCE(). Otherwise, it may be possible to flood the kernel
log buffer (-> CPU) from userspace.
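
A minimal userspace model of the "warn once" behaviour, assuming a
hypothetical `DEMO_WARN_ON_ONCE()` macro and a demo counter; it is not
the kernel implementation and relies on the GCC/Clang statement
expression extension. The condition is still evaluated and returned on
every call, only the message is limited to the first hit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int demo_warn_count;	/* demo only: messages emitted so far */

/* Hypothetical model: one static flag per call site, as a statement expression. */
#define DEMO_WARN_ON_ONCE(cond) ({					\
	static bool warned_;						\
	bool ret_ = !!(cond);						\
									\
	if (ret_ && !warned_) {						\
		warned_ = true;						\
		demo_warn_count++;					\
		fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
	}								\
	ret_;								\
})

/* Hypothetical caller mirroring the shape of ice_vf_vsi_enable_port_vlan(). */
static bool demo_enable_port_vlan(const void *vf)
{
	if (DEMO_WARN_ON_ONCE(!vf))
		return false;

	return true;
}
```

Repeated calls with a NULL @vf keep failing, but log a single warning.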

> +		return;
> +
> +	ice_port_vlan_on(vsi);
> +}
> +
> +/**
> + * ice_vf_vsi_disable_port_vlan - Clear VSI support for creating port VLAN
> + * @vsi: VF's VSI being configured
> + *
> + * The function should be called after removing port VLAN on VSI
> + * (using VLAN ops)
> + */
> +void ice_vf_vsi_disable_port_vlan(struct ice_vsi *vsi)
> +{
> +	if (WARN_ON(!vsi->vf))

(same)

> +		return;
> +
> +	ice_port_vlan_off(vsi);
> +}

[...]

> +	info->valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_VLAN_VALID |
> +					   ICE_AQ_VSI_PROP_SW_VALID);
> +
> +	ret = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
> +	if (ret)
> +		dev_info(ice_hw_to_dev(hw), "update VSI for port VLAN failed, err %d aq_err %s\n",

(dev_err())
(+ %pe)

> +			 ret, ice_aq_str(hw->adminq.sq_last_status));
> +
> +	kfree(ctxt);
> +	return ret;
> +}
[...]

Thanks,
Olek


* RE: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
  2023-04-21 14:22   ` Alexander Lobakin
@ 2023-04-25  9:17     ` Drewek, Wojciech
  2023-04-26  9:50       ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-25  9:17 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Friday, April 21, 2023 16:23
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
> <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
> <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:06 +0200
> 
> > From: Marcin Szycik <marcin.szycik@intel.com>
> >
> > Introduce new "guard" rule upon FDB entry creation.
> >
> > It matches on src_mac, has valid bit unset, allow_pass_l2 set
> > and has a nop action.
> 
> [...]
> 
> > +static struct ice_rule_query_data *
> > +ice_eswitch_br_guard_rule_create(struct ice_hw *hw, u16 vsi_idx,
> > +				 const unsigned char *mac)
> > +{
> > +	struct ice_adv_rule_info rule_info = { 0 };
> > +	struct ice_rule_query_data *rule;
> > +	struct ice_adv_lkup_elem *list;
> > +	const u16 lkups_cnt = 1;
> > +	int err;
> 
> You can initialize it with -%ENOMEM right here in order to...
> 
> > +
> > +	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
> > +	if (!rule) {
> > +		err = -ENOMEM;
> > +		goto err_exit;
> > +	}
> > +
> > +	list = kcalloc(lkups_cnt, sizeof(*list), GFP_ATOMIC);
> > +	if (!list) {
> > +		err = -ENOMEM;
> > +		goto err_list_alloc;
> > +	}
> 
> ...make those 2 ifs goto-oneliners :3 As...
> 
> > +
> > +	list[0].type = ICE_MAC_OFOS;
> > +	ether_addr_copy(list[0].h_u.eth_hdr.src_addr, mac);
> > +	eth_broadcast_addr(list[0].m_u.eth_hdr.src_addr);
> > +
> > +	rule_info.allow_pass_l2 = true;
> > +	rule_info.sw_act.vsi_handle = vsi_idx;
> > +	rule_info.sw_act.fltr_act = ICE_NOP;
> > +	rule_info.priority = 5;
> > +
> > +	err = ice_add_adv_rule(hw, list, lkups_cnt, &rule_info, rule);
> 
> ...it's overwritten here anyway, so it is safe to init it with an error
> value.

Makes sense, thanks.
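
A minimal userspace sketch of that shape, with hypothetical `demo_`
types and a stand-in for ice_add_adv_rule() that fails on request.
Unlike the patch, the sketch hands the rule back through an out
parameter and frees the lookup list on success, to stay self-contained:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_rule { int id; };
struct demo_lkup { int type; };

/* Hypothetical stand-in for ice_add_adv_rule(); fails with @fail_with if set. */
static int demo_add_rule(struct demo_lkup *list, struct demo_rule *rule,
			 int fail_with)
{
	if (fail_with)
		return fail_with;

	rule->id = list->type;
	return 0;
}

static int demo_guard_rule_create(struct demo_rule **out, int fail_with)
{
	struct demo_rule *rule;
	struct demo_lkup *list;
	int err = -ENOMEM;	/* covers both allocation failures below */

	assert(out);

	rule = calloc(1, sizeof(*rule));
	if (!rule)
		goto err_exit;

	list = calloc(1, sizeof(*list));
	if (!list)
		goto err_list_alloc;

	list->type = 1;

	/* err is overwritten here, so the initial value cannot leak out. */
	err = demo_add_rule(list, rule, fail_with);
	if (err)
		goto err_add_rule;

	free(list);
	*out = rule;
	return 0;

err_add_rule:
	free(list);
err_list_alloc:
	free(rule);
err_exit:
	return err;
}
```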

> 
> > +	if (err)
> > +		goto err_add_rule;
> > +
> > +	return rule;
> > +
> > +err_add_rule:
> > +	kfree(list);
> > +err_list_alloc:
> > +	kfree(rule);
> > +err_exit:
> > +	return ERR_PTR(err);
> > +}
> > +
> >  static struct ice_esw_br_flow *
> >  ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
> >  			   int port_type, const unsigned char *mac)
> >  {
> > -	struct ice_rule_query_data *fwd_rule;
> > +	struct ice_rule_query_data *fwd_rule, *guard_rule;
> >  	struct ice_esw_br_flow *flow;
> >  	int err;
> >
> > @@ -155,10 +202,22 @@ ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
> >  		goto err_fwd_rule;
> >  	}
> >
> > +	guard_rule = ice_eswitch_br_guard_rule_create(hw, vsi_idx, mac);
> > +	if (IS_ERR(guard_rule)) {
> > +		err = PTR_ERR(guard_rule);
> 
> Aaah ok, that's what you meant in the previous mails. I see now.
> You can either leave it like that or there's an alternative -- pick the
> one that you like the most:
> 
> 	guard_rule = ice_eswitch_...
> 	err = PTR_ERR(guard_rule);
> 	if (err) {
> 		...
> 

I like it, less ptr <-> macros

> > +		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
> > +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> > +			err);
> 
> You still can print it via "%pe" + @guard_rule instead of @err :p (same
> with @fwd_rule above)
> 
> > +		goto err_guard_rule;
> > +	}
> > +
> >  	flow->fwd_rule = fwd_rule;
> > +	flow->guard_rule = guard_rule;
> >
> >  	return flow;
> 
> [...]
> 
> > @@ -4624,7 +4628,7 @@ static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
> >   */
> >  static u16
> >  ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts,
> > -	      enum ice_sw_tunnel_type tun_type)
> > +	      struct ice_adv_rule_info *rinfo)
> 
> Can be const I think?

Agree

> 
> >  {
> >  	bool refresh_required = true;
> >  	struct ice_sw_recipe *recp;
> 
> [...]
> 
> > @@ -5075,6 +5082,14 @@ ice_add_sw_recipe(struct ice_hw *hw, struct ice_sw_recipe *rm,
> >  		set_bit(buf[recps].recipe_indx,
> >  			(unsigned long *)buf[recps].recipe_bitmap);
> >  		buf[recps].content.act_ctrl_fwd_priority = rm->priority;
> > +
> > +		if (rm->need_pass_l2)
> > +			buf[recps].content.act_ctrl |=
> > +				ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
> > +
> > +		if (rm->allow_pass_l2)
> > +			buf[recps].content.act_ctrl |=
> > +				ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;
> 
> I don't like these line breaks :s
> 
> 		type_of_content *cont;
> 		...
> 
> 		/* As far as I can see, it can be used above as well */
> 		cont = &buf[recps].content;
> 
> 		if (rm->need_pass_l2)
> 			cont->act_ctrl |= ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
> 		if (rm->allow_pass_l2)
> 			cont->act_ctrl |= ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;
> 
> >  		recps++;
> >  	}
> >
> 
> [...]
> 
> > @@ -6166,6 +6190,11 @@ ice_add_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
> >  		act |= ICE_SINGLE_ACT_VSI_FORWARDING | ICE_SINGLE_ACT_DROP |
> >  		       ICE_SINGLE_ACT_VALID_BIT;
> >  		break;
> > +	case ICE_NOP:
> > +		act |= (rinfo->sw_act.fwd_id.hw_vsi_id <<
> > +			ICE_SINGLE_ACT_VSI_ID_S) & ICE_SINGLE_ACT_VSI_ID_M;
> 
> `FIELD_PREP(ICE_SINGLE_ACT_VSI_ID_M, rinfo->sw_act.fwd_id.hw_vsi_id)`?
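
A rough userspace model of what FIELD_PREP() from linux/bitfield.h
does, assuming a hypothetical mask (the real ICE_SINGLE_ACT_VSI_ID_M
value is not shown in this thread) and the GCC/Clang `__builtin_ctz()`
builtin. The shift is derived from the mask, so the separate `_S`
define is not needed at the use site:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field: 10 bits starting at bit 4. */
#define DEMO_VSI_ID_S	4
#define DEMO_VSI_ID_M	(0x3FFu << DEMO_VSI_ID_S)

/* Simplified model of FIELD_PREP(): shift by the lowest set bit of the mask. */
static inline uint32_t demo_field_prep(uint32_t mask, uint32_t val)
{
	assert(mask != 0);	/* __builtin_ctz(0) is undefined */
	return (val << __builtin_ctz(mask)) & mask;
}

/* Simplified model of FIELD_GET(): the inverse operation. */
static inline uint32_t demo_field_get(uint32_t mask, uint32_t reg)
{
	assert(mask != 0);
	return (reg & mask) >> __builtin_ctz(mask);
}
```

For this mask, `demo_field_prep(DEMO_VSI_ID_M, id)` gives the same
value as the open-coded `(id << DEMO_VSI_ID_S) & DEMO_VSI_ID_M`.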
> 
> > +		act &= ~ICE_SINGLE_ACT_VALID_BIT;
> > +		break;
> >  	default:
> >  		status = -EIO;
> >  		goto err_ice_add_adv_rule;
> > @@ -6446,7 +6475,7 @@ ice_rem_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
> >  			return -EIO;
> >  	}
> >
> > -	rid = ice_find_recp(hw, &lkup_exts, rinfo->tun_type);
> > +	rid = ice_find_recp(hw, &lkup_exts, rinfo);
> >  	/* If did not find a recipe that match the existing criteria */
> >  	if (rid == ICE_MAX_NUM_RECIPES)
> >  		return -EINVAL;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_switch.h b/drivers/net/ethernet/intel/ice/ice_switch.h
> > index c84b56fe84a5..5ecce39cf1f5 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_switch.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_switch.h
> > @@ -191,6 +191,8 @@ struct ice_adv_rule_info {
> >  	u16 vlan_type;
> >  	u16 fltr_rule_id;
> >  	u32 priority;
> > +	u8 need_pass_l2;
> > +	u8 allow_pass_l2;
> 
> They can be either true or false, nothing else, right? I'd make them
> occupy 1 bit per var then:

Correct

> 
> 	u16 need_pass_l2:1;
> 	u16 allow_pass_l2:1;
> 	u16 src_vsi;
> 
> +14 free bits for more flags, no holes (stacked with ::src_vsi).
> 
> >  	u16 src_vsi;
> >  	struct ice_sw_act_ctrl sw_act;
> >  	struct ice_adv_rule_flags_info flags_info;
> > @@ -254,6 +256,9 @@ struct ice_sw_recipe {
> >  	 */
> >  	u8 priority;
> >
> > +	u8 need_pass_l2;
> > +	u8 allow_pass_l2;
> 
> (same with bitfields here, just use u8 :1 instead of u16 here to stack
>  with ::priority)
> 
> > +
> >  	struct list_head rg_list;
> 
> [...]
> 
> Thanks,
> Olek



* Re: [Intel-wired-lan] [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV
  2023-04-17  9:34 ` [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV Wojciech Drewek
  2023-04-19 14:38   ` Alexander Lobakin
@ 2023-04-25 15:26   ` Michal Schmidt
  1 sibling, 0 replies; 46+ messages in thread
From: Michal Schmidt @ 2023-04-25 15:26 UTC (permalink / raw)
  To: Wojciech Drewek, Dave Ertman
  Cc: moderated list:INTEL ETHERNET DRIVERS, netdev

On Mon, Apr 17, 2023 at 11:35 AM Wojciech Drewek
<wojciech.drewek@intel.com> wrote:
>
> From: Dave Ertman <david.m.ertman@intel.com>
>
> There was a change previously to stop SR-IOV and LAG from existing on the
> same interface.  [...]

Why does the subject mention RDMA? The patch does not change the calls
to ice_{set,clear}_rdma_cap.
Did you mean to call it "ice: Remove exclusion code for LAG+SRIOV" ?

Michal



* RE: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
  2023-04-25  9:17     ` Drewek, Wojciech
@ 2023-04-26  9:50       ` Drewek, Wojciech
  2023-04-26 15:24         ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-26  9:50 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Drewek, Wojciech
> Sent: Tuesday, April 25, 2023 11:18
> To: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: RE: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
> 
> 
> 
> > -----Original Message-----
> > From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> > Sent: Friday, April 21, 2023 16:23
> > To: Drewek, Wojciech <wojciech.drewek@intel.com>
> > Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
> > <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
> > <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
> > Subject: Re: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
> >
> > From: Wojciech Drewek <wojciech.drewek@intel.com>
> > Date: Mon, 17 Apr 2023 11:34:06 +0200
> >
> > > From: Marcin Szycik <marcin.szycik@intel.com>
> > >
> > > Introduce new "guard" rule upon FDB entry creation.
> > >
> > > It matches on src_mac, has valid bit unset, allow_pass_l2 set
> > > and has a nop action.
> >
> > [...]
> >

[...]

> >
> > > +	if (err)
> > > +		goto err_add_rule;
> > > +
> > > +	return rule;
> > > +
> > > +err_add_rule:
> > > +	kfree(list);
> > > +err_list_alloc:
> > > +	kfree(rule);
> > > +err_exit:
> > > +	return ERR_PTR(err);
> > > +}
> > > +
> > >  static struct ice_esw_br_flow *
> > >  ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
> > >  			   int port_type, const unsigned char *mac)
> > >  {
> > > -	struct ice_rule_query_data *fwd_rule;
> > > +	struct ice_rule_query_data *fwd_rule, *guard_rule;
> > >  	struct ice_esw_br_flow *flow;
> > >  	int err;
> > >
> > > @@ -155,10 +202,22 @@ ice_eswitch_br_flow_create(struct device *dev, struct ice_hw *hw, u16 vsi_idx,
> > >  		goto err_fwd_rule;
> > >  	}
> > >
> > > +	guard_rule = ice_eswitch_br_guard_rule_create(hw, vsi_idx, mac);
> > > +	if (IS_ERR(guard_rule)) {
> > > +		err = PTR_ERR(guard_rule);
> >
> > Aaah ok, that's what you meant in the previous mails. I see now.
> > You can either leave it like that or there's an alternative -- pick the
> > one that you like the most:
> >
> > 	guard_rule = ice_eswitch_...
> > 	err = PTR_ERR(guard_rule);
> > 	if (err) {
> > 		...
> >
> 
> I like it, less ptr <-> macros

Actually, it won't work: PTR_ERR() would not convert a valid pointer to 0 on success.
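
A simplified userspace model of the helpers from include/linux/err.h
illustrates the point: PTR_ERR() is a plain cast, so a valid pointer
yields its non-zero address, whereas PTR_ERR_OR_ZERO() checks IS_ERR()
first. The helper bodies are a sketch from memory, and
`demo_rule_create()` / `demo_flow_create()` are hypothetical stand-ins:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO	4095

static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	/* Plain cast: a valid pointer yields its (non-zero) address. */
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Hypothetical stand-in for ice_eswitch_br_guard_rule_create(). */
static int *demo_rule_create(bool fail)
{
	int *rule;

	if (fail)
		return ERR_PTR(-ENOMEM);

	rule = calloc(1, sizeof(*rule));
	if (!rule)
		return ERR_PTR(-ENOMEM);

	return rule;
}

/* Returns 0 on success or a negative errno; uses the _OR_ZERO variant. */
static int demo_flow_create(bool fail)
{
	int *rule = demo_rule_create(fail);
	int err = PTR_ERR_OR_ZERO(rule);	/* not PTR_ERR(rule) */

	if (err)
		return err;

	free(rule);
	return 0;
}
```

So the shorter form would presumably need PTR_ERR_OR_ZERO() instead of
PTR_ERR() to keep the `if (err)` check meaningful.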

> 
> > > +		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
> > > +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> > > +			err);
> >
> > You still can print it via "%pe" + @guard_rule instead of @err :p (same
> > with @fwd_rule above)
> >
> > > +		goto err_guard_rule;
> > > +	}
> > > +
> > >  	flow->fwd_rule = fwd_rule;
> > > +	flow->guard_rule = guard_rule;
> > >
> > >  	return flow;
> >
> > [...]
> >
> > > @@ -4624,7 +4628,7 @@ static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
> > >   */
> > >  static u16
> > >  ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts,
> > > -	      enum ice_sw_tunnel_type tun_type)
> > > +	      struct ice_adv_rule_info *rinfo)
> >
> > Can be const I think?
> 
> Agree
> 
> >
> > >  {
> > >  	bool refresh_required = true;
> > >  	struct ice_sw_recipe *recp;
> >
> > [...]
> >
> > > @@ -5075,6 +5082,14 @@ ice_add_sw_recipe(struct ice_hw *hw, struct ice_sw_recipe *rm,
> > >  		set_bit(buf[recps].recipe_indx,
> > >  			(unsigned long *)buf[recps].recipe_bitmap);
> > >  		buf[recps].content.act_ctrl_fwd_priority = rm->priority;
> > > +
> > > +		if (rm->need_pass_l2)
> > > +			buf[recps].content.act_ctrl |=
> > > +				ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
> > > +
> > > +		if (rm->allow_pass_l2)
> > > +			buf[recps].content.act_ctrl |=
> > > +				ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;
> >
> > I don't like these line breaks :s
> >
> > 		type_of_content *cont;
> > 		...
> >
> > 		/* As far as I can see, it can be used above as well */
> > 		cont = &buf[recps].content;
> >
> > 		if (rm->need_pass_l2)
> > 			cont->act_ctrl |= ICE_AQ_RECIPE_ACT_NEED_PASS_L2;
> > 		if (rm->allow_pass_l2)
> > 			cont->act_ctrl |= ICE_AQ_RECIPE_ACT_ALLOW_PASS_L2;
> >
> > >  		recps++;
> > >  	}
> > >
> >
> > [...]
> >
> > > @@ -6166,6 +6190,11 @@ ice_add_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
> > >  		act |= ICE_SINGLE_ACT_VSI_FORWARDING | ICE_SINGLE_ACT_DROP |
> > >  		       ICE_SINGLE_ACT_VALID_BIT;
> > >  		break;
> > > +	case ICE_NOP:
> > > +		act |= (rinfo->sw_act.fwd_id.hw_vsi_id <<
> > > +			ICE_SINGLE_ACT_VSI_ID_S) & ICE_SINGLE_ACT_VSI_ID_M;
> >
> > `FIELD_PREP(ICE_SINGLE_ACT_VSI_ID_M, rinfo->sw_act.fwd_id.hw_vsi_id)`?
> >
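
For reference, a userspace sketch of what the FIELD_PREP() suggestion boils down to: shift the value to the lowest set bit of the mask, then mask it. ex_field_prep() and the EX_* values are made up for illustration only; the real macro lives in <linux/bitfield.h> and expects a compile-time constant mask.

```c
#include <stdint.h>

/* Hypothetical field layout: a 10-bit ID starting at bit 4. */
#define EX_VSI_ID_S	4
#define EX_VSI_ID_M	(0x3FFu << EX_VSI_ID_S)

/* Shift @val up to the lowest set bit of @mask and clip it to the mask.
 * @mask must be non-zero.
 */
static inline uint32_t ex_field_prep(uint32_t mask, uint32_t val)
{
	uint32_t shift = 0;

	while (!((mask >> shift) & 1u))
		shift++;

	return (val << shift) & mask;
}

/* The open-coded form used in the hunk above, for comparison. */
static inline uint32_t ex_open_coded(uint32_t val)
{
	return (val << EX_VSI_ID_S) & EX_VSI_ID_M;
}
```

Both helpers should give the same result for any ID, so the change would only be about readability.
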
> > > +		act &= ~ICE_SINGLE_ACT_VALID_BIT;
> > > +		break;
> > >  	default:
> > >  		status = -EIO;
> > >  		goto err_ice_add_adv_rule;
> > > @@ -6446,7 +6475,7 @@ ice_rem_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups,
> > >  			return -EIO;
> > >  	}
> > >
> > > -	rid = ice_find_recp(hw, &lkup_exts, rinfo->tun_type);
> > > +	rid = ice_find_recp(hw, &lkup_exts, rinfo);
> > >  	/* If did not find a recipe that match the existing criteria */
> > >  	if (rid == ICE_MAX_NUM_RECIPES)
> > >  		return -EINVAL;
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_switch.h b/drivers/net/ethernet/intel/ice/ice_switch.h
> > > index c84b56fe84a5..5ecce39cf1f5 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_switch.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice_switch.h
> > > @@ -191,6 +191,8 @@ struct ice_adv_rule_info {
> > >  	u16 vlan_type;
> > >  	u16 fltr_rule_id;
> > >  	u32 priority;
> > > +	u8 need_pass_l2;
> > > +	u8 allow_pass_l2;
> >
> > They can be either true or false, nothing else, right? I'd make them
> > occupy 1 bit per var then:
> 
> Correct
> 
> >
> > 	u16 need_pass_l2:1;
> > 	u16 allow_pass_l2:1;
> > 	u16 src_vsi;
> >
> > +14 free bits for more flags, no holes (stacked with ::src_vsi).
> >
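
A small sketch of the two layouts, assuming a GCC/Clang-style ABI (bitfield layout is implementation-defined, so this is not a guarantee). The struct and helper names are hypothetical and only a few of the fields around the new flags are reproduced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout from the patch: one byte per flag. */
struct ex_rule_info_bytes {
	uint16_t vlan_type;
	uint16_t fltr_rule_id;
	uint32_t priority;
	uint8_t need_pass_l2;
	uint8_t allow_pass_l2;
	uint16_t src_vsi;
};

/* Suggested layout: one bit per flag, 14 bits left for future flags. */
struct ex_rule_info_bits {
	uint16_t vlan_type;
	uint16_t fltr_rule_id;
	uint32_t priority;
	uint16_t need_pass_l2:1;
	uint16_t allow_pass_l2:1;
	uint16_t src_vsi;
};

/* Setting the flags looks the same as with plain u8 members. */
static inline void ex_set_pass_l2(struct ex_rule_info_bits *info,
				  bool need, bool allow)
{
	info->need_pass_l2 = need;
	info->allow_pass_l2 = allow;
}
```

With this layout the struct should not grow, and the remaining bits of the u16 stay available for more flags.
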
> > >  	u16 src_vsi;
> > >  	struct ice_sw_act_ctrl sw_act;
> > >  	struct ice_adv_rule_flags_info flags_info;
> > > @@ -254,6 +256,9 @@ struct ice_sw_recipe {
> > >  	 */
> > >  	u8 priority;
> > >
> > > +	u8 need_pass_l2;
> > > +	u8 allow_pass_l2;
> >
> > (same with bitfields here, just use u8 :1 instead of u16 here to stack
> >  with ::priority)
> >
> > > +
> > >  	struct list_head rg_list;
> >
> > [...]
> >
> > Thanks,
> > Olek



* RE: [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads
  2023-04-21 14:40   ` Alexander Lobakin
@ 2023-04-26 11:31     ` Drewek, Wojciech
  2023-04-26 15:31       ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-26 11:31 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Friday, April 21, 2023 16:40
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:07 +0200
> 
> > Allow LAG interfaces to be used in bridge offload using
> > netif_is_lag_master. In this case, search for ice netdev in
> > the list of LAG's lower devices.
> >
> > Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
> > ---
> >  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 40 ++++++++++++++++---
> >  1 file changed, 35 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > index 82b5eb2020cd..49381e4bf62a 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > @@ -15,8 +15,21 @@ static const struct rhashtable_params ice_fdb_ht_params = {
> >
> >  static bool ice_eswitch_br_is_dev_valid(const struct net_device *dev)
> >  {
> > -	/* Accept only PF netdev and PRs */
> > -	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev);
> > +	/* Accept only PF netdev, PRs and LAG */
> > +	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
> > +		netif_is_lag_master(dev);
> 
> Nit: usually we align to `return` (7 spaces), not with one tab:
> 
> 	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
> 	       netif_is_lag_master(dev);

I've seen examples of both so either way is ok I think

> 
> > +}
> > +
> > +static struct net_device *
> > +ice_eswitch_br_get_uplnik_from_lag(struct net_device *lag_dev)
> > +{
> > +	struct net_device *lower;
> > +	struct list_head *iter;
> > +
> > +	netdev_for_each_lower_dev(lag_dev, lower, iter)
> > +		if (netif_is_ice(lower))
> > +			return lower;
> 
> Here I think the kernel guidelines would require to have a set of braces
> (each multi-line code block must be enclosed, even if it works without).
> I mean, I wasn't doing it myself, following the rule "as few braces as
> needed to work", but then my colleague showed me the doc :D
> 
> 	for_each_lover(...) {
> 		if (is_ice(lover))
> 			return lover;
> 	}
> 
> On the contrary, this:
> 
> 	for_each_something()
> 		/* Some useful comment */
> 		do_something();
> 
> is not mentioned in the rules as requiring braces :s

Will be fixed

> 
> > +	return NULL;
> >  }
> >
> >  static struct ice_esw_br_port *
> > @@ -26,8 +39,16 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
> >  		struct ice_repr *repr = ice_netdev_to_repr(dev);
> >
> >  		return repr->br_port;
> > -	} else if (netif_is_ice(dev)) {
> > -		struct ice_pf *pf = ice_netdev_to_pf(dev);
> > +	} else if (netif_is_ice(dev) || netif_is_lag_master(dev)) {
> > +		struct net_device *ice_dev = dev;
> > +		struct ice_pf *pf;
> > +
> > +		if (netif_is_lag_master(dev))
> > +			ice_dev = ice_eswitch_br_get_uplnik_from_lag(dev);
> 
> Maybe just reuse @dev instead of one more var?
> Or do it this way:
> 
> 		struct net_device *ice_dev;
> 
> 		...
> 
> 		if (netif_is_lag_master(dev))
> 			ice_dev = ice_eswitch ...
> 		else
> 			ice_dev = dev;
> 		if (!ice_dev)
> 			return NULL;
> 
> Otherwise it's a bit confusing to have `if` in one place and `else`
> (implicit) in another one, at least it took some time for me ._.

Using else makes sense to me
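
Roughly what I have in mind, as a standalone sketch with mock types (struct ex_netdev and the ex_* helpers are hypothetical stand-ins for net_device, netif_is_lag_master() and netif_is_ice(), not the driver code):

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock of a netdev: just enough state for the lookup below. */
struct ex_netdev {
	bool is_lag_master;
	bool is_ice;
	struct ex_netdev *lower[4];	/* lower devices of a LAG master */
	size_t n_lower;
};

/* Return the first ice device among the LAG's lower devices, or NULL. */
static struct ex_netdev *ex_uplink_from_lag(struct ex_netdev *lag)
{
	for (size_t i = 0; i < lag->n_lower; i++) {
		if (lag->lower[i]->is_ice)
			return lag->lower[i];
	}

	return NULL;
}

/* Both branches assign ice_dev, so the NULL check reads naturally. */
static struct ex_netdev *ex_resolve_ice_dev(struct ex_netdev *dev)
{
	struct ex_netdev *ice_dev;

	if (dev->is_lag_master)
		ice_dev = ex_uplink_from_lag(dev);
	else
		ice_dev = dev;

	if (!ice_dev)
		return NULL;

	return ice_dev;
}
```

The loop also carries the braces requested above for the multi-line body.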

> 
> > +		if (!ice_dev)
> > +			return NULL;
> > +
> > +		pf = ice_netdev_to_pf(ice_dev);
> >
> >  		return pf->br_port;
> >  	}
> > @@ -719,7 +740,16 @@ ice_eswitch_br_port_link(struct ice_esw_br_offloads *br_offloads,
> >
> >  		err = ice_eswitch_br_vf_repr_port_init(bridge, repr);
> >  	} else {
> > -		struct ice_pf *pf = ice_netdev_to_pf(dev);
> > +		struct net_device *ice_dev = dev;
> > +		struct ice_pf *pf;
> > +
> > +		if (netif_is_lag_master(dev))
> > +			ice_dev = ice_eswitch_br_get_uplnik_from_lag(dev);
> 
> (same)
> 
> > +
> > +		if (!ice_dev)
> > +			return 0;
> > +
> > +		pf = ice_netdev_to_pf(ice_dev);
> >
> >  		err = ice_eswitch_br_uplink_port_init(bridge, pf);
> >  	}
> 
> Thanks,
> Olek


* Re: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
  2023-04-26  9:50       ` Drewek, Wojciech
@ 2023-04-26 15:24         ` Alexander Lobakin
  2023-04-27  7:24           ` Drewek, Wojciech
  0 siblings, 1 reply; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-26 15:24 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Wed, 26 Apr 2023 11:50:56 +0200

> 
> 
>> -----Original Message-----
>> From: Drewek, Wojciech
>> Sent: Tuesday, April 25, 2023 11:18
>> To: Lobakin, Aleksander <aleksander.lobakin@intel.com>

[...]

>>> 	guard_rule = ice_eswitch_...
>>> 	err = PTR_ERR(guard_rule);
>>> 	if (err) {
>>> 		...
>>>
>>
>> I like it, less ptr <-> macros
> 
> Actually it won't work, PTR_ERR would not convert pointer to 0 in case of success.

Ooops, PTR_ERR_OR_ZERO() then? I forgot there are several macros for
different cases =\
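
To illustrate the difference, here is a userspace model of the <linux/err.h> helpers. The ex_* names are stand-ins written for this sketch, not the kernel macros themselves, and the casts assume pointers and longs have the same width.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_MAX_ERRNO	4095

/* Encode a negative errno in a pointer value. */
static inline void *ex_err_ptr(long error)
{
	return (void *)error;
}

/* Plain cast: for a valid pointer this is the address, not 0. */
static inline long ex_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top EX_MAX_ERRNO values of the address space. */
static inline bool ex_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-EX_MAX_ERRNO;
}

/* Negative errno for an error pointer, 0 for anything else. */
static inline int ex_ptr_err_or_zero(const void *ptr)
{
	return ex_is_err(ptr) ? (int)ex_ptr_err(ptr) : 0;
}
```

So `err = PTR_ERR_OR_ZERO(guard_rule); if (err) ...` behaves as intended, while the plain PTR_ERR() variant would take the error branch on success too.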

> 
>>
>>>> +		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
>>>> +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
>>>> +			err);
>>>
>>> You still can print it via "%pe" + @guard_rule instead of @err :p (same
>>> with @fwd_rule above)
[...]

Thanks,
Olek


* Re: [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads
  2023-04-26 11:31     ` Drewek, Wojciech
@ 2023-04-26 15:31       ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-04-26 15:31 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Wed, 26 Apr 2023 13:31:17 +0200

> 
> 
>> -----Original Message-----
>> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Sent: Friday, April 21, 2023 16:40
>> To: Drewek, Wojciech <wojciech.drewek@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
>> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
>> Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: Re: [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads
>>
>> From: Wojciech Drewek <wojciech.drewek@intel.com>
>> Date: Mon, 17 Apr 2023 11:34:07 +0200
>>
>>> Allow LAG interfaces to be used in bridge offload using
>>> netif_is_lag_master. In this case, search for ice netdev in
>>> the list of LAG's lower devices.
>>>
>>> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
>>> ---
>>>  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 40 ++++++++++++++++---
>>>  1 file changed, 35 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
>>> index 82b5eb2020cd..49381e4bf62a 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
>>> @@ -15,8 +15,21 @@ static const struct rhashtable_params ice_fdb_ht_params = {
>>>
>>>  static bool ice_eswitch_br_is_dev_valid(const struct net_device *dev)
>>>  {
>>> -	/* Accept only PF netdev and PRs */
>>> -	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev);
>>> +	/* Accept only PF netdev, PRs and LAG */
>>> +	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
>>> +		netif_is_lag_master(dev);
>>
>> Nit: usually we align to `return` (7 spaces), not with one tab:
>>
>> 	return ice_is_port_repr_netdev(dev) || netif_is_ice(dev) ||
>> 	       netif_is_lag_master(dev);
> 
> I've seen examples of both so either way is ok I think

Correct, that's more of my personal preference :D Or maybe I've seen a couple of
times that either checkpatch or something else complained about the second
line not being aligned to the first one with `return`.

[...]

Thanks,
Olek


* RE: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
  2023-04-26 15:24         ` Alexander Lobakin
@ 2023-04-27  7:24           ` Drewek, Wojciech
  0 siblings, 0 replies; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-27  7:24 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Wednesday, April 26, 2023 17:25
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Wed, 26 Apr 2023 11:50:56 +0200
> 
> >
> >
> >> -----Original Message-----
> >> From: Drewek, Wojciech
> >> Sent: Tuesday, April 25, 2023 11:18
> >> To: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> 
> [...]
> 
> >>> 	guard_rule = ice_eswitch_...
> >>> 	err = PTR_ERR(guard_rule);
> >>> 	if (err) {
> >>> 		...
> >>>
> >>
> >> I like it, less ptr <-> macros
> >
> > Actually it won't work, PTR_ERR would not convert pointer to 0 in case of success.
> 
> Ooops, PTR_ERR_OR_ZERO() then? I forgot there are several macros for
> different cases =\

Cool, this is what we needed :)

> 
> >
> >>
> >>>> +		dev_err(dev, "Failed to create eswitch bridge %sgress guard rule, err: %d\n",
> >>>> +			port_type == ICE_ESWITCH_BR_UPLINK_PORT ? "e" : "in",
> >>>> +			err);
> >>>
> >>> You still can print it via "%pe" + @guard_rule instead of @err :p (same
> >>> with @fwd_rule above)
> [...]
> 
> Thanks,
> Olek


* RE: [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode
  2023-04-21 15:25   ` Alexander Lobakin
@ 2023-04-27 10:28     ` Drewek, Wojciech
  2023-05-08 14:09       ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-04-27 10:28 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Friday, April 21, 2023 17:25
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
> <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
> <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:08 +0200
> 
> > From: Marcin Szycik <marcin.szycik@intel.com>
> >
> > Add support for matching on VLAN tag in bridge offloads.
> > Currently only trunk mode is supported.
> >
> > To enable VLAN filtering (existing FDB entries will be deleted):
> > ip link set $BR type bridge vlan_filtering 1
> >
> > To add VLANs to bridge in trunk mode:
> > bridge vlan add dev $PF1 vid 110-111
> > bridge vlan add dev $VF1_PR vid 110-111
> >
> > Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
> > ---
> >  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 319 +++++++++++++++++-
> >  .../net/ethernet/intel/ice/ice_eswitch_br.h   |  12 +
> >  2 files changed, 317 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > index 49381e4bf62a..56d36e397b12 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > @@ -59,13 +59,19 @@ ice_eswitch_br_netdev_to_port(struct net_device *dev)
> >  static void
> >  ice_eswitch_br_ingress_rule_setup(struct ice_adv_lkup_elem *list,
> >  				  struct ice_adv_rule_info *rule_info,
> > -				  const unsigned char *mac,
> > +				  const unsigned char *mac, bool vlan, u16 vid,
> 
> Could we use one combined argument? Doesn't `!!vid == !!vlan`? VID 0 is
> reserved IIRC...
> 
> (same in all the places below)

Makes sense

> 
> >  				  u8 pf_id, u16 vf_vsi_idx)
> >  {
> >  	list[0].type = ICE_MAC_OFOS;
> >  	ether_addr_copy(list[0].h_u.eth_hdr.dst_addr, mac);
> >  	eth_broadcast_addr(list[0].m_u.eth_hdr.dst_addr);
> 
> [...]
> 
> > @@ -344,10 +389,33 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
> >  	struct device *dev = ice_pf_to_dev(pf);
> >  	struct ice_esw_br_fdb_entry *fdb_entry;
> >  	struct ice_esw_br_flow *flow;
> > +	struct ice_esw_br_vlan *vlan;
> >  	struct ice_hw *hw = &pf->hw;
> > +	bool add_vlan = false;
> >  	unsigned long event;
> >  	int err;
> >
> > +	/* FIXME: untagged filtering is not yet supported
> > +	 */
> 
> Shouldn't be present in release code I believe. I mean, the sentence is
> fine (just don't forget dot at the end), but without "FIXME:". And it
> can be one-liner.

Sure

> 
> > +	if (!(bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING) && vid)
> > +		return;
> 
> [...]
> 
> > +static void
> > +ice_eswitch_br_vlan_filtering_set(struct ice_esw_br *bridge, bool enable)
> > +{
> > +	bool filtering = bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING;
> > +
> > +	if (filtering == enable)
> > +		return;
> 
> 	if (enable == !!(bridge->flags & ICE_ESWITCH_BR_VLAN_FILTERING))
> 
> ?

I like it
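
A standalone sketch of that comparison (the flag value and the ex_* names are hypothetical; the flag is deliberately placed off bit 0 to show why the `!!` is needed when comparing against a bool):

```c
#include <stdbool.h>

/* Hypothetical flag, deliberately not on bit 0. */
#define EX_BR_VLAN_FILTERING	(1u << 3)

/* True when the stored flag already matches the requested state.
 * Without the !!, `enable` (0 or 1) would be compared against 0 or 8.
 */
static bool ex_filtering_matches(unsigned int flags, bool enable)
{
	return enable == !!(flags & EX_BR_VLAN_FILTERING);
}

/* Returns the updated flags word; a no-op when nothing changes. */
static unsigned int ex_filtering_set(unsigned int flags, bool enable)
{
	if (ex_filtering_matches(flags, enable))
		return flags;

	/* The driver would flush its FDB entries at this point. */
	if (enable)
		flags |= EX_BR_VLAN_FILTERING;
	else
		flags &= ~EX_BR_VLAN_FILTERING;

	return flags;
}
```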

> 
> > +
> > +	ice_eswitch_br_fdb_flush(bridge);
> > +	if (enable)
> > +		bridge->flags |= ICE_ESWITCH_BR_VLAN_FILTERING;
> > +	else
> > +		bridge->flags &= ~ICE_ESWITCH_BR_VLAN_FILTERING;
> > +}
> 
> [...]
> 
> > +	port = xa_load(&bridge->ports, vsi_idx);
> > +	if (!port)
> > +		return -EINVAL;
> > +
> > +	vlan = xa_load(&port->vlans, vid);
> > +	if (vlan) {
> > +		if (vlan->flags == flags)
> > +			return 0;
> > +
> > +		ice_eswitch_br_vlan_cleanup(port, vlan);
> > +	}
> > +
> > +	vlan = ice_eswitch_br_vlan_create(vid, flags, port);
> > +	if (IS_ERR(vlan)) {
> > +		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");
> 
> FYI, there's NL_SET_ERR_MSG_FMT_MOD() landed recently (a couple releases
> back), which supports format strings. E.g. you could pass VID, VSI ID,
> flags etc. there to have more meaningful output (right in userspace).

Sure, I guess I can improve log msgs in other patches now.

> 
> > +		return PTR_ERR(vlan);
> > +	}
> > +
> > +	return 0;
> > +}
> 
> [...]
> 
> > +static int
> > +ice_eswitch_br_port_obj_add(struct net_device *netdev, const void *ctx,
> > +			    const struct switchdev_obj *obj,
> > +			    struct netlink_ext_ack *extack)
> > +{
> > +	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
> > +	struct switchdev_obj_port_vlan *vlan;
> > +	int err;
> > +
> > +	if (!br_port)
> > +		return -EINVAL;
> > +
> > +	switch (obj->id) {
> > +	case SWITCHDEV_OBJ_ID_PORT_VLAN:
> > +		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
> > +		err = ice_eswitch_br_port_vlan_add(br_port->bridge,
> > +						   br_port->vsi_idx, vlan->vid,
> > +						   vlan->flags, extack);
> 
> return right here? You have `default` in the switch block, so the
> compiler shouldn't complain if you remove it from the end of the func.

Sure
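
Something along these lines, shown as a standalone sketch (the enum, ex_vlan_add() and its VID range check are placeholders for illustration, not the driver functions):

```c
#include <errno.h>

enum ex_obj_id {
	EX_OBJ_PORT_VLAN,
	EX_OBJ_OTHER,
};

/* Placeholder for the real VLAN add: accept VIDs 1..4094 only. */
static int ex_vlan_add(unsigned short vid)
{
	return (vid >= 1 && vid <= 4094) ? 0 : -EINVAL;
}

/* Every case returns, so no trailing `return err;` and no local @err. */
static int ex_port_obj_add(enum ex_obj_id id, unsigned short vid)
{
	switch (id) {
	case EX_OBJ_PORT_VLAN:
		return ex_vlan_add(vid);
	default:
		return -EOPNOTSUPP;
	}
}
```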

> 
> > +		break;
> > +	default:
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	return err;
> > +}
> > +
> > +static int
> > +ice_eswitch_br_port_obj_del(struct net_device *netdev, const void *ctx,
> > +			    const struct switchdev_obj *obj)
> > +{
> > +	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
> > +	struct switchdev_obj_port_vlan *vlan;
> > +
> > +	if (!br_port)
> > +		return -EINVAL;
> > +
> > +	switch (obj->id) {
> > +	case SWITCHDEV_OBJ_ID_PORT_VLAN:
> > +		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
> > +		ice_eswitch_br_port_vlan_del(br_port->bridge, br_port->vsi_idx,
> > +					     vlan->vid);
> 
> (same)
> 
> > +		break;
> > +	default:
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +ice_eswitch_br_port_obj_attr_set(struct net_device *netdev, const void *ctx,
> > +				 const struct switchdev_attr *attr,
> > +				 struct netlink_ext_ack *extack)
> > +{
> > +	struct ice_esw_br_port *br_port = ice_eswitch_br_netdev_to_port(netdev);
> > +
> > +	if (!br_port)
> > +		return -EINVAL;
> > +
> > +	switch (attr->id) {
> > +	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
> > +		ice_eswitch_br_vlan_filtering_set(br_port->bridge,
> > +						  attr->u.vlan_filtering);
> 
> (and here)
> 
> > +		break;
> > +	default:
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	return 0;
> > +}
> 
> [...]
> 
> > +	br_offloads->switchdev_blk.notifier_call =
> > +		ice_eswitch_br_event_blocking;
> 
> Oh, you have two usages of ->switchdev_blk here, so you can add an
> intermediate variable to avoid line breaking, which would also shorten
> the line below :D
> 
> 	nb = &br_offloads->switchdev_blk;
> 	nb->notifier_call = ice_eswitch_br_event_blocking;
> 	...

Hmmm, I feel like it is more readable right now. It's clear that we're registering
switchdev blocking notifier block (switchdev_blk). Introducing generic variable (nb)
might be a bit ambiguous IMO. So if you have nothing against it, I'd leave it as it is.

> 
> > +	err = register_switchdev_blocking_notifier(&br_offloads->switchdev_blk);
> > +	if (err) {
> > +		dev_err(dev,
> > +			"Failed to register bridge blocking switchdev notifier\n");
> > +		goto err_reg_switchdev_blk;
> > +	}
> > +
> >  	br_offloads->netdev_nb.notifier_call = ice_eswitch_br_port_event;
> >  	err = register_netdevice_notifier(&br_offloads->netdev_nb);
> 
> (here the same, but no line breaks, so up to you. You could reuse the
>  same variable or leave it as it is)

(same here)

> 
> >  	if (err) {
> 
> [...]
> 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > index 73ad81bad655..cf3e2615a62a 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > @@ -42,10 +42,16 @@ struct ice_esw_br_port {
> >  	enum ice_esw_br_port_type type;
> >  	struct ice_vsi *vsi;
> >  	u16 vsi_idx;
> > +	struct xarray vlans;
> 
> Hmm, I feel like you can make ::type u16 and then stack it with
> ::vsi_idx, so that you avoid a hole here.
> 
> > +};
> > +
> > +enum {
> > +	ICE_ESWITCH_BR_VLAN_FILTERING = BIT(0),
> >  };
> >
> >  struct ice_esw_br {
> >  	struct ice_esw_br_offloads *br_offloads;
> > +	int flags;
> 
> Unsigned types fit flags better I think?

Yup

> 
> >  	int ifindex;
> 
> (BTW, ifindex is also usually unsigned unless it's not an error)

It is defined as int in net_device, so I don’t know

> 
> >
> >  	struct xarray ports;
> [...]
> 
> Thanks,
> Olek


* Re: [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode
  2023-04-27 10:28     ` Drewek, Wojciech
@ 2023-05-08 14:09       ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-05-08 14:09 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Thu, 27 Apr 2023 12:28:21 +0200

> 
> 
>> -----Original Message-----
>> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Sent: Friday, April 21, 2023 17:25
>> To: Drewek, Wojciech <wojciech.drewek@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Lobakin, Aleksander <aleksander.lobakin@intel.com>; Ertman, David M
>> <david.m.ertman@intel.com>; michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel
>> <pawel.chmielewski@intel.com>; Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: Re: [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode

[...]

>>> +		break;
>>> +	default:
>>> +		return -EOPNOTSUPP;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>
>> [...]
>>
>>> +	br_offloads->switchdev_blk.notifier_call =
>>> +		ice_eswitch_br_event_blocking;
>>
>> Oh, you have two usages of ->switchdev_blk here, so you can add an
>> intermediate variable to avoid line breaking, which would also shorten
>> the line below :D
>>
>> 	nb = &br_offloads->switchdev_blk;
>> 	nb->notifier_call = ice_eswitch_br_event_blocking;
>> 	...
> 
> Hmmm, I feel like it is more readable right now. It's clear that we're registering
> switchdev blocking notifier block (switchdev_blk). Introducing generic variable (nb)
> might be a bit ambiguous IMO. So if you have nothing against it, I'd leave it as it is.

Noprob, up to you :)

[...]

>>>  	int ifindex;
>>
>> (BTW, ifindex is also usually unsigned unless it's not an error)
> 
> It is defined as int in net_device, so I don’t know

I know, but back then people didn't care much and usually defined
everything that doesn't need an explicit size as `int` :D No interface
has a negative index; indexes start from 1, which is reserved for
loopback, so even 0 is not used. Up to you anyway, I just usually don't
use a signed type when it's not needed. But it's more of a personal preference.

[...]

Thanks,
Olek


* RE: [PATCH net-next 10/12] ice: implement static version of ageing
  2023-04-21 16:22   ` Alexander Lobakin
@ 2023-05-09 10:55     ` Drewek, Wojciech
  2023-05-09 14:55       ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-05-09 10:55 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

Hi Olek

Sorry for the late response, I didn't manage to answer all your comments before my vacation :)
Will continue this week.

> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Friday, April 21, 2023 18:23
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 10/12] ice: implement static version of ageing
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:10 +0200
> 
> > From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> >
> > Remove fdb entries always when ageing time expired.
> >
> > Allow user to set ageing time using port object attribute.
> >
> > Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> > ---
> >  .../net/ethernet/intel/ice/ice_eswitch_br.c   | 46 +++++++++++++++++++
> >  .../net/ethernet/intel/ice/ice_eswitch_br.h   | 11 +++++
> >  2 files changed, 57 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > index a21eca5088f7..6c3144f98100 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.c
> > @@ -8,6 +8,8 @@
> >  #include "ice_vlan.h"
> >  #include "ice_vf_vsi_vlan_ops.h"
> >
> > +#define ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS 1000
> 
> I think you can define it without '_MS' and as msecs_to_jiffies(1000)
> right here, so that you wouldn't need to convert it at use sites (it's
> more expensive to do there in terms of chars vs line width).

Makes sense

> 
> > +
> >  static const struct rhashtable_params ice_fdb_ht_params = {
> >  	.key_offset = offsetof(struct ice_esw_br_fdb_entry, data),
> >  	.key_len = sizeof(struct ice_esw_br_fdb_data),
> > @@ -440,6 +442,7 @@ ice_eswitch_br_fdb_entry_create(struct net_device *netdev,
> >  	fdb_entry->br_port = br_port;
> >  	fdb_entry->flow = flow;
> >  	fdb_entry->dev = netdev;
> > +	fdb_entry->last_use = jiffies;
> >  	event = SWITCHDEV_FDB_ADD_TO_BRIDGE;
> >
> >  	if (added_by_user) {
> > @@ -838,6 +841,10 @@ ice_eswitch_br_port_obj_attr_set(struct net_device *netdev, const void *ctx,
> >  		ice_eswitch_br_vlan_filtering_set(br_port->bridge,
> >  						  attr->u.vlan_filtering);
> >  		break;
> > +	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
> > +		br_port->bridge->ageing_time =
> > +			clock_t_to_jiffies(attr->u.ageing_time);
> 
> This is why reviews also teach the reviewer himself: I never knew of
> clock_t and that userspace has its own ticks, which we have to convert O_.
> 
> (sounds as a joke BTW, why not just use ms/us/ns everywhere, "tick" is
>  something very intimate/internal)
> 
> > +		break;
> >  	default:
> >  		return -EOPNOTSUPP;
> >  	}
> 
> [...]
> 
> > +	if (!bridge)
> > +		return;
> > +
> > +	rtnl_lock();
> > +	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
> > +		if (entry->flags & ICE_ESWITCH_BR_FDB_ADDED_BY_USER)
> > +			continue;
> > +
> > +		if (time_is_before_jiffies(entry->last_use +
> > +					   bridge->ageing_time))
> > +			ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge,
> > +								    entry);
> 
> Maybe invert the condition to give a bit more space for arguments?
> 
> 		if (time_is_after_eq_jiffies(entry->last_use +
> 					     bridge->ageing_time))
> 			continue;
> 
> 		ice_eswitch_br_fdb_entry_notify_and_cleanup(bridge, entry);
> 	}

sure
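
For the record, the inverted check is the exact complement of the original one, also across a counter wrap. A userspace sketch of the comparison (ex_jiffies and the ex_* helpers are stand-ins modelled on the jiffies helpers, not the kernel definitions):

```c
#include <stdbool.h>

/* Stand-in for the kernel's free-running tick counter. */
static unsigned long ex_jiffies;

/* Wrap-safe "a is later than b", in the spirit of time_after(). */
static bool ex_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* True once @t lies in the past relative to ex_jiffies. */
static bool ex_time_is_before_jiffies(unsigned long t)
{
	return ex_time_after(ex_jiffies, t);
}

/* Logical complement: @t is now or still in the future. */
static bool ex_time_is_after_eq_jiffies(unsigned long t)
{
	return !ex_time_after(ex_jiffies, t);
}

/* An entry expires once last_use + ageing_time has passed. */
static bool ex_entry_expired(unsigned long last_use, unsigned long ageing_time)
{
	if (ex_time_is_after_eq_jiffies(last_use + ageing_time))
		return false;

	return true;
}
```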

> 
> 
> > +	}
> > +	rtnl_unlock();
> > +}
> > +
> > +static void ice_eswitch_br_update_work(struct work_struct *work)
> > +{
> > +	struct ice_esw_br_offloads *br_offloads =
> > +		ice_work_to_br_offloads(work);
> 
> Assign it in a separate line pls :s

ok

> 
> > +
> > +	ice_eswitch_br_update(br_offloads);
> > +
> > +	queue_delayed_work(br_offloads->wq, &br_offloads->update_work,
> > +			   msecs_to_jiffies(ICE_ESW_BRIDGE_UPDATE_INTERVAL_MS));
> > +}
> [...]
> 
> Thanks,
> Olek


* RE: [PATCH net-next 09/12] ice: implement bridge port vlan
  2023-04-21 16:35   ` Alexander Lobakin
@ 2023-05-09 11:25     ` Drewek, Wojciech
  2023-05-09 15:06       ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-05-09 11:25 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Friday, April 21, 2023 18:35
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 09/12] ice: implement bridge port vlan
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:09 +0200
> 
> > From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> >
> > Port VLAN in this case means push and pop VLAN action on specific vid.
> > There are a few limitations in hardware:
> > - push and pop can't be used separately
> > - if port VLAN is used there can't be any trunk VLANs, because pop
> >   action is done on all traffic received by VSI in port VLAN mode
> > - port VLAN mode on uplink port isn't supported
> 
> [...]
> 
> > @@ -610,11 +612,26 @@ ice_eswitch_br_vlan_filtering_set(struct ice_esw_br *bridge, bool enable)
> >  		bridge->flags &= ~ICE_ESWITCH_BR_VLAN_FILTERING;
> >  }
> >
> > +static void
> > +ice_eswitch_br_clear_pvid(struct ice_esw_br_port *port)
> > +{
> > +	struct ice_vsi_vlan_ops *vlan_ops =
> > +		ice_get_compat_vsi_vlan_ops(port->vsi);
> > +
> 
> Deref in a separate line to avoid breaking?

sure

> 
> > +	vlan_ops->clear_port_vlan(port->vsi);
> > +
> > +	ice_vf_vsi_disable_port_vlan(port->vsi);
> > +
> > +	port->pvid = 0;
> > +}
> > +
> >  static void
> >  ice_eswitch_br_vlan_cleanup(struct ice_esw_br_port *port,
> >  			    struct ice_esw_br_vlan *vlan)
> >  {
> >  	xa_erase(&port->vlans, vlan->vid);
> > +	if (port->pvid == vlan->vid)
> > +		ice_eswitch_br_clear_pvid(port);
> >  	kfree(vlan);
> >  }
> >
> > @@ -627,9 +644,50 @@ static void ice_eswitch_br_port_vlans_flush(struct ice_esw_br_port *port)
> >  		ice_eswitch_br_vlan_cleanup(port, vlan);
> >  }
> >
> > +static int
> > +ice_eswitch_br_set_pvid(struct ice_esw_br_port *port,
> > +			struct ice_esw_br_vlan *vlan)
> > +{
> > +	struct ice_vlan port_vlan = ICE_VLAN(ETH_P_8021Q, vlan->vid, 0);
> > +	struct device *dev = ice_pf_to_dev(port->vsi->back);
> > +	struct ice_vsi_vlan_ops *vlan_ops;
> > +	int err;
> > +
> > +	if (port->pvid == vlan->vid || vlan->vid == 1)
> > +		return 0;
> > +
> > +	/* Setting port vlan on uplink isn't supported by hw */
> > +	if (port->type == ICE_ESWITCH_BR_UPLINK_PORT)
> > +		return -EOPNOTSUPP;
> > +
> > +	if (port->pvid) {
> > +		dev_info(dev,
> 
> dev_err()?

To me it's not an error, port vlan is already configured

> 
> > +			 "Port VLAN (vsi=%u, vid=%u) already exists on the port, remove it before adding new one\n",
> > +			 port->vsi_idx, port->pvid);
> > +		return -EEXIST;
> 
> Hmm, isn't -EBUSY more common for such cases?
> 
> (below as well)

I don't think so, user is trying to configure something that is already done.

> 
> > +	}
> > +
> > +	ice_vf_vsi_enable_port_vlan(port->vsi);
> 
> [...]
> 
> > @@ -639,14 +697,29 @@ ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
> >
> >  	vlan->vid = vid;
> >  	vlan->flags = flags;
> > +	if ((flags & BRIDGE_VLAN_INFO_PVID) &&
> > +	    (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
> > +		err = ice_eswitch_br_set_pvid(port, vlan);
> > +		if (err)
> > +			goto err_set_pvid;
> > +	} else if ((flags & BRIDGE_VLAN_INFO_PVID) ||
> > +		   (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
> > +		dev_info(dev, "VLAN push and pop are supported only simultaneously\n");
> 
> (same for dev_err(), as well as below)


Again, is this really an error? We just don't support such a case.
> 
> > +		return ERR_PTR(-EOPNOTSUPP);
> > +	}
> 
> [...]
> 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > index cf3e2615a62a..b6eef068ea81 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch_br.h
> > @@ -43,6 +43,7 @@ struct ice_esw_br_port {
> >  	struct ice_vsi *vsi;
> >  	u16 vsi_idx;
> >  	struct xarray vlans;
> > +	u16 pvid;
> 
> Oh, or you can just stack ::vsi_idx with ::pvid here to avoid spawning
> holes.

Sure
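
For reference, a minimal userspace sketch of the hole in question. The struct names and fake_xarray are hypothetical stand-ins (the real struct ice_esw_br_port has more members and the real struct xarray looks different); only the relative placement of the two u16 fields is taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xarray; only its pointer alignment matters. */
struct fake_xarray {
	void *head;
	unsigned long flags;
};

/* Layout as posted: one u16 on each side of the xarray. */
struct port_as_posted {
	void *vsi;
	uint16_t vsi_idx;
	struct fake_xarray vlans;
	uint16_t pvid;
};

/* Layout with the two u16 fields stacked next to each other. */
struct port_stacked {
	void *vsi;
	uint16_t vsi_idx;
	uint16_t pvid;
	struct fake_xarray vlans;
};

/* Adjacent members of the same type need no padding between them. */
static_assert(offsetof(struct port_stacked, pvid) ==
	      offsetof(struct port_stacked, vsi_idx) + sizeof(uint16_t),
	      "pvid should directly follow vsi_idx");

/* Padding bytes = total size minus the sum of the member sizes. */
size_t padding_as_posted(void)
{
	return sizeof(struct port_as_posted) -
	       (sizeof(void *) + 2 * sizeof(uint16_t) +
		sizeof(struct fake_xarray));
}

size_t padding_stacked(void)
{
	return sizeof(struct port_stacked) -
	       (sizeof(void *) + 2 * sizeof(uint16_t) +
		sizeof(struct fake_xarray));
}
```

On the usual ABIs the stacked layout should have less padding, because each lone u16 in front of a pointer-aligned member forces its own hole. On the real struct, pahole would be the tool to confirm this.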

> 
> >  };
> >
> >  enum {
> > diff --git a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
> > index b1ffb81893d4..447b4e6ef7e4 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_vf_vsi_vlan_ops.c
> > @@ -21,6 +21,108 @@ noop_vlan(struct ice_vsi __always_unused *vsi)
> >  	return 0;
> >  }
> >
> > +static void ice_port_vlan_on(struct ice_vsi *vsi)
> > +{
> > +	struct ice_vsi_vlan_ops *vlan_ops;
> > +	struct ice_pf *pf = vsi->back;
> > +
> > +	if (ice_is_dvm_ena(&pf->hw)) {
> > +		vlan_ops = &vsi->outer_vlan_ops;
> > +
> > +		/* setup outer VLAN ops */
> > +		vlan_ops->set_port_vlan = ice_vsi_set_outer_port_vlan;
> > +		vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
> > +		vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
> > +		vlan_ops->ena_rx_filtering =
> > +			ice_vsi_ena_rx_vlan_filtering;
> > +
> > +		/* setup inner VLAN ops */
> > +		vlan_ops = &vsi->inner_vlan_ops;
> > +		vlan_ops->add_vlan = noop_vlan_arg;
> > +		vlan_ops->del_vlan = noop_vlan_arg;
> > +		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
> > +		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
> > +		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
> > +		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
> > +	} else {
> > +		vlan_ops = &vsi->inner_vlan_ops;
> > +
> > +		vlan_ops->set_port_vlan = ice_vsi_set_inner_port_vlan;
> > +		vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
> > +		vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
> > +		vlan_ops->ena_rx_filtering =
> > +			ice_vsi_ena_rx_vlan_filtering;
> > +	}
> 
> ->ena_rx_filtering is filled with just one possible value, so it could
> be done outside ifs.

Agree

> 
> > +}
> > +
> > +static void ice_port_vlan_off(struct ice_vsi *vsi)
> > +{
> > +	struct ice_vsi_vlan_ops *vlan_ops;
> > +	struct ice_pf *pf = vsi->back;
> > +
> > +	if (ice_is_dvm_ena(&pf->hw)) {
> > +		/* setup inner VLAN ops */
> > +		vlan_ops = &vsi->inner_vlan_ops;
> > +
> > +		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
> > +		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
> > +		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
> > +		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
> > +
> > +		vlan_ops = &vsi->outer_vlan_ops;
> > +
> > +		vlan_ops->del_vlan = ice_vsi_del_vlan;
> > +		vlan_ops->ena_stripping = ice_vsi_ena_outer_stripping;
> > +		vlan_ops->dis_stripping = ice_vsi_dis_outer_stripping;
> > +		vlan_ops->ena_insertion = ice_vsi_ena_outer_insertion;
> > +		vlan_ops->dis_insertion = ice_vsi_dis_outer_insertion;
> > +	} else {
> > +		vlan_ops = &vsi->inner_vlan_ops;
> > +
> > +		vlan_ops->del_vlan = ice_vsi_del_vlan;
> > +		vlan_ops->ena_stripping = ice_vsi_ena_inner_stripping;
> > +		vlan_ops->dis_stripping = ice_vsi_dis_inner_stripping;
> > +		vlan_ops->ena_insertion = ice_vsi_ena_inner_insertion;
> > +		vlan_ops->dis_insertion = ice_vsi_dis_inner_insertion;
> > +	}
> 
> The whole ->inner_vlan_ops is filled with the same values, the only
> difference is ->del_vlan, which can be left in `else`, the rest can be
> set up unconditionally.

Makes sense
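
Roughly, the reworked ice_port_vlan_off() could take the shape below. This is only a userspace sketch of the control flow: the struct and function names are hypothetical stand-ins, and each op is a tag string instead of the real function pointer, so that the result can be inspected:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct ice_vsi_vlan_ops (tags, not callbacks). */
struct vlan_ops {
	const char *del_vlan;
	const char *ena_stripping;
	const char *dis_stripping;
	const char *ena_insertion;
	const char *dis_insertion;
	const char *ena_rx_filtering;
};

/* Hypothetical stand-in for the relevant part of struct ice_vsi. */
struct vsi {
	struct vlan_ops inner_vlan_ops;
	struct vlan_ops outer_vlan_ops;
};

/* dvm: double VLAN mode enabled; pruning: VF VLAN pruning flag set. */
void port_vlan_off(struct vsi *vsi, bool dvm, bool pruning)
{
	struct vlan_ops *vlan_ops = &vsi->inner_vlan_ops;

	/* Inner stripping/insertion ops are the same in both modes. */
	vlan_ops->ena_stripping = "ena_inner_stripping";
	vlan_ops->dis_stripping = "dis_inner_stripping";
	vlan_ops->ena_insertion = "ena_inner_insertion";
	vlan_ops->dis_insertion = "dis_inner_insertion";

	if (dvm) {
		/* Only DVM touches the outer ops. */
		vlan_ops = &vsi->outer_vlan_ops;
		vlan_ops->del_vlan = "del_vlan";
		vlan_ops->ena_stripping = "ena_outer_stripping";
		vlan_ops->dis_stripping = "dis_outer_stripping";
		vlan_ops->ena_insertion = "ena_outer_insertion";
		vlan_ops->dis_insertion = "dis_outer_insertion";
	} else {
		/* The only inner op that differs between the modes. */
		vlan_ops->del_vlan = "del_vlan";
	}

	/* As in the posted code, this lands on outer ops in DVM, inner otherwise. */
	vlan_ops->ena_rx_filtering = pruning ? "ena_rx_vlan_filtering"
					     : "noop_vlan";
}
```

The trailing ena_rx_filtering assignment still depends on which ops vlan_ops points at after the if/else, same as in the posted version.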

> 
> > +
> > +	if (!test_bit(ICE_FLAG_VF_VLAN_PRUNING, pf->flags))
> > +		vlan_ops->ena_rx_filtering = noop_vlan;
> > +	else
> > +		vlan_ops->ena_rx_filtering =
> > +			ice_vsi_ena_rx_vlan_filtering;
> > +}
> > +
> > +/**
> > + * ice_vf_vsi_enable_port_vlan - Set VSI VLAN ops to support port VLAN
> > + * @vsi: VF's VSI being configured
> > + *
> > + * The function won't create port VLAN, it only allows to create port VLAN
> > + * using VLAN ops on the VF VSI.
> > + */
> > +void ice_vf_vsi_enable_port_vlan(struct ice_vsi *vsi)
> > +{
> > +	if (WARN_ON(!vsi->vf))
> 
> I'd use WARN_ON_ONCE(). Otherwise, it may be possible to flood kernel
> log buffer (-> CPU) from the userspace.

Sure

> 
> > +		return;
> > +
> > +	ice_port_vlan_on(vsi);
> > +}
> > +
> > +/**
> > + * ice_vf_vsi_disable_port_vlan - Clear VSI support for creating port VLAN
> > + * @vsi: VF's VSI being configured
> > + *
> > + * The function should be called after removing port VLAN on VSI
> > + * (using VLAN ops)
> > + */
> > +void ice_vf_vsi_disable_port_vlan(struct ice_vsi *vsi)
> > +{
> > +	if (WARN_ON(!vsi->vf))
> 
> (same)
> 
> > +		return;
> > +
> > +	ice_port_vlan_off(vsi);
> > +}
> 
> [...]
> 
> > +	info->valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_VLAN_VALID |
> > +					   ICE_AQ_VSI_PROP_SW_VALID);
> > +
> > +	ret = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
> > +	if (ret)
> > +		dev_info(ice_hw_to_dev(hw), "update VSI for port VLAN failed, err %d aq_err %s\n",
> 
> (dev_err())
> (+ %pe)
> 
> > +			 ret, ice_aq_str(hw->adminq.sq_last_status));
> > +
> > +	kfree(ctxt);
> > +	return ret;
> > +}
> [...]
> 
> Thanks,
> Olek


* RE: [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats
  2023-04-21 16:32   ` Alexander Lobakin
@ 2023-05-09 12:52     ` Drewek, Wojciech
  2023-05-09 15:14       ` Alexander Lobakin
  0 siblings, 1 reply; 46+ messages in thread
From: Drewek, Wojciech @ 2023-05-09 12:52 UTC (permalink / raw)
  To: Lobakin, Aleksander
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar



> -----Original Message-----
> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
> Sent: Friday, 21 April 2023 18:33
> To: Drewek, Wojciech <wojciech.drewek@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
> Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Subject: Re: [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats
> 
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Date: Mon, 17 Apr 2023 11:34:12 +0200
> 
> > Introduce new ethtool statistic which is 'fdb_cnt'. It
> > provides information about how many bridge fdbs are created on
> > a given netdev.
> 
> [...]
> 
> > @@ -339,6 +340,7 @@ ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
> >  	ice_eswitch_br_flow_delete(pf, fdb_entry->flow);
> >
> >  	kfree(fdb_entry);
> > +	vsi->fdb_cnt--;
> 
> Are FDB operations always serialized within one netdev? Because if it's
> not, this probably needs to be atomic_t.

All the FDB operations are done either from notification context, where they are protected by
rtnl_lock, or are explicitly protected by us (see ice_eswitch_br_fdb_event_work, we use rtnl_lock there).

> 
> >  }
> >
> >  static void
> 
> [...]
> 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > index 8407c7175cf6..d06b2a688323 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> > @@ -64,6 +64,7 @@ static const struct ice_stats ice_gstrings_vsi_stats[] = {
> >  	ICE_VSI_STAT("tx_linearize", tx_linearize),
> >  	ICE_VSI_STAT("tx_busy", tx_busy),
> >  	ICE_VSI_STAT("tx_restart", tx_restart),
> > +	ICE_VSI_STAT("fdb_cnt", fdb_cnt),
> 
> It's confusing to me to see it in the Ethtool stats. They're usually
> counters, ice is not an exception. But this one is not, so it might give
> wrong impression.
> Have you considered alternatives? rtnl (iproute) or maybe even Devlink
> (but I believe the former fits better)? This might be a good candidate
> to become a generic stat, who knows.

I'll do some research on alternatives

> 
> >  };
> >
> >  enum ice_ethtool_test_id {
> 
> Thanks,
> Olek


* Re: [PATCH net-next 10/12] ice: implement static version of ageing
  2023-05-09 10:55     ` Drewek, Wojciech
@ 2023-05-09 14:55       ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-05-09 14:55 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Tue, 9 May 2023 12:55:53 +0200

> Hi Olek
> 
> Sorry for late response, I didn't manage to answer all your comments before my vacation :)
> Will continue this week.

No problems, I had a little vacation as well, so wouldn't reply either
way :D

> 
>> -----Original Message-----
>> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Sent: Friday, 21 April 2023 18:23
>> To: Drewek, Wojciech <wojciech.drewek@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
>> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
>> Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: Re: [PATCH net-next 10/12] ice: implement static version of ageing
>>
>> From: Wojciech Drewek <wojciech.drewek@intel.com>
>> Date: Mon, 17 Apr 2023 11:34:10 +0200
>>
>>> From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
>>>
>>> Remove fdb entries always when ageing time expired.
>>>
>>> Allow user to set ageing time using port object attribute.
>>>
>>> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Thanks,
Olek


* Re: [PATCH net-next 09/12] ice: implement bridge port vlan
  2023-05-09 11:25     ` Drewek, Wojciech
@ 2023-05-09 15:06       ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-05-09 15:06 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Tue, 9 May 2023 13:25:40 +0200

> 
> 
>> -----Original Message-----
>> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Sent: Friday, 21 April 2023 18:35
>> To: Drewek, Wojciech <wojciech.drewek@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
>> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
>> Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: Re: [PATCH net-next 09/12] ice: implement bridge port vlan

[...]

>>> +	/* Setting port vlan on uplink isn't supported by hw */
>>> +	if (port->type == ICE_ESWITCH_BR_UPLINK_PORT)
>>> +		return -EOPNOTSUPP;
>>> +
>>> +	if (port->pvid) {
>>> +		dev_info(dev,
>>
>> dev_err()?
> 
> To me it's not an error, port vlan is already configured

Usually, every user action that leads to an errno instead of 0 (success) is
an error; it's the user who is responsible for not doing such things.
A few more details below, I'm replying bottom-up this time :z

> 
>>
>>> +			 "Port VLAN (vsi=%u, vid=%u) already exists on the port, remove it before adding new one\n",
>>> +			 port->vsi_idx, port->pvid);
>>> +		return -EEXIST;
>>
>> Hmm, isn't -EBUSY more common for such cases?
>>
>> (below as well)
> 
> I don't think so, user is trying to configure something that is already done.

+

>>> @@ -639,14 +697,29 @@ ice_eswitch_br_vlan_create(u16 vid, u16 flags, struct ice_esw_br_port *port)
>>>
>>>  	vlan->vid = vid;
>>>  	vlan->flags = flags;
>>> +	if ((flags & BRIDGE_VLAN_INFO_PVID) &&
>>> +	    (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
>>> +		err = ice_eswitch_br_set_pvid(port, vlan);
>>> +		if (err)
>>> +			goto err_set_pvid;
>>> +	} else if ((flags & BRIDGE_VLAN_INFO_PVID) ||
>>> +		   (flags & BRIDGE_VLAN_INFO_UNTAGGED)) {
>>> +		dev_info(dev, "VLAN push and pop are supported only simultaneously\n");
>>
>> (same for dev_err(), as well as below)
> 
> 
> Again, is this really an error? We just don't support such a case.

Well, "not supported" is usually an error in the kernel. It's like,
"the user is responsible for checking the capabilities before trying to
configure/use something; if they didn't care, then we don't either" :D
The main problem here is as follows:

1. Most distros have "quiet" in the default command line, which limits
   the default output to errors+.
2. User tries to configure something, which is not supported.
3. The driver essentially bails out with -EOPNOTSUPP.
4. The default kernel output says nothing.

It's not an issue for tools like dmesg, since they usually display the
whole log with every loglevel, but it's still not really consistent to
me. Plus, even in such tools, dev_info() will just get lost amidst tons of
other nonsensical output, while dev_err() would be marked bold red.
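
To illustrate the filtering from the list above, here is a minimal userspace sketch. It is not kernel code: console_shows() is a hypothetical helper, and the two console levels are assumptions based on the usual config defaults (7 normally, 4 with "quiet"), which a distro may override:

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel log levels; a lower number means a more severe message. */
enum loglevel {
	LOGLEVEL_EMERG = 0,
	LOGLEVEL_ALERT,
	LOGLEVEL_CRIT,
	LOGLEVEL_ERR,		/* dev_err() */
	LOGLEVEL_WARNING,
	LOGLEVEL_NOTICE,
	LOGLEVEL_INFO,		/* dev_info() */
	LOGLEVEL_DEBUG,
};

/* Assumed defaults; both are configurable at build time. */
#define CONSOLE_LOGLEVEL_DEFAULT 7
#define CONSOLE_LOGLEVEL_QUIET   4

/*
 * Hypothetical helper: a message reaches the console only if its level is
 * numerically below the current console loglevel.
 */
bool console_shows(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

With the quiet level, console_shows(LOGLEVEL_ERR, CONSOLE_LOGLEVEL_QUIET) is true while the same call with LOGLEVEL_INFO is false, which is exactly the "user gets an errno and sees nothing" case.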

>>
>>> +		return ERR_PTR(-EOPNOTSUPP);
>>> +	}
[...]

Thanks,
Olek


* Re: [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats
  2023-05-09 12:52     ` Drewek, Wojciech
@ 2023-05-09 15:14       ` Alexander Lobakin
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Lobakin @ 2023-05-09 15:14 UTC (permalink / raw)
  To: Drewek, Wojciech
  Cc: intel-wired-lan, netdev, Ertman, David M, michal.swiatkowski,
	marcin.szycik, Chmielewski, Pawel, Samudrala, Sridhar

From: Wojciech Drewek <wojciech.drewek@intel.com>
Date: Tue, 9 May 2023 14:52:26 +0200

> 
> 
>> -----Original Message-----
>> From: Lobakin, Aleksander <aleksander.lobakin@intel.com>
>> Sent: Friday, 21 April 2023 18:33
>> To: Drewek, Wojciech <wojciech.drewek@intel.com>
>> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Ertman, David M <david.m.ertman@intel.com>;
>> michal.swiatkowski@linux.intel.com; marcin.szycik@linux.intel.com; Chmielewski, Pawel <pawel.chmielewski@intel.com>;
>> Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Subject: Re: [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats
>>
>> From: Wojciech Drewek <wojciech.drewek@intel.com>
>> Date: Mon, 17 Apr 2023 11:34:12 +0200
>>
>>> Introduce new ethtool statistic which is 'fdb_cnt'. It
>>> provides information about how many bridge fdbs are created on
>>> a given netdev.
>>
>> [...]
>>
>>> @@ -339,6 +340,7 @@ ice_eswitch_br_fdb_entry_delete(struct ice_esw_br *bridge,
>>>  	ice_eswitch_br_flow_delete(pf, fdb_entry->flow);
>>>
>>>  	kfree(fdb_entry);
>>> +	vsi->fdb_cnt--;
>>
>> Are FDB operations always serialized within one netdev? Because if it's
>> not, this probably needs to be atomic_t.
> 
> All the FDB operations are done either from notification context, where they are protected by
> rtnl_lock, or are explicitly protected by us (see ice_eswitch_br_fdb_event_work, we use rtnl_lock there).

BTW, I would replace the reliance on the RTNL lock with our own locks bit by
bit. I would say it was designed more for the kernel core's internal usage,
but then got abused by tons of drivers.
Sure, it's outside of this series' scope, just FYI. This one is fine with
me as long as concurrent accesses from different SMP CPUs can't happen here.

[...]

Thanks,
Olek


end of thread, other threads:[~2023-05-09 15:16 UTC | newest]

Thread overview: 46+ messages
2023-04-17  9:34 [PATCH net-next 00/12] ice: switchdev bridge offload Wojciech Drewek
2023-04-17  9:34 ` [PATCH net-next 01/12] ice: Minor switchdev fixes Wojciech Drewek
2023-04-19 14:35   ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 02/12] ice: Remove exclusion code for RDMA+SRIOV Wojciech Drewek
2023-04-19 14:38   ` Alexander Lobakin
2023-04-25 15:26   ` [Intel-wired-lan] " Michal Schmidt
2023-04-17  9:34 ` [PATCH net-next 03/12] ice: Unset src prune on uplink VSI Wojciech Drewek
2023-04-19 14:49   ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 04/12] ice: Implement basic eswitch bridge setup Wojciech Drewek
2023-04-19 15:23   ` Alexander Lobakin
2023-04-20  9:54     ` Drewek, Wojciech
2023-04-20 10:46       ` Drewek, Wojciech
2023-04-20 16:53         ` Alexander Lobakin
2023-04-20 16:51       ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 05/12] ice: Switchdev FDB events support Wojciech Drewek
2023-04-19 15:38   ` Alexander Lobakin
2023-04-20 11:27     ` Drewek, Wojciech
2023-04-20 16:59       ` Alexander Lobakin
2023-04-21  8:45         ` Drewek, Wojciech
2023-04-17  9:34 ` [PATCH net-next 06/12] ice: Add guard rule when creating FDB in switchdev Wojciech Drewek
2023-04-21 14:22   ` Alexander Lobakin
2023-04-25  9:17     ` Drewek, Wojciech
2023-04-26  9:50       ` Drewek, Wojciech
2023-04-26 15:24         ` Alexander Lobakin
2023-04-27  7:24           ` Drewek, Wojciech
2023-04-17  9:34 ` [PATCH net-next 07/12] ice: Accept LAG netdevs in bridge offloads Wojciech Drewek
2023-04-21 14:40   ` Alexander Lobakin
2023-04-26 11:31     ` Drewek, Wojciech
2023-04-26 15:31       ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 08/12] ice: Add VLAN FDB support in switchdev mode Wojciech Drewek
2023-04-21 15:25   ` Alexander Lobakin
2023-04-27 10:28     ` Drewek, Wojciech
2023-05-08 14:09       ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 09/12] ice: implement bridge port vlan Wojciech Drewek
2023-04-21 16:35   ` Alexander Lobakin
2023-05-09 11:25     ` Drewek, Wojciech
2023-05-09 15:06       ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 10/12] ice: implement static version of ageing Wojciech Drewek
2023-04-21 16:22   ` Alexander Lobakin
2023-05-09 10:55     ` Drewek, Wojciech
2023-05-09 14:55       ` Alexander Lobakin
2023-04-17  9:34 ` [PATCH net-next 11/12] ice: add tracepoints for the switchdev bridge Wojciech Drewek
2023-04-17  9:34 ` [PATCH net-next 12/12] ice: Ethtool fdb_cnt stats Wojciech Drewek
2023-04-21 16:32   ` Alexander Lobakin
2023-05-09 12:52     ` Drewek, Wojciech
2023-05-09 15:14       ` Alexander Lobakin
