All of lore.kernel.org
 help / color / mirror / Atom feed
* [pull request][net-next 00/16] mlx5 updates 2021-06-09
@ 2021-06-10  2:57 Saeed Mahameed
  2021-06-10  2:57 ` [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove Saeed Mahameed
                   ` (16 more replies)
  0 siblings, 17 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:57 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Hi Dave and Jakub,

This series introduces insert/remove header support
for sw steering and switchdev bridge offloads support.

For more information please see tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.

---
The following changes since commit 0d155170d6eebbd6c50e16bc928c31b3f5473025:

  Merge branch 'ipa-mem-1' (2021-06-09 15:59:34 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2021-06-09

for you to fetch changes up to 9724fd5d9c2a0d3686b799ed5ca90cb9378ca4f2:

  net/mlx5: Bridge, add tracepoints (2021-06-09 18:36:12 -0700)

----------------------------------------------------------------
mlx5-updates-2021-06-09

Introduce steering header insert/remove and switchdev bridge offloads

1) From Yevgeny, Steering header insert/remove support

ConnectX supports offloading of various encapsulations and decapsulations
(e.g. VXLAN), which are performed by 'Packet Reformat' action.
Starting with ConnectX-6 DX, a new reformat type is supported - INSERT_HEADER.
This reformat allows inserting an arbitrary size buffer at a selected location
in the packet on RX flows.

The insert/remove header support are needed as a prerequisite for the
bridge offloads vlan pop/push supprt, see below.

2) From Vlad, Support for bridge offloads for switchdev mode

This change implements bridge offloads with VLAN-support that works on top
of mlx5 representors in switchdev mode.

HIGH-LEVEL OVERVIEW

Hardware supported by mlx5 driver doesn't provide dynamic learning or aging
functionality and requires the driver to emulate all switch-like behavior
in software. As such, all packets by default go through miss path, appear
on representor and get to software bridge, if it is the upper device of the
representor. This causes bridge to process packet in software, learn the
MAC address to FDB and send SWITCHDEV_FDB_ADD_TO_DEVICE event to all
subscribers. Upon reception of SWITCHDEV_FDB_ADD_TO_DEVICE notification
mlx5 bridge offloads the FDB to hardware and sends back
SWITCHDEV_FDB_ADD_TO_BRIDGE notification to prevent such entries from being
aged out by kernel bridge. Leaving aging to kernel bridge would result
deletion of offloaded dynamic FDB entries every aging_time period due to
packets being processed by hardware and, consecutively, 'used' timestamp
for FDB entry not being updated. Hardware aging is emulated in driver by
running periodic workqueue task that manually updates the rules according
to their hardware counter:

- If hardware counter has changed since last update, the handler updates
'used' timestamp in kernel bridge dynamic entry by sending
SWITCHDEV_FDB_ADD_TO_BRIDGE notification for the entry.

- If FDB entry wasn't updated for user-controllable aging_time period,
then the FDB entry is unoffloaded from hardware and corresponding
SWITCHDEV_FDB_DEL_TO_BRIDGE notification is sent to kernel bridge.

The mlx5 bridge offload implementation fully supports port VLAN objects,
including PVID (vlan push) and "Egress Untagged" (vlan pop).

SOFTWARE ARCHITECTURE

Mlx5_eswitch is extended with pointer to new mlx5_esw_bridge_offloads
structure which has a linked list of mlx5_esw_bridge objects. Struct
mlx5_esw_bridge is the main switch object in mlx5 that holds all data for
offloaded FDB entries and metadata for bridge ports and their vlans. The
mlx5_esw_bridge object is created when first representor of eswitch vport
is added to bridge and deleted when the last representor is detached from
it. Bridge FDB entries are saved in linked list (to iterate over all FDB
entries in aging workqueue task) and also in hashtable for quick lookup by
MAC+VLAN tuple. Bridge FDB entries are saved in linked list (to iterate
over all FDB entries in aging workqueue task) and in hashtable for quick
lookup by MAC+VLAN tuple. Port metadata is stored in struct
mlx5_esw_bridge_port that is saved in xarray to allow quick lookup by vport
number. Part of the port metadata is the set of port vlans that are
represented by mlx5_esw_bridge_vlan structure. The vlan structure points to
all FDBs on vlan/port via fdb_list linked list.

Simplified diagram of mlx5 bridge objects:

                      +------------------+
                      |  mxl5_eswitch    |
                      |                  |
                      |  br_offloads     |
                      +--------+---------+
                               |
                      +--------v-------------------+
                      |  mlx5_esw_bridge_offloads  |
                      |                            |
                   +-->  bridges                   |
                   |  +-------+--------------------+
                   |          |
                   |          |
                   |      +---v---------------+
                   |      | mlx5_esw_bridge   |
                   |      |                   |
                   |      | vports            |
                   |      |                   |
                   |      | fdb_ht            |
                   |      +---+---------------+
                   |          |
                   |      +---v---------------+
                   +------+ mlx5_esw_bridge   |
                          |                   |
+-------------------------+ vports            |
|                         |                   |
|                         | fdb_ht            +------------------------------------------+
|                         +-------------------+                                          |
|                                                                                        |
|                                                                                        |
| +----------------------+                                 +---------------------------+ |
+-> mlx5_esw_bridge_port |                              +--> mlx5_esw_bridge_fdb_entry <-+
| |                      |    +----------------------+  |  +--+------------------------+ |
| | vlans                +--+-> mlx5_esw_bridge_vlan |  |     |                          |
| |                      |  | |                      |  |  +--v------------------------+ |
| +----------------------+  | | fdb_list             +--+  | mlx5_esw_bridge_fdb_entry <-+
|                           | +-------^--------------+     +--+------------------------+ |
| +----------------------+  |         |                       |                          |
+-> mlx5_esw_bridge_port |  |         +-----------------------+                          |
  |                      |  |                                                            |
  | vlans                |  | -----------------------+                                   |
  |                      |  +-> mlx5_esw_bridge_vlan |                                   |
  +----------------------+    |                      |     +---------------------------+ |
                              | fdb_list             +-----> mlx5_esw_bridge_fdb_entry <-+
                              +-------^--------------+     +--+------------------------+
                                      |                       |
                                      +-----------------------+

HARDWARE REPRESENTATION

In order to adhere to kernel software datapath model bridge offloads must
come after TC and NF FDBs. However, since netfilter offload in mlx5 is
implemented with unmanaged tables, its miss path is not automatically
connected to next priority and requires the code to manually connect with
slow table. To keep bridge offloads encapsulated and not mix it with
eswitch offloads new FDB_TC_MISS priority is created between FDB_FT_OFFLOAD
and FDB_SLOW_PATH which allows bridge offloads to be created without
exposing its internal tables to any other modules since miss path of
managed TC-miss table is automatically wired to next priority.

The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB
namespace. The new priority is between tc-miss and slow path priorities.
Priority consist of two levels: the ingress table that is global per
eswitch and matches incoming packets by src_mac/vid and redirects them to
next level (egress table) that is chosen according to ingress port bridge
membership and matches on dst_mac/vid in order to redirect packet to vport
according to the following diagram:

                +
                |
      +---------v----------+
      |                    |
      |   FDB_TC_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
      +---------v----------+
      |                    |
      |   FDB_FT_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
      +---------v----------+
      |                    |
      |    FDB_TC_MISS     |
      |                    |
      +---------+----------+
                |
+--------------------------------------+
|               |                      |
|        +------+                      |
|        |                             |
| +------v--------+   FDB_BR_OFFLOAD   |
| | INGRESS_TABLE |                    |
| +------+---+----+                    |
|        |   |      match              |
|        |   +---------+               |
|        |             |               |    +-------+
|        |     +-------v-------+ match |    |       |
|        |     | EGRESS_TABLE  +------------> vport |
|        |     +-------+-------+       |    |       |
|        |             |               |    +-------+
|        |    miss     |               |
|        +------+------+               |
|               |                      |
+--------------------------------------+
                |
                |
      +---------v----------+
      |                    |
      |   FDB_SLOW_PATH    |
      |                    |
      +---------+----------+
                |
                v

PATCHES OVERVIEW

1-3 - Miscellaneous refactorings and infrastructure changes.

4 - Mlx5 bridge offload infrastructure and dedicated fs_core
namespace/tables implementation.

5 - FDB entry offload.

6 - Dynamic FDB entry aging.

7-10 - VLAN filtering offload.

11 - Tracepoints for main mlx5 bridge offload events (FDB entry
offload/unoffload, VLAN add/delete, etc.)

--

----------------------------------------------------------------
Vlad Buslov (10):
      net/mlx5: Create TC-miss priority and table
      net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers
      net/mlx5: Bridge, add offload infrastructure
      net/mlx5: Bridge, handle FDB events
      net/mlx5: Bridge, dynamic entry ageing
      net/mlx5: Bridge, implement infrastructure for vlans
      net/mlx5: Bridge, match FDB entry vlan tag
      net/mlx5: Bridge, support pvid and untagged vlan configurations
      net/mlx5: Bridge, filter tagged packets that didn't match tagged fg
      net/mlx5: Bridge, add tracepoints

Yevgeny Kliteynik (6):
      net/mlx5: mlx5_ifc support for header insert/remove
      net/mlx5: DR, Split reformat state to Encap and Decap
      net/mlx5: DR, Allow encap action for RX for supporting devices
      net/mlx5: Added new parameters to reformat context
      net/mlx5: DR, Added support for INSERT_HEADER reformat type
      net/mlx5: DR, Support EMD tag in modify header for STEv1

 .../device_drivers/ethernet/mellanox/mlx5.rst      |   88 ++
 drivers/infiniband/hw/mlx5/fs.c                    |    9 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig    |   10 +
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |    1 +
 .../ethernet/mellanox/mlx5/core/en/rep/bridge.c    |  424 +++++++
 .../ethernet/mellanox/mlx5/core/en/rep/bridge.h    |   21 +
 .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c    |   38 +-
 .../ethernet/mellanox/mlx5/core/en/tc_tun_encap.c  |   17 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |    7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.h   |    6 +-
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.c   | 1299 ++++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.h   |   53 +
 .../ethernet/mellanox/mlx5/core/esw/bridge_priv.h  |   53 +
 .../mlx5/core/esw/diag/bridge_tracepoint.h         |  113 ++
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |    7 +
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |   19 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |   29 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h   |    4 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |   21 +-
 drivers/net/ethernet/mellanox/mlx5/core/fw.c       |    6 +
 .../mellanox/mlx5/core/steering/dr_action.c        |  187 ++-
 .../ethernet/mellanox/mlx5/core/steering/dr_cmd.c  |    7 +-
 .../ethernet/mellanox/mlx5/core/steering/dr_ste.h  |    1 +
 .../mellanox/mlx5/core/steering/dr_ste_v0.c        |    5 +-
 .../mellanox/mlx5/core/steering/dr_ste_v1.c        |  120 +-
 .../mellanox/mlx5/core/steering/dr_types.h         |   22 +-
 .../ethernet/mellanox/mlx5/core/steering/fs_dr.c   |   20 +-
 .../ethernet/mellanox/mlx5/core/steering/mlx5dr.h  |    3 +
 include/linux/mlx5/device.h                        |   10 +
 include/linux/mlx5/fs.h                            |   14 +-
 include/linux/mlx5/mlx5_ifc.h                      |   40 +-
 31 files changed, 2523 insertions(+), 131 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
@ 2021-06-10  2:57 ` Saeed Mahameed
  2021-06-10 20:40   ` patchwork-bot+netdevbpf
  2021-06-10  2:58 ` [net-next 02/16] net/mlx5: DR, Split reformat state to Encap and Decap Saeed Mahameed
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:57 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Yevgeny Kliteynik, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Add support for HCA caps 2 that contains capabilities for the new
insert/remove header actions.

Added the required definitions for supporting the new reformat type:
added packet reformat parameters, reformat anchors and definitions
to allow copy/set into the inserted EMD (Embedded MetaData) tag.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fw.c |  6 +++
 include/linux/mlx5/device.h                  | 10 +++++
 include/linux/mlx5/mlx5_ifc.h                | 40 +++++++++++++++++---
 3 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index 02558ac2ace6..016d26f809a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -148,6 +148,12 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
 	if (err)
 		return err;
 
+	if (MLX5_CAP_GEN(dev, hca_cap_2)) {
+		err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL_2);
+		if (err)
+			return err;
+	}
+
 	if (MLX5_CAP_GEN(dev, eth_net_offloads)) {
 		err = mlx5_core_get_caps(dev, MLX5_CAP_ETHERNET_OFFLOADS);
 		if (err)
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 578c4ccae91c..0025913505ab 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1179,6 +1179,7 @@ enum mlx5_cap_type {
 	MLX5_CAP_VDPA_EMULATION = 0x13,
 	MLX5_CAP_DEV_EVENT = 0x14,
 	MLX5_CAP_IPSEC,
+	MLX5_CAP_GENERAL_2 = 0x20,
 	/* NUM OF CAP Types */
 	MLX5_CAP_NUM
 };
@@ -1220,6 +1221,15 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_GEN_MAX(mdev, cap) \
 	MLX5_GET(cmd_hca_cap, mdev->caps.hca_max[MLX5_CAP_GENERAL], cap)
 
+#define MLX5_CAP_GEN_2(mdev, cap) \
+	MLX5_GET(cmd_hca_cap_2, mdev->caps.hca_cur[MLX5_CAP_GENERAL_2], cap)
+
+#define MLX5_CAP_GEN_2_64(mdev, cap) \
+	MLX5_GET64(cmd_hca_cap_2, mdev->caps.hca_cur[MLX5_CAP_GENERAL_2], cap)
+
+#define MLX5_CAP_GEN_2_MAX(mdev, cap) \
+	MLX5_GET(cmd_hca_cap_2, mdev->caps.hca_max[MLX5_CAP_GENERAL_2], cap)
+
 #define MLX5_CAP_ETH(mdev, cap) \
 	MLX5_GET(per_protocol_networking_offload_caps,\
 		 mdev->caps.hca_cur[MLX5_CAP_ETHERNET_OFFLOADS], cap)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index eb86e80e4643..057db0eaf195 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -435,7 +435,10 @@ struct mlx5_ifc_flow_table_prop_layout_bits {
 
 	u8         reserved_at_40[0x20];
 
-	u8         reserved_at_60[0x18];
+	u8         reserved_at_60[0x2];
+	u8         reformat_insert[0x1];
+	u8         reformat_remove[0x1];
+	u8         reserver_at_64[0x14];
 	u8         log_max_ft_num[0x8];
 
 	u8         reserved_at_80[0x10];
@@ -1312,7 +1315,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         reserved_at_0[0x1f];
 	u8         vhca_resource_manager[0x1];
 
-	u8         reserved_at_20[0x3];
+	u8         hca_cap_2[0x1];
+	u8         reserved_at_21[0x2];
 	u8         event_on_vhca_state_teardown_request[0x1];
 	u8         event_on_vhca_state_in_use[0x1];
 	u8         event_on_vhca_state_active[0x1];
@@ -1732,6 +1736,17 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8	   reserved_at_7c0[0x40];
 };
 
+struct mlx5_ifc_cmd_hca_cap_2_bits {
+	u8	   reserved_at_0[0xa0];
+
+	u8	   max_reformat_insert_size[0x8];
+	u8	   max_reformat_insert_offset[0x8];
+	u8	   max_reformat_remove_size[0x8];
+	u8	   max_reformat_remove_offset[0x8];
+
+	u8	   reserved_at_c0[0x740];
+};
+
 enum mlx5_flow_destination_type {
 	MLX5_FLOW_DESTINATION_TYPE_VPORT        = 0x0,
 	MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE   = 0x1,
@@ -3105,6 +3120,7 @@ struct mlx5_ifc_roce_addr_layout_bits {
 
 union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
+	struct mlx5_ifc_cmd_hca_cap_2_bits cmd_hca_cap_2;
 	struct mlx5_ifc_odp_cap_bits odp_cap;
 	struct mlx5_ifc_atomic_caps_bits atomic_caps;
 	struct mlx5_ifc_roce_cap_bits roce_cap;
@@ -5785,12 +5801,14 @@ struct mlx5_ifc_query_eq_in_bits {
 };
 
 struct mlx5_ifc_packet_reformat_context_in_bits {
-	u8         reserved_at_0[0x5];
-	u8         reformat_type[0x3];
-	u8         reserved_at_8[0xe];
+	u8         reformat_type[0x8];
+	u8         reserved_at_8[0x4];
+	u8         reformat_param_0[0x4];
+	u8         reserved_at_10[0x6];
 	u8         reformat_data_size[0xa];
 
-	u8         reserved_at_20[0x10];
+	u8         reformat_param_1[0x8];
+	u8         reserved_at_28[0x8];
 	u8         reformat_data[2][0x8];
 
 	u8         more_reformat_data[][0x8];
@@ -5830,12 +5848,20 @@ struct mlx5_ifc_alloc_packet_reformat_context_out_bits {
 	u8         reserved_at_60[0x20];
 };
 
+enum {
+	MLX5_REFORMAT_CONTEXT_ANCHOR_MAC_START = 0x1,
+	MLX5_REFORMAT_CONTEXT_ANCHOR_IP_START = 0x7,
+	MLX5_REFORMAT_CONTEXT_ANCHOR_TCP_UDP_START = 0x9,
+};
+
 enum mlx5_reformat_ctx_type {
 	MLX5_REFORMAT_TYPE_L2_TO_VXLAN = 0x0,
 	MLX5_REFORMAT_TYPE_L2_TO_NVGRE = 0x1,
 	MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL = 0x2,
 	MLX5_REFORMAT_TYPE_L3_TUNNEL_TO_L2 = 0x3,
 	MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL = 0x4,
+	MLX5_REFORMAT_TYPE_INSERT_HDR = 0xf,
+	MLX5_REFORMAT_TYPE_REMOVE_HDR = 0x10,
 };
 
 struct mlx5_ifc_alloc_packet_reformat_context_in_bits {
@@ -5956,6 +5982,8 @@ enum {
 	MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM   = 0x59,
 	MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM   = 0x5B,
 	MLX5_ACTION_IN_FIELD_IPSEC_SYNDROME    = 0x5D,
+	MLX5_ACTION_IN_FIELD_OUT_EMD_47_32     = 0x6F,
+	MLX5_ACTION_IN_FIELD_OUT_EMD_31_0      = 0x70,
 };
 
 struct mlx5_ifc_alloc_modify_header_context_out_bits {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 02/16] net/mlx5: DR, Split reformat state to Encap and Decap
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
  2021-06-10  2:57 ` [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 03/16] net/mlx5: DR, Allow encap action for RX for supporting devices Saeed Mahameed
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Yevgeny Kliteynik, Erez Shitrit, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Split single reformat state into two separate states for encap and decap.
This will allow adding actions to the specific domain, such as encap on RX.

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_action.c   | 69 ++++++++++---------
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 949879cf2092..467f2eac6503 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -14,7 +14,8 @@ enum dr_action_domain {
 enum dr_action_valid_state {
 	DR_ACTION_STATE_ERR,
 	DR_ACTION_STATE_NO_ACTION,
-	DR_ACTION_STATE_REFORMAT,
+	DR_ACTION_STATE_ENCAP,
+	DR_ACTION_STATE_DECAP,
 	DR_ACTION_STATE_MODIFY_HDR,
 	DR_ACTION_STATE_MODIFY_VLAN,
 	DR_ACTION_STATE_NON_TERM,
@@ -31,17 +32,17 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_DECAP] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -67,8 +68,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -81,22 +82,22 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_ENCAP] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
@@ -104,15 +105,15 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -125,16 +126,16 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_DECAP] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
@@ -157,8 +158,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
@@ -173,23 +174,23 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_ENCAP] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
@@ -198,8 +199,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_VLAN,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
@@ -207,8 +208,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 03/16] net/mlx5: DR, Allow encap action for RX for supporting devices
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
  2021-06-10  2:57 ` [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove Saeed Mahameed
  2021-06-10  2:58 ` [net-next 02/16] net/mlx5: DR, Split reformat state to Encap and Decap Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 04/16] net/mlx5: Added new parameters to reformat context Saeed Mahameed
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Yevgeny Kliteynik, Erez Shitrit, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Encap actions on RX flow were not supported on older devices.
However, this is no longer the case in devices that support STEv1.
This patch adds support for encap l3/l2 on RX flow for supported
devices: update actions state machine by adding the newely supported
transitions and add the required support in STEv0/1 files.
The new transitions that are supported are:
 - from decap/modify-header/pop-vlan to encap
 - from encap to termination table

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_action.c   | 40 +++++++++++++
 .../mellanox/mlx5/core/steering/dr_ste.h      |  1 +
 .../mellanox/mlx5/core/steering/dr_ste_v0.c   |  1 +
 .../mellanox/mlx5/core/steering/dr_ste_v1.c   | 60 ++++++++++++++-----
 .../mellanox/mlx5/core/steering/dr_types.h    |  5 ++
 5 files changed, 93 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 467f2eac6503..1b7a0e94d432 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -2,6 +2,7 @@
 /* Copyright (c) 2019 Mellanox Technologies. */
 
 #include "dr_types.h"
+#include "dr_ste.h"
 
 enum dr_action_domain {
 	DR_ACTION_DOMAIN_NIC_INGRESS,
@@ -34,6 +35,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -43,15 +46,26 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
+		[DR_ACTION_STATE_ENCAP] = {
+			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_QP]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_ENCAP,
+		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG]		= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -61,6 +75,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -70,6 +86,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -128,6 +146,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
@@ -139,12 +159,23 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+		},
+		[DR_ACTION_STATE_ENCAP] = {
+			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_QP]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -153,6 +184,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -160,6 +193,8 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_TNL_L2_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
@@ -455,6 +490,11 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 			break;
 		case DR_ACTION_TYP_L2_TO_TNL_L2:
 		case DR_ACTION_TYP_L2_TO_TNL_L3:
+			if (rx_rule &&
+			    !(dmn->ste_ctx->actions_caps & DR_STE_CTX_ACTION_CAP_RX_ENCAP)) {
+				mlx5dr_info(dmn, "Device doesn't support Encap on RX\n");
+				goto out_invalid_arg;
+			}
 			attr.reformat_size = action->reformat->reformat_size;
 			attr.reformat_id = action->reformat->reformat_id;
 			break;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h
index 992b591bf0c5..12a8bbbf944b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h
@@ -156,6 +156,7 @@ struct mlx5dr_ste_ctx {
 	u16  (*get_byte_mask)(u8 *hw_ste_p);
 
 	/* Actions */
+	u32 actions_caps;
 	void (*set_actions_rx)(struct mlx5dr_domain *dmn,
 			       u8 *action_type_set,
 			       u8 *hw_ste_arr,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
index 0757a4e8540e..7e26a9e3afc7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
@@ -1893,6 +1893,7 @@ struct mlx5dr_ste_ctx ste_ctx_v0 = {
 	.get_byte_mask			= &dr_ste_v0_get_byte_mask,
 
 	/* Actions */
+	.actions_caps			= DR_STE_CTX_ACTION_CAP_NONE,
 	.set_actions_rx			= &dr_ste_v0_set_actions_rx,
 	.set_actions_tx			= &dr_ste_v0_set_actions_tx,
 	.modify_field_arr_sz		= ARRAY_SIZE(dr_ste_v0_action_modify_field_arr),
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
index 054c2e2b6554..a5807d190698 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
@@ -361,8 +361,8 @@ static void dr_ste_v1_set_reparse(u8 *hw_ste_p)
 	MLX5_SET(ste_match_bwc_v1, hw_ste_p, reparse, 1);
 }
 
-static void dr_ste_v1_set_tx_encap(u8 *hw_ste_p, u8 *d_action,
-				   u32 reformat_id, int size)
+static void dr_ste_v1_set_encap(u8 *hw_ste_p, u8 *d_action,
+				u32 reformat_id, int size)
 {
 	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, action_id,
 		 DR_STE_V1_ACTION_ID_INSERT_POINTER);
@@ -401,11 +401,11 @@ static void dr_ste_v1_set_rx_pop_vlan(u8 *hw_ste_p, u8 *s_action, u8 vlans_num)
 	dr_ste_v1_set_reparse(hw_ste_p);
 }
 
-static void dr_ste_v1_set_tx_encap_l3(u8 *hw_ste_p,
-				      u8 *frst_s_action,
-				      u8 *scnd_d_action,
-				      u32 reformat_id,
-				      int size)
+static void dr_ste_v1_set_encap_l3(u8 *hw_ste_p,
+				   u8 *frst_s_action,
+				   u8 *scnd_d_action,
+				   u32 reformat_id,
+				   int size)
 {
 	/* Remove L2 headers */
 	MLX5_SET(ste_single_action_remove_header_v1, frst_s_action, action_id,
@@ -519,9 +519,9 @@ static void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn,
 			action_sz = DR_STE_ACTION_TRIPLE_SZ;
 			allow_encap = true;
 		}
-		dr_ste_v1_set_tx_encap(last_ste, action,
-				       attr->reformat_id,
-				       attr->reformat_size);
+		dr_ste_v1_set_encap(last_ste, action,
+				    attr->reformat_id,
+				    attr->reformat_size);
 		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
 		action += DR_STE_ACTION_DOUBLE_SZ;
 	} else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) {
@@ -532,10 +532,10 @@ static void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn,
 		action_sz = DR_STE_ACTION_TRIPLE_SZ;
 		d_action = action + DR_STE_ACTION_SINGLE_SZ;
 
-		dr_ste_v1_set_tx_encap_l3(last_ste,
-					  action, d_action,
-					  attr->reformat_id,
-					  attr->reformat_size);
+		dr_ste_v1_set_encap_l3(last_ste,
+				       action, d_action,
+				       attr->reformat_id,
+				       attr->reformat_size);
 		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
 		action += DR_STE_ACTION_TRIPLE_SZ;
 	}
@@ -627,6 +627,37 @@ static void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn,
 		dr_ste_v1_set_counter_id(last_ste, attr->ctr_id);
 	}
 
+	if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L2]) {
+		if (action_sz < DR_STE_ACTION_DOUBLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+		dr_ste_v1_set_encap(last_ste, action,
+				    attr->reformat_id,
+				    attr->reformat_size);
+		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
+		action += DR_STE_ACTION_DOUBLE_SZ;
+		allow_modify_hdr = false;
+	} else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) {
+		u8 *d_action;
+
+		if (action_sz < DR_STE_ACTION_TRIPLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+
+		d_action = action + DR_STE_ACTION_SINGLE_SZ;
+
+		dr_ste_v1_set_encap_l3(last_ste,
+				       action, d_action,
+				       attr->reformat_id,
+				       attr->reformat_size);
+		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
+		allow_modify_hdr = false;
+	}
+
 	dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi);
 	dr_ste_v1_set_hit_addr(last_ste, attr->final_icm_addr, 1);
 }
@@ -1865,6 +1896,7 @@ struct mlx5dr_ste_ctx ste_ctx_v1 = {
 	.set_byte_mask			= &dr_ste_v1_set_byte_mask,
 	.get_byte_mask			= &dr_ste_v1_get_byte_mask,
 	/* Actions */
+	.actions_caps			= DR_STE_CTX_ACTION_CAP_RX_ENCAP,
 	.set_actions_rx			= &dr_ste_v1_set_actions_rx,
 	.set_actions_tx			= &dr_ste_v1_set_actions_tx,
 	.modify_field_arr_sz		= ARRAY_SIZE(dr_ste_v1_action_modify_field_arr),
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index 7600004d79a8..b34018d49326 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -89,6 +89,11 @@ enum {
 	DR_STE_SIZE_REDUCED = DR_STE_SIZE - DR_STE_SIZE_MASK,
 };
 
+enum mlx5dr_ste_ctx_action_cap {
+	DR_STE_CTX_ACTION_CAP_NONE = 0,
+	DR_STE_CTX_ACTION_CAP_RX_ENCAP = 1 << 0,
+};
+
 enum {
 	DR_MODIFY_ACTION_SIZE = 8,
 };
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 04/16] net/mlx5: Added new parameters to reformat context
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 03/16] net/mlx5: DR, Allow encap action for RX for supporting devices Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 05/16] net/mlx5: DR, Added support for INSERT_HEADER reformat type Saeed Mahameed
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Yevgeny Kliteynik, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Adding new reformat context type (INSERT_HEADER) requires adding two new
parameters to reformat context - reformat_param_0 and reformat_param_1.
As defined by HW spec, these parameters have different meaning for
different reformat context type.

The first parameter (reformat_param_0) is not new to HW spec, but it
wasn't used by any of the supported reformats. The second parameter
(reformat_param_1) is new to the HW spec - it was added to allow
supporting INSERT_HEADER.

For NSERT_HEADER, reformat_param_0 indicates the header used to
reference the location of the inserted header, and reformat_param_1
indicates the offset of the inserted header from the reference point
defined by reformat_param_0.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/infiniband/hw/mlx5/fs.c               |  9 ++++-
 .../ethernet/mellanox/mlx5/core/en/tc_tun.c   | 38 +++++++++++++------
 .../mellanox/mlx5/core/en/tc_tun_encap.c      | 17 ++++++---
 .../net/ethernet/mellanox/mlx5/core/fs_cmd.c  | 29 +++++++-------
 .../net/ethernet/mellanox/mlx5/core/fs_cmd.h  |  4 +-
 .../net/ethernet/mellanox/mlx5/core/fs_core.c |  9 ++---
 .../mellanox/mlx5/core/steering/dr_action.c   |  2 +
 .../mellanox/mlx5/core/steering/fs_dr.c       | 17 +++++----
 .../mellanox/mlx5/core/steering/mlx5dr.h      |  2 +
 include/linux/mlx5/fs.h                       | 12 ++++--
 10 files changed, 86 insertions(+), 53 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/fs.c b/drivers/infiniband/hw/mlx5/fs.c
index 2fc6a60c4e77..941adf5cf3d0 100644
--- a/drivers/infiniband/hw/mlx5/fs.c
+++ b/drivers/infiniband/hw/mlx5/fs.c
@@ -2280,6 +2280,7 @@ static int mlx5_ib_flow_action_create_packet_reformat_ctx(
 	u8 ft_type, u8 dv_prt,
 	void *in, size_t len)
 {
+	struct mlx5_pkt_reformat_params reformat_params;
 	enum mlx5_flow_namespace_type namespace;
 	u8 prm_prt;
 	int ret;
@@ -2292,9 +2293,13 @@ static int mlx5_ib_flow_action_create_packet_reformat_ctx(
 	if (ret)
 		return ret;
 
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = prm_prt;
+	reformat_params.size = len;
+	reformat_params.data = in;
 	maction->flow_action_raw.pkt_reformat =
-		mlx5_packet_reformat_alloc(dev->mdev, prm_prt, len,
-					   in, namespace);
+		mlx5_packet_reformat_alloc(dev->mdev, &reformat_params,
+					   namespace);
 	if (IS_ERR(maction->flow_action_raw.pkt_reformat)) {
 		ret = PTR_ERR(maction->flow_action_raw.pkt_reformat);
 		return ret;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
index 172e0474f2e6..8f79f04eccd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
@@ -212,6 +212,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5e_neigh m_neigh = {};
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	int ipv4_encap_size;
@@ -295,9 +296,12 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 		 */
 		goto release_neigh;
 	}
-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv4_encap_size, encap_header,
+
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv4_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
@@ -324,6 +328,7 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	int ipv4_encap_size;
 	char *encap_header;
@@ -396,9 +401,12 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv,
 		 */
 		goto release_neigh;
 	}
-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv4_encap_size, encap_header,
+
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv4_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
@@ -471,6 +479,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5e_neigh m_neigh = {};
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	struct ipv6hdr *ip6h;
@@ -553,9 +562,11 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 		goto release_neigh;
 	}
 
-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv6_encap_size, encap_header,
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv6_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
@@ -582,6 +593,7 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	struct ipv6hdr *ip6h;
 	int ipv6_encap_size;
@@ -654,9 +666,11 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv,
 		goto release_neigh;
 	}
 
-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv6_encap_size, encap_header,
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv6_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index f1fb11680d20..0dfd51d2d178 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -120,6 +120,7 @@ void mlx5e_tc_encap_flows_add(struct mlx5e_priv *priv,
 			      struct list_head *flow_list)
 {
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5_esw_flow_attr *esw_attr;
 	struct mlx5_flow_handle *rule;
 	struct mlx5_flow_attr *attr;
@@ -130,9 +131,12 @@ void mlx5e_tc_encap_flows_add(struct mlx5e_priv *priv,
 	if (e->flags & MLX5_ENCAP_ENTRY_NO_ROUTE)
 		return;
 
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = e->encap_size;
+	reformat_params.data = e->encap_header;
 	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     e->encap_size, e->encap_header,
+						     &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		mlx5_core_warn(priv->mdev, "Failed to offload cached encapsulation header, %lu\n",
@@ -812,6 +816,7 @@ int mlx5e_attach_decap(struct mlx5e_priv *priv,
 {
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	struct mlx5_esw_flow_attr *attr = flow->attr->esw_attr;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5e_tc_flow_parse_attr *parse_attr;
 	struct mlx5e_decap_entry *d;
 	struct mlx5e_decap_key key;
@@ -853,10 +858,12 @@ int mlx5e_attach_decap(struct mlx5e_priv *priv,
 	hash_add_rcu(esw->offloads.decap_tbl, &d->hlist, hash_key);
 	mutex_unlock(&esw->offloads.decap_tbl_lock);
 
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = MLX5_REFORMAT_TYPE_L3_TUNNEL_TO_L2;
+	reformat_params.size = sizeof(parse_attr->eth);
+	reformat_params.data = &parse_attr->eth;
 	d->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     MLX5_REFORMAT_TYPE_L3_TUNNEL_TO_L2,
-						     sizeof(parse_attr->eth),
-						     &parse_attr->eth,
+						     &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(d->pkt_reformat)) {
 		err = PTR_ERR(d->pkt_reformat);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index b7aae8b75760..896a6c3dbdb7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -111,9 +111,7 @@ static int mlx5_cmd_stub_delete_fte(struct mlx5_flow_root_namespace *ns,
 }
 
 static int mlx5_cmd_stub_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns,
-					       int reformat_type,
-					       size_t size,
-					       void *reformat_data,
+					       struct mlx5_pkt_reformat_params *params,
 					       enum mlx5_flow_namespace_type namespace,
 					       struct mlx5_pkt_reformat *pkt_reformat)
 {
@@ -701,9 +699,7 @@ int mlx5_cmd_fc_bulk_query(struct mlx5_core_dev *dev, u32 base_id, int bulk_len,
 }
 
 static int mlx5_cmd_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns,
-					  int reformat_type,
-					  size_t size,
-					  void *reformat_data,
+					  struct mlx5_pkt_reformat_params *params,
 					  enum mlx5_flow_namespace_type namespace,
 					  struct mlx5_pkt_reformat *pkt_reformat)
 {
@@ -721,14 +717,14 @@ static int mlx5_cmd_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns,
 	else
 		max_encap_size = MLX5_CAP_FLOWTABLE(dev, max_encap_header_size);
 
-	if (size > max_encap_size) {
+	if (params->size > max_encap_size) {
 		mlx5_core_warn(dev, "encap size %zd too big, max supported is %d\n",
-			       size, max_encap_size);
+			       params->size, max_encap_size);
 		return -EINVAL;
 	}
 
-	in = kzalloc(MLX5_ST_SZ_BYTES(alloc_packet_reformat_context_in) + size,
-		     GFP_KERNEL);
+	in = kzalloc(MLX5_ST_SZ_BYTES(alloc_packet_reformat_context_in) +
+		     params->size, GFP_KERNEL);
 	if (!in)
 		return -ENOMEM;
 
@@ -737,15 +733,20 @@ static int mlx5_cmd_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns,
 	reformat = MLX5_ADDR_OF(packet_reformat_context_in,
 				packet_reformat_context_in,
 				reformat_data);
-	inlen = reformat - (void *)in  + size;
+	inlen = reformat - (void *)in + params->size;
 
 	MLX5_SET(alloc_packet_reformat_context_in, in, opcode,
 		 MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT);
 	MLX5_SET(packet_reformat_context_in, packet_reformat_context_in,
-		 reformat_data_size, size);
+		 reformat_data_size, params->size);
 	MLX5_SET(packet_reformat_context_in, packet_reformat_context_in,
-		 reformat_type, reformat_type);
-	memcpy(reformat, reformat_data, size);
+		 reformat_type, params->type);
+	MLX5_SET(packet_reformat_context_in, packet_reformat_context_in,
+		 reformat_param_0, params->param_0);
+	MLX5_SET(packet_reformat_context_in, packet_reformat_context_in,
+		 reformat_param_1, params->param_1);
+	if (params->data && params->size)
+		memcpy(reformat, params->data, params->size);
 
 	err = mlx5_cmd_exec(dev, in, inlen, out, sizeof(out));
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
index c2e102ed82ad..5ecd33cdc087 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
@@ -77,9 +77,7 @@ struct mlx5_flow_cmds {
 			      bool disconnect);
 
 	int (*packet_reformat_alloc)(struct mlx5_flow_root_namespace *ns,
-				     int reformat_type,
-				     size_t size,
-				     void *reformat_data,
+				     struct mlx5_pkt_reformat_params *params,
 				     enum mlx5_flow_namespace_type namespace,
 				     struct mlx5_pkt_reformat *pkt_reformat);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 1b7a1cde097c..c0936b4e53a9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -3165,9 +3165,7 @@ void mlx5_modify_header_dealloc(struct mlx5_core_dev *dev,
 EXPORT_SYMBOL(mlx5_modify_header_dealloc);
 
 struct mlx5_pkt_reformat *mlx5_packet_reformat_alloc(struct mlx5_core_dev *dev,
-						     int reformat_type,
-						     size_t size,
-						     void *reformat_data,
+						     struct mlx5_pkt_reformat_params *params,
 						     enum mlx5_flow_namespace_type ns_type)
 {
 	struct mlx5_pkt_reformat *pkt_reformat;
@@ -3183,9 +3181,8 @@ struct mlx5_pkt_reformat *mlx5_packet_reformat_alloc(struct mlx5_core_dev *dev,
 		return ERR_PTR(-ENOMEM);
 
 	pkt_reformat->ns_type = ns_type;
-	pkt_reformat->reformat_type = reformat_type;
-	err = root->cmds->packet_reformat_alloc(root, reformat_type, size,
-						reformat_data, ns_type,
+	pkt_reformat->reformat_type = params->type;
+	err = root->cmds->packet_reformat_alloc(root, params, ns_type,
 						pkt_reformat);
 	if (err) {
 		kfree(pkt_reformat);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 1b7a0e94d432..13fceba11d3f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -937,6 +937,8 @@ struct mlx5dr_action *mlx5dr_action_create_push_vlan(struct mlx5dr_domain *dmn,
 struct mlx5dr_action *
 mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 				     enum mlx5dr_action_reformat_type reformat_type,
+				     u8 reformat_param_0,
+				     u8 reformat_param_1,
 				     size_t data_sz,
 				     void *data)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
index ee0e9d79aaec..d866cd609d0b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -289,7 +289,8 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 			DR_ACTION_REFORMAT_TYP_TNL_L2_TO_L2;
 
 		tmp_action = mlx5dr_action_create_packet_reformat(domain,
-								  decap_type, 0,
+								  decap_type,
+								  0, 0, 0,
 								  NULL);
 		if (!tmp_action) {
 			err = -ENOMEM;
@@ -522,9 +523,7 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 }
 
 static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns,
-					     int reformat_type,
-					     size_t size,
-					     void *reformat_data,
+					     struct mlx5_pkt_reformat_params *params,
 					     enum mlx5_flow_namespace_type namespace,
 					     struct mlx5_pkt_reformat *pkt_reformat)
 {
@@ -532,7 +531,7 @@ static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns
 	struct mlx5dr_action *action;
 	int dr_reformat;
 
-	switch (reformat_type) {
+	switch (params->type) {
 	case MLX5_REFORMAT_TYPE_L2_TO_VXLAN:
 	case MLX5_REFORMAT_TYPE_L2_TO_NVGRE:
 	case MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL:
@@ -546,14 +545,16 @@ static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns
 		break;
 	default:
 		mlx5_core_err(ns->dev, "Packet-reformat not supported(%d)\n",
-			      reformat_type);
+			      params->type);
 		return -EOPNOTSUPP;
 	}
 
 	action = mlx5dr_action_create_packet_reformat(dr_domain,
 						      dr_reformat,
-						      size,
-						      reformat_data);
+						      params->param_0,
+						      params->param_1,
+						      params->size,
+						      params->data);
 	if (!action) {
 		mlx5_core_err(ns->dev, "Failed allocating packet-reformat action\n");
 		return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
index 612b0ac31db2..8d821bbe3309 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
@@ -105,6 +105,8 @@ mlx5dr_action_create_flow_counter(u32 counter_id);
 struct mlx5dr_action *
 mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 				     enum mlx5dr_action_reformat_type reformat_type,
+				     u8 reformat_param_0,
+				     u8 reformat_param_1,
 				     size_t data_sz,
 				     void *data);
 
diff --git a/include/linux/mlx5/fs.h b/include/linux/mlx5/fs.h
index 1f51f4c3b1af..f69f68fba946 100644
--- a/include/linux/mlx5/fs.h
+++ b/include/linux/mlx5/fs.h
@@ -254,10 +254,16 @@ struct mlx5_modify_hdr *mlx5_modify_header_alloc(struct mlx5_core_dev *dev,
 void mlx5_modify_header_dealloc(struct mlx5_core_dev *dev,
 				struct mlx5_modify_hdr *modify_hdr);
 
+struct mlx5_pkt_reformat_params {
+	int type;
+	u8 param_0;
+	u8 param_1;
+	size_t size;
+	void *data;
+};
+
 struct mlx5_pkt_reformat *mlx5_packet_reformat_alloc(struct mlx5_core_dev *dev,
-						     int reformat_type,
-						     size_t size,
-						     void *reformat_data,
+						     struct mlx5_pkt_reformat_params *params,
 						     enum mlx5_flow_namespace_type ns_type);
 void mlx5_packet_reformat_dealloc(struct mlx5_core_dev *dev,
 				  struct mlx5_pkt_reformat *reformat);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 05/16] net/mlx5: DR, Added support for INSERT_HEADER reformat type
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 04/16] net/mlx5: Added new parameters to reformat context Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 06/16] net/mlx5: DR, Support EMD tag in modify header for STEv1 Saeed Mahameed
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Yevgeny Kliteynik, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Add support for INSERT_HEADER packet reformat context type

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/steering/dr_action.c   | 76 ++++++++++++++++---
 .../mellanox/mlx5/core/steering/dr_cmd.c      |  7 +-
 .../mellanox/mlx5/core/steering/dr_ste_v0.c   |  4 +-
 .../mellanox/mlx5/core/steering/dr_ste_v1.c   | 68 ++++++++++++++---
 .../mellanox/mlx5/core/steering/dr_types.h    | 17 ++++-
 .../mellanox/mlx5/core/steering/fs_dr.c       |  3 +
 .../mellanox/mlx5/core/steering/mlx5dr.h      |  1 +
 7 files changed, 150 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 13fceba11d3f..de68c0ec2143 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -37,6 +37,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -48,6 +49,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -66,6 +68,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -77,6 +80,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -88,6 +92,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -102,6 +107,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -116,6 +122,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
@@ -125,6 +132,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -132,6 +140,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -148,6 +157,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
@@ -161,6 +171,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_ENCAP] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -176,6 +187,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -186,6 +198,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP]		= DR_ACTION_STATE_TERM,
@@ -195,6 +208,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_TNL_L3_TO_L2]	= DR_ACTION_STATE_DECAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
@@ -211,6 +225,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
@@ -226,6 +241,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
@@ -236,6 +252,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_CTR]		= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
@@ -245,6 +262,7 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_MODIFY_HDR]	= DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_L2_TO_TNL_L2]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_L2_TO_TNL_L3]	= DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR]	= DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN]	= DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT]		= DR_ACTION_STATE_TERM,
 		},
@@ -271,6 +289,9 @@ dr_action_reformat_to_action_type(enum mlx5dr_action_reformat_type reformat_type
 	case DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L3:
 		*action_type = DR_ACTION_TYP_L2_TO_TNL_L3;
 		break;
+	case DR_ACTION_REFORMAT_TYP_INSERT_HDR:
+		*action_type = DR_ACTION_TYP_INSERT_HDR;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -495,8 +516,8 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 				mlx5dr_info(dmn, "Device doesn't support Encap on RX\n");
 				goto out_invalid_arg;
 			}
-			attr.reformat_size = action->reformat->reformat_size;
-			attr.reformat_id = action->reformat->reformat_id;
+			attr.reformat.size = action->reformat->size;
+			attr.reformat.id = action->reformat->id;
 			break;
 		case DR_ACTION_TYP_VPORT:
 			attr.hit_gvmi = action->vport->caps->vhca_gvmi;
@@ -522,6 +543,12 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 
 			attr.vlans.headers[attr.vlans.count++] = action->push_vlan->vlan_hdr;
 			break;
+		case DR_ACTION_TYP_INSERT_HDR:
+			attr.reformat.size = action->reformat->size;
+			attr.reformat.id = action->reformat->id;
+			attr.reformat.param_0 = action->reformat->param_0;
+			attr.reformat.param_1 = action->reformat->param_1;
+			break;
 		default:
 			goto out_invalid_arg;
 		}
@@ -584,6 +611,7 @@ static unsigned int action_size[DR_ACTION_TYP_MAX] = {
 	[DR_ACTION_TYP_MODIFY_HDR]   = sizeof(struct mlx5dr_action_rewrite),
 	[DR_ACTION_TYP_VPORT]        = sizeof(struct mlx5dr_action_vport),
 	[DR_ACTION_TYP_PUSH_VLAN]    = sizeof(struct mlx5dr_action_push_vlan),
+	[DR_ACTION_TYP_INSERT_HDR]   = sizeof(struct mlx5dr_action_reformat),
 };
 
 static struct mlx5dr_action *
@@ -692,7 +720,7 @@ mlx5dr_action_create_mult_dest_tbl(struct mlx5dr_domain *dmn,
 			if (reformat_action) {
 				reformat_req = true;
 				hw_dests[i].vport.reformat_id =
-					reformat_action->reformat->reformat_id;
+					reformat_action->reformat->id;
 				ref_actions[num_of_ref++] = reformat_action;
 				hw_dests[i].vport.flags |= MLX5_FLOW_DEST_VPORT_REFORMAT_ID;
 			}
@@ -799,11 +827,15 @@ struct mlx5dr_action *mlx5dr_action_create_tag(u32 tag_value)
 static int
 dr_action_verify_reformat_params(enum mlx5dr_action_type reformat_type,
 				 struct mlx5dr_domain *dmn,
+				 u8 reformat_param_0,
+				 u8 reformat_param_1,
 				 size_t data_sz,
 				 void *data)
 {
-	if ((!data && data_sz) || (data && !data_sz) || reformat_type >
-		DR_ACTION_TYP_L2_TO_TNL_L3) {
+	if ((!data && data_sz) || (data && !data_sz) ||
+	    ((reformat_param_0 || reformat_param_1) &&
+	     reformat_type != DR_ACTION_TYP_INSERT_HDR) ||
+	    reformat_type > DR_ACTION_TYP_INSERT_HDR) {
 		mlx5dr_dbg(dmn, "Invalid reformat parameter!\n");
 		goto out_err;
 	}
@@ -835,6 +867,7 @@ dr_action_verify_reformat_params(enum mlx5dr_action_type reformat_type,
 
 static int
 dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
+				 u8 reformat_param_0, u8 reformat_param_1,
 				 size_t data_sz, void *data,
 				 struct mlx5dr_action *action)
 {
@@ -852,13 +885,14 @@ dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
 		else
 			rt = MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL;
 
-		ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev, rt, data_sz, data,
+		ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev, rt, 0, 0,
+						     data_sz, data,
 						     &reformat_id);
 		if (ret)
 			return ret;
 
-		action->reformat->reformat_id = reformat_id;
-		action->reformat->reformat_size = data_sz;
+		action->reformat->id = reformat_id;
+		action->reformat->size = data_sz;
 		return 0;
 	}
 	case DR_ACTION_TYP_TNL_L2_TO_L2:
@@ -900,6 +934,23 @@ dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
 		}
 		return 0;
 	}
+	case DR_ACTION_TYP_INSERT_HDR:
+	{
+		ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev,
+						     MLX5_REFORMAT_TYPE_INSERT_HDR,
+						     reformat_param_0,
+						     reformat_param_1,
+						     data_sz, data,
+						     &reformat_id);
+		if (ret)
+			return ret;
+
+		action->reformat->id = reformat_id;
+		action->reformat->size = data_sz;
+		action->reformat->param_0 = reformat_param_0;
+		action->reformat->param_1 = reformat_param_1;
+		return 0;
+	}
 	default:
 		mlx5dr_info(dmn, "Reformat type is not supported %d\n", action->action_type);
 		return -EINVAL;
@@ -955,7 +1006,9 @@ mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 		goto dec_ref;
 	}
 
-	ret = dr_action_verify_reformat_params(action_type, dmn, data_sz, data);
+	ret = dr_action_verify_reformat_params(action_type, dmn,
+					       reformat_param_0, reformat_param_1,
+					       data_sz, data);
 	if (ret)
 		goto dec_ref;
 
@@ -966,6 +1019,8 @@ mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 	action->reformat->dmn = dmn;
 
 	ret = dr_action_create_reformat_action(dmn,
+					       reformat_param_0,
+					       reformat_param_1,
 					       data_sz,
 					       data,
 					       action);
@@ -1559,8 +1614,9 @@ int mlx5dr_action_destroy(struct mlx5dr_action *action)
 		break;
 	case DR_ACTION_TYP_L2_TO_TNL_L2:
 	case DR_ACTION_TYP_L2_TO_TNL_L3:
+	case DR_ACTION_TYP_INSERT_HDR:
 		mlx5dr_cmd_destroy_reformat_ctx((action->reformat->dmn)->mdev,
-						action->reformat->reformat_id);
+						action->reformat->id);
 		refcount_dec(&action->reformat->dmn->refcount);
 		break;
 	case DR_ACTION_TYP_MODIFY_HDR:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
index 5970cb8fc0c0..6314f50efbd4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
@@ -460,6 +460,8 @@ int mlx5dr_cmd_destroy_flow_table(struct mlx5_core_dev *mdev,
 
 int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
 				   enum mlx5_reformat_ctx_type rt,
+				   u8 reformat_param_0,
+				   u8 reformat_param_1,
 				   size_t reformat_size,
 				   void *reformat_data,
 				   u32 *reformat_id)
@@ -486,8 +488,11 @@ int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
 	pdata = MLX5_ADDR_OF(packet_reformat_context_in, prctx, reformat_data);
 
 	MLX5_SET(packet_reformat_context_in, prctx, reformat_type, rt);
+	MLX5_SET(packet_reformat_context_in, prctx, reformat_param_0, reformat_param_0);
+	MLX5_SET(packet_reformat_context_in, prctx, reformat_param_1, reformat_param_1);
 	MLX5_SET(packet_reformat_context_in, prctx, reformat_data_size, reformat_size);
-	memcpy(pdata, reformat_data, reformat_size);
+	if (reformat_data && reformat_size)
+		memcpy(pdata, reformat_data, reformat_size);
 
 	err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out));
 	if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
index 7e26a9e3afc7..f1950e4968da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
@@ -437,8 +437,8 @@ dr_ste_v0_set_actions_tx(struct mlx5dr_domain *dmn,
 						attr->gvmi);
 
 		dr_ste_v0_set_tx_encap(last_ste,
-				       attr->reformat_id,
-				       attr->reformat_size,
+				       attr->reformat.id,
+				       attr->reformat.size,
 				       action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]);
 		/* Whenever prio_tag_required enabled, we can be sure that the
 		 * previous table (ACL) already push vlan to our packet,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
index a5807d190698..b4dae628e716 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
@@ -374,6 +374,26 @@ static void dr_ste_v1_set_encap(u8 *hw_ste_p, u8 *d_action,
 	dr_ste_v1_set_reparse(hw_ste_p);
 }
 
+static void dr_ste_v1_set_insert_hdr(u8 *hw_ste_p, u8 *d_action,
+				     u32 reformat_id,
+				     u8 anchor, u8 offset,
+				     int size)
+{
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action,
+		 action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER);
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, start_anchor, anchor);
+
+	/* The hardware expects here size and offset in words (2 byte) */
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, size, size / 2);
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, start_offset, offset / 2);
+
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, pointer, reformat_id);
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, attributes,
+		 DR_STE_V1_ACTION_INSERT_PTR_ATTR_NONE);
+
+	dr_ste_v1_set_reparse(hw_ste_p);
+}
+
 static void dr_ste_v1_set_tx_push_vlan(u8 *hw_ste_p, u8 *d_action,
 				       u32 vlan_hdr)
 {
@@ -520,8 +540,8 @@ static void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn,
 			allow_encap = true;
 		}
 		dr_ste_v1_set_encap(last_ste, action,
-				    attr->reformat_id,
-				    attr->reformat_size);
+				    attr->reformat.id,
+				    attr->reformat.size);
 		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
 		action += DR_STE_ACTION_DOUBLE_SZ;
 	} else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) {
@@ -534,10 +554,23 @@ static void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn,
 
 		dr_ste_v1_set_encap_l3(last_ste,
 				       action, d_action,
-				       attr->reformat_id,
-				       attr->reformat_size);
+				       attr->reformat.id,
+				       attr->reformat.size);
 		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
 		action += DR_STE_ACTION_TRIPLE_SZ;
+	} else if (action_type_set[DR_ACTION_TYP_INSERT_HDR]) {
+		if (!allow_encap || action_sz < DR_STE_ACTION_DOUBLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+		dr_ste_v1_set_insert_hdr(last_ste, action,
+					 attr->reformat.id,
+					 attr->reformat.param_0,
+					 attr->reformat.param_1,
+					 attr->reformat.size);
+		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
+		action += DR_STE_ACTION_DOUBLE_SZ;
 	}
 
 	dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi);
@@ -616,7 +649,9 @@ static void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn,
 	}
 
 	if (action_type_set[DR_ACTION_TYP_CTR]) {
-		/* Counter action set after decap to exclude decaped header */
+		/* Counter action set after decap and before insert_hdr
+		 * to exclude decaped / encaped header respectively.
+		 */
 		if (!allow_ctr) {
 			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
 			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
@@ -634,8 +669,8 @@ static void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn,
 			action_sz = DR_STE_ACTION_TRIPLE_SZ;
 		}
 		dr_ste_v1_set_encap(last_ste, action,
-				    attr->reformat_id,
-				    attr->reformat_size);
+				    attr->reformat.id,
+				    attr->reformat.size);
 		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
 		action += DR_STE_ACTION_DOUBLE_SZ;
 		allow_modify_hdr = false;
@@ -652,10 +687,25 @@ static void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn,
 
 		dr_ste_v1_set_encap_l3(last_ste,
 				       action, d_action,
-				       attr->reformat_id,
-				       attr->reformat_size);
+				       attr->reformat.id,
+				       attr->reformat.size);
 		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
 		allow_modify_hdr = false;
+	} else if (action_type_set[DR_ACTION_TYP_INSERT_HDR]) {
+		/* Modify header, decap, and encap must use different STEs */
+		if (!allow_modify_hdr || action_sz < DR_STE_ACTION_DOUBLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+		dr_ste_v1_set_insert_hdr(last_ste, action,
+					 attr->reformat.id,
+					 attr->reformat.param_0,
+					 attr->reformat.param_1,
+					 attr->reformat.size);
+		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
+		action += DR_STE_ACTION_DOUBLE_SZ;
+		allow_modify_hdr = false;
 	}
 
 	dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
index b34018d49326..60b8c04e165e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h
@@ -123,6 +123,7 @@ enum mlx5dr_action_type {
 	DR_ACTION_TYP_VPORT,
 	DR_ACTION_TYP_POP_VLAN,
 	DR_ACTION_TYP_PUSH_VLAN,
+	DR_ACTION_TYP_INSERT_HDR,
 	DR_ACTION_TYP_MAX,
 };
 
@@ -266,8 +267,12 @@ struct mlx5dr_ste_actions_attr {
 	u32	ctr_id;
 	u16	gvmi;
 	u16	hit_gvmi;
-	u32	reformat_id;
-	u32	reformat_size;
+	struct {
+		u32	id;
+		u32	size;
+		u8	param_0;
+		u8	param_1;
+	} reformat;
 	struct {
 		int	count;
 		u32	headers[MLX5DR_MAX_VLANS];
@@ -908,8 +913,10 @@ struct mlx5dr_action_rewrite {
 
 struct mlx5dr_action_reformat {
 	struct mlx5dr_domain *dmn;
-	u32 reformat_id;
-	u32 reformat_size;
+	u32 id;
+	u32 size;
+	u8 param_0;
+	u8 param_1;
 };
 
 struct mlx5dr_action_dest_tbl {
@@ -1147,6 +1154,8 @@ int mlx5dr_cmd_query_flow_table(struct mlx5_core_dev *dev,
 				struct mlx5dr_cmd_query_flow_table_details *output);
 int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
 				   enum mlx5_reformat_ctx_type rt,
+				   u8 reformat_param_0,
+				   u8 reformat_param_1,
 				   size_t reformat_size,
 				   void *reformat_data,
 				   u32 *reformat_id);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
index d866cd609d0b..00b4c753cae2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -543,6 +543,9 @@ static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns
 	case MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL:
 		dr_reformat = DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L3;
 		break;
+	case MLX5_REFORMAT_TYPE_INSERT_HDR:
+		dr_reformat = DR_ACTION_REFORMAT_TYP_INSERT_HDR;
+		break;
 	default:
 		mlx5_core_err(ns->dev, "Packet-reformat not supported(%d)\n",
 			      params->type);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
index 8d821bbe3309..0e2b73731117 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
@@ -26,6 +26,7 @@ enum mlx5dr_action_reformat_type {
 	DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L2,
 	DR_ACTION_REFORMAT_TYP_TNL_L3_TO_L2,
 	DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L3,
+	DR_ACTION_REFORMAT_TYP_INSERT_HDR,
 };
 
 struct mlx5dr_match_parameters {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 06/16] net/mlx5: DR, Support EMD tag in modify header for STEv1
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 05/16] net/mlx5: DR, Added support for INSERT_HEADER reformat type Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 07/16] net/mlx5: Create TC-miss priority and table Saeed Mahameed
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Yevgeny Kliteynik, Saeed Mahameed

From: Yevgeny Kliteynik <kliteyn@nvidia.com>

Add support for EMD tag in modify header set/copy actions
on device that supports STEv1.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c  | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
index b4dae628e716..42668de01abc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
@@ -116,6 +116,8 @@ enum {
 	DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_3	= 0x4f,
 	DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_0		= 0x5e,
 	DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_1		= 0x5f,
+	DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_0		= 0x6f,
+	DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_1		= 0x70,
 	DR_STE_V1_ACTION_MDFY_FLD_METADATA_2_CQE	= 0x7b,
 	DR_STE_V1_ACTION_MDFY_FLD_GNRL_PURPOSE		= 0x7c,
 	DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2		= 0x8c,
@@ -246,6 +248,12 @@ static const struct mlx5dr_ste_action_modify_field dr_ste_v1_action_modify_field
 	[MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = {
 		.hw_field = DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_2, .start = 0, .end = 15,
 	},
+	[MLX5_ACTION_IN_FIELD_OUT_EMD_31_0] = {
+		.hw_field = DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_1, .start = 0, .end = 31,
+	},
+	[MLX5_ACTION_IN_FIELD_OUT_EMD_47_32] = {
+		.hw_field = DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_0, .start = 0, .end = 15,
+	},
 };
 
 static void dr_ste_v1_set_entry_type(u8 *hw_ste_p, u8 entry_type)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 07/16] net/mlx5: Create TC-miss priority and table
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 06/16] net/mlx5: DR, Support EMD tag in modify header for STEv1 Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 08/16] net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers Saeed Mahameed
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

In order to adhere to kernel software datapath model bridge offloads must
come after TC and NF FDBs. Following patches in this series add new FDB
priority for bridge after FDB_FT_OFFLOAD. However, since netfilter offload
is implemented with unmanaged tables, its miss path is not automatically
connected to next priority and requires the code to manually connect with
slow table. To keep bridge offloads encapsulated and not mix it with
eswitch offloads, create a new FDB_TC_MISS priority between FDB_FT_OFFLOAD
and FDB_SLOW_PATH:

          +
          |
+---------v----------+
|                    |
|   FDB_TC_OFFLOAD   |
|                    |
+---------+----------+
          |
          |
          |
+---------v----------+
|                    |
|   FDB_FT_OFFLOAD   |
|                    |
+---------+----------+
          |
          |
          |
+---------v----------+
|                    |
|    FDB_TC_MISS     |
|                    |
+---------+----------+
          |
          |
          |
+---------v----------+
|                    |
|   FDB_SLOW_PATH    |
|                    |
+---------+----------+
          |
          v

Initialize the new priority with single default empty managed table and use
the table as TC/NF miss patch instead of slow table. This approach allows
bridge offloads to be created as new FDB namespace priority between
FDB_TC_MISS and FDB_SLOW_PATH without exposing its internal tables to any
other modules since miss path of managed TC-miss table is automatically
wired to next priority.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |  1 +
 .../mellanox/mlx5/core/eswitch_offloads.c     | 19 ++++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/fs_core.c |  6 ++++++
 include/linux/mlx5/fs.h                       |  1 +
 4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 64ccb2bc0b58..55404eabff39 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -196,6 +196,7 @@ struct mlx5_eswitch_fdb {
 
 		struct offloads_fdb {
 			struct mlx5_flow_namespace *ns;
+			struct mlx5_flow_table *tc_miss_table;
 			struct mlx5_flow_table *slow_fdb;
 			struct mlx5_flow_group *send_to_vport_grp;
 			struct mlx5_flow_group *send_to_vport_meta_grp;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index d18a28a6e9a6..7579f3402776 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1634,7 +1634,21 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw)
 	}
 	esw->fdb_table.offloads.slow_fdb = fdb;
 
-	err = esw_chains_create(esw, fdb);
+	/* Create empty TC-miss managed table. This allows plugging in following
+	 * priorities without directly exposing their level 0 table to
+	 * eswitch_offloads and passing it as miss_fdb to following call to
+	 * esw_chains_create().
+	 */
+	memset(&ft_attr, 0, sizeof(ft_attr));
+	ft_attr.prio = FDB_TC_MISS;
+	esw->fdb_table.offloads.tc_miss_table = mlx5_create_flow_table(root_ns, &ft_attr);
+	if (IS_ERR(esw->fdb_table.offloads.tc_miss_table)) {
+		err = PTR_ERR(esw->fdb_table.offloads.tc_miss_table);
+		esw_warn(dev, "Failed to create TC miss FDB Table err %d\n", err);
+		goto tc_miss_table_err;
+	}
+
+	err = esw_chains_create(esw, esw->fdb_table.offloads.tc_miss_table);
 	if (err) {
 		esw_warn(dev, "Failed to open fdb chains err(%d)\n", err);
 		goto fdb_chains_err;
@@ -1779,6 +1793,8 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw)
 send_vport_err:
 	esw_chains_destroy(esw, esw_chains(esw));
 fdb_chains_err:
+	mlx5_destroy_flow_table(esw->fdb_table.offloads.tc_miss_table);
+tc_miss_table_err:
 	mlx5_destroy_flow_table(esw->fdb_table.offloads.slow_fdb);
 slow_fdb_err:
 	/* Holds true only as long as DMFS is the default */
@@ -1806,6 +1822,7 @@ static void esw_destroy_offloads_fdb_tables(struct mlx5_eswitch *esw)
 
 	esw_chains_destroy(esw, esw_chains(esw));
 
+	mlx5_destroy_flow_table(esw->fdb_table.offloads.tc_miss_table);
 	mlx5_destroy_flow_table(esw->fdb_table.offloads.slow_fdb);
 	/* Holds true only as long as DMFS is the default */
 	mlx5_flow_namespace_set_mode(esw->fdb_table.offloads.ns,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index c0936b4e53a9..fc70c4ed8469 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2780,6 +2780,12 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering)
 	if (err)
 		goto out_err;
 
+	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_TC_MISS, 1);
+	if (IS_ERR(maj_prio)) {
+		err = PTR_ERR(maj_prio);
+		goto out_err;
+	}
+
 	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_SLOW_PATH, 1);
 	if (IS_ERR(maj_prio)) {
 		err = PTR_ERR(maj_prio);
diff --git a/include/linux/mlx5/fs.h b/include/linux/mlx5/fs.h
index f69f68fba946..271f2f4d6b60 100644
--- a/include/linux/mlx5/fs.h
+++ b/include/linux/mlx5/fs.h
@@ -87,6 +87,7 @@ enum {
 	FDB_BYPASS_PATH,
 	FDB_TC_OFFLOAD,
 	FDB_FT_OFFLOAD,
+	FDB_TC_MISS,
 	FDB_SLOW_PATH,
 	FDB_PER_VPORT,
 };
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 08/16] net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 07/16] net/mlx5: Create TC-miss priority and table Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 09/16] net/mlx5: Bridge, add offload infrastructure Saeed Mahameed
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Change the helper to functions to accept constant pointer to struct
net_device. This is necessary for following patches in series that pass
mlx5e_eswitch_rep() as a callback to kernel bridge infrastructure code.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.h | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 34eb1118670f..40db54412041 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -536,13 +536,13 @@ static const struct net_device_ops mlx5e_netdev_ops_rep = {
 	.ndo_change_carrier      = mlx5e_rep_change_carrier,
 };
 
-bool mlx5e_eswitch_uplink_rep(struct net_device *netdev)
+bool mlx5e_eswitch_uplink_rep(const struct net_device *netdev)
 {
 	return netdev->netdev_ops == &mlx5e_netdev_ops &&
 	       mlx5e_is_uplink_rep(netdev_priv(netdev));
 }
 
-bool mlx5e_eswitch_vf_rep(struct net_device *netdev)
+bool mlx5e_eswitch_vf_rep(const struct net_device *netdev)
 {
 	return netdev->netdev_ops == &mlx5e_netdev_ops_rep;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
index 22585015c7a7..47a2dfb7792a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
@@ -231,9 +231,9 @@ void mlx5e_remove_sqs_fwd_rules(struct mlx5e_priv *priv);
 
 void mlx5e_rep_queue_neigh_stats_work(struct mlx5e_priv *priv);
 
-bool mlx5e_eswitch_vf_rep(struct net_device *netdev);
-bool mlx5e_eswitch_uplink_rep(struct net_device *netdev);
-static inline bool mlx5e_eswitch_rep(struct net_device *netdev)
+bool mlx5e_eswitch_vf_rep(const struct net_device *netdev);
+bool mlx5e_eswitch_uplink_rep(const struct net_device *netdev);
+static inline bool mlx5e_eswitch_rep(const struct net_device *netdev)
 {
 	return mlx5e_eswitch_vf_rep(netdev) ||
 	       mlx5e_eswitch_uplink_rep(netdev);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 09/16] net/mlx5: Bridge, add offload infrastructure
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 08/16] net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 10/16] net/mlx5: Bridge, handle FDB events Saeed Mahameed
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Create new files bridge.{c|h} in en/rep directory that implement bridge
interaction with representor netdevices and handle required
events/notifications, bridge.{c|h} in esw directory that implement all
necessary eswitch offloading infrastructure and works on vport/eswitch
level. Provide new kconfig MLX5_BRIDGE which is automatically selected when
both kernel bridge and mlx5 eswitch configs are enabled.

Provide basic infrastructure for bridge offloads:

- struct mlx5_esw_bridge_offloads - per-eswitch bridge offload structure
that encapsulates generic bridge-offloads data (notifier blocks, ingress
flow table/group, etc.) that is created/deleted on enable/disable eswitch
offloads.

- struct mlx5_esw_bridge - per-bridge structure that encapsulates
per-bridge data (reference counter, FDB, egress flow table/group, etc.)
that is created when first eswitch represetor is attached to new bridge and
deleted when last representor is removed from the bridge as a result of
NETDEV_CHANGEUPPER event.

The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB
namespace. The new priority is between tc-miss and slow path priorities.
Priority consist of two levels: the ingress table that is global per
eswitch and matches incoming packets by src_mac/vid and redirects them to
next level (egress table) that is chosen according to ingress port bridge
membership and matches on dst_mac/vid in order to redirect packet to vport
according to the following diagram:

                +
                |
      +---------v----------+
      |                    |
      |   FDB_TC_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
      +---------v----------+
      |                    |
      |   FDB_FT_OFFLOAD   |
      |                    |
      +---------+----------+
                |
                |
      +---------v----------+
      |                    |
      |    FDB_TC_MISS     |
      |                    |
      +---------+----------+
                |
+--------------------------------------+
|               |                      |
|        +------+                      |
|        |                             |
| +------v--------+   FDB_BR_OFFLOAD   |
| | INGRESS_TABLE |                    |
| +------+---+----+                    |
|        |   |      match              |
|        |   +---------+               |
|        |             |               |    +-------+
|        |     +-------v-------+ match |    |       |
|        |     | EGRESS_TABLE  +------------> vport |
|        |     +-------+-------+       |    |       |
|        |             |               |    +-------+
|        |    miss     |               |
|        +------+------+               |
|               |                      |
+--------------------------------------+
                |
                |
      +---------v----------+
      |                    |
      |   FDB_SLOW_PATH    |
      |                    |
      +---------+----------+
                |
                v

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/Kconfig   |  10 +
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   1 +
 .../mellanox/mlx5/core/en/rep/bridge.c        | 108 ++++++
 .../mellanox/mlx5/core/en/rep/bridge.h        |  21 ++
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |   3 +
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 354 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |  30 ++
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |   6 +
 .../net/ethernet/mellanox/mlx5/core/fs_core.c |   6 +
 include/linux/mlx5/fs.h                       |   1 +
 10 files changed, 540 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index 461a43f338e6..d62f90aedade 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -79,6 +79,16 @@ config MLX5_ESWITCH
 	        Legacy SRIOV mode (L2 mac vlan steering based).
 	        Switchdev mode (eswitch offloads).
 
+config MLX5_BRIDGE
+	bool
+	depends on MLX5_ESWITCH && BRIDGE
+	default y
+	help
+	  mlx5 ConnectX offloads support for Ethernet Bridging (BRIDGE).
+	  Enable adding representors of mlx5 uplink and VF ports to Bridge and
+	  offloading rules for traffic between such ports. Supports VLANs (trunk and
+	  access modes).
+
 config MLX5_CLS_ACT
 	bool "MLX5 TC classifier action support"
 	depends on MLX5_ESWITCH && NET_CLS_ACT
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 8dbdf1aef00f..b5072a3a2585 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -56,6 +56,7 @@ mlx5_core-$(CONFIG_MLX5_ESWITCH)   += esw/acl/helper.o \
 				      esw/acl/ingress_lgcy.o esw/acl/ingress_ofld.o \
 				      esw/devlink_port.o esw/vporttbl.o
 mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += esw/sample.o
+mlx5_core-$(CONFIG_MLX5_BRIDGE)    += esw/bridge.o en/rep/bridge.o
 
 mlx5_core-$(CONFIG_MLX5_MPFS)      += lib/mpfs.o
 mlx5_core-$(CONFIG_VXLAN)          += lib/vxlan.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
new file mode 100644
index 000000000000..de7a68488a9d
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#include <linux/netdevice.h>
+#include <net/netevent.h>
+#include <net/switchdev.h>
+#include "bridge.h"
+#include "esw/bridge.h"
+#include "en_rep.h"
+
+static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb,
+								    struct mlx5_esw_bridge_offloads,
+								    netdev_nb);
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct netdev_notifier_changeupper_info *info = ptr;
+	struct netlink_ext_ack *extack;
+	struct mlx5e_rep_priv *rpriv;
+	struct mlx5_eswitch *esw;
+	struct mlx5_vport *vport;
+	struct net_device *upper;
+	struct mlx5e_priv *priv;
+	u16 vport_num;
+
+	if (!mlx5e_eswitch_rep(dev))
+		return 0;
+
+	upper = info->upper_dev;
+	if (!netif_is_bridge_master(upper))
+		return 0;
+
+	esw = br_offloads->esw;
+	priv = netdev_priv(dev);
+	if (esw != priv->mdev->priv.eswitch)
+		return 0;
+
+	rpriv = priv->ppriv;
+	vport_num = rpriv->rep->vport;
+	vport = mlx5_eswitch_get_vport(esw, vport_num);
+	if (IS_ERR(vport))
+		return PTR_ERR(vport);
+
+	extack = netdev_notifier_info_to_extack(&info->info);
+
+	return info->linking ?
+		mlx5_esw_bridge_vport_link(upper->ifindex, br_offloads, vport, extack) :
+		mlx5_esw_bridge_vport_unlink(upper->ifindex, br_offloads, vport, extack);
+}
+
+static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
+						unsigned long event, void *ptr)
+{
+	int err = 0;
+
+	switch (event) {
+	case NETDEV_PRECHANGEUPPER:
+		break;
+
+	case NETDEV_CHANGEUPPER:
+		err = mlx5_esw_bridge_port_changeupper(nb, ptr);
+		break;
+	}
+
+	return notifier_from_errno(err);
+}
+
+void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads;
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5_eswitch *esw =
+		mdev->priv.eswitch;
+	int err;
+
+	rtnl_lock();
+	br_offloads = mlx5_esw_bridge_init(esw);
+	rtnl_unlock();
+	if (IS_ERR(br_offloads)) {
+		esw_warn(mdev, "Failed to init esw bridge (err=%ld)\n", PTR_ERR(br_offloads));
+		return;
+	}
+
+	br_offloads->netdev_nb.notifier_call = mlx5_esw_bridge_switchdev_port_event;
+	err = register_netdevice_notifier(&br_offloads->netdev_nb);
+	if (err) {
+		esw_warn(mdev, "Failed to register bridge offloads netdevice notifier (err=%d)\n",
+			 err);
+		mlx5_esw_bridge_cleanup(esw);
+	}
+}
+
+void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads;
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct mlx5_eswitch *esw =
+		mdev->priv.eswitch;
+
+	br_offloads = esw->br_offloads;
+	if (!br_offloads)
+		return;
+
+	unregister_netdevice_notifier(&br_offloads->netdev_nb);
+	rtnl_lock();
+	mlx5_esw_bridge_cleanup(esw);
+	rtnl_unlock();
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h
new file mode 100644
index 000000000000..fbeb64242831
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#ifndef __MLX5_EN_REP_BRIDGE__
+#define __MLX5_EN_REP_BRIDGE__
+
+#include "en.h"
+
+#if IS_ENABLED(CONFIG_MLX5_BRIDGE)
+
+void mlx5e_rep_bridge_init(struct mlx5e_priv *priv);
+void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv);
+
+#else /* CONFIG_MLX5_BRIDGE */
+
+static inline void mlx5e_rep_bridge_init(struct mlx5e_priv *priv) {}
+static inline void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv) {}
+
+#endif /* CONFIG_MLX5_BRIDGE */
+
+#endif /* __MLX5_EN_REP_BRIDGE__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 40db54412041..8290e0086178 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -45,6 +45,7 @@
 #include "en_tc.h"
 #include "en/rep/tc.h"
 #include "en/rep/neigh.h"
+#include "en/rep/bridge.h"
 #include "en/devlink.h"
 #include "fs_core.h"
 #include "lib/mlx5.h"
@@ -981,6 +982,7 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv)
 	mlx5e_dcbnl_initialize(priv);
 	mlx5e_dcbnl_init_app(priv);
 	mlx5e_rep_neigh_init(rpriv);
+	mlx5e_rep_bridge_init(priv);
 
 	netdev->wanted_features |= NETIF_F_HW_TC;
 
@@ -1002,6 +1004,7 @@ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv)
 	netif_device_detach(priv->netdev);
 	rtnl_unlock();
 
+	mlx5e_rep_bridge_cleanup(priv);
 	mlx5e_rep_neigh_cleanup(rpriv);
 	mlx5e_dcbnl_delete_app(priv);
 	mlx5_notifier_unregister(mdev, &priv->events_nb);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
new file mode 100644
index 000000000000..b503562f97d0
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -0,0 +1,354 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#include <linux/netdevice.h>
+#include <linux/list.h>
+#include <net/switchdev.h>
+#include "bridge.h"
+#include "eswitch.h"
+#include "fs_core.h"
+
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE 64000
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM 0
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE - 1)
+
+#define MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE 64000
+#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM 0
+#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE - 1)
+
+enum {
+	MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE,
+	MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE,
+};
+
+struct mlx5_esw_bridge {
+	int ifindex;
+	int refcnt;
+	struct list_head list;
+
+	struct mlx5_flow_table *egress_ft;
+	struct mlx5_flow_group *egress_mac_fg;
+};
+
+static struct mlx5_flow_table *
+mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw)
+{
+	struct mlx5_flow_table_attr ft_attr = {};
+	struct mlx5_core_dev *dev = esw->dev;
+	struct mlx5_flow_namespace *ns;
+	struct mlx5_flow_table *fdb;
+
+	ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB);
+	if (!ns) {
+		esw_warn(dev, "Failed to get FDB namespace\n");
+		return ERR_PTR(-ENOENT);
+	}
+
+	ft_attr.max_fte = max_fte;
+	ft_attr.level = level;
+	ft_attr.prio = FDB_BR_OFFLOAD;
+	fdb = mlx5_create_flow_table(ns, &ft_attr);
+	if (IS_ERR(fdb))
+		esw_warn(dev, "Failed to create bridge FDB Table (err=%ld)\n", PTR_ERR(fdb));
+
+	return fdb;
+}
+
+static struct mlx5_flow_group *
+mlx5_esw_bridge_ingress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *ingress_ft)
+{
+	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+	struct mlx5_flow_group *fg;
+	u32 *in, *match;
+
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return ERR_PTR(-ENOMEM);
+
+	MLX5_SET(create_flow_group_in, in, match_criteria_enable,
+		 MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2);
+	match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria);
+
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_47_16);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_15_0);
+
+	MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0,
+		 mlx5_eswitch_get_vport_metadata_mask());
+
+	MLX5_SET(create_flow_group_in, in, start_flow_index,
+		 MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM);
+	MLX5_SET(create_flow_group_in, in, end_flow_index,
+		 MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO);
+
+	fg = mlx5_create_flow_group(ingress_ft, in);
+	if (IS_ERR(fg))
+		esw_warn(esw->dev,
+			 "Failed to create bridge ingress table MAC flow group (err=%ld)\n",
+			 PTR_ERR(fg));
+
+	kvfree(in);
+	return fg;
+}
+
+static struct mlx5_flow_group *
+mlx5_esw_bridge_egress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *egress_ft)
+{
+	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+	struct mlx5_flow_group *fg;
+	u32 *in, *match;
+
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return ERR_PTR(-ENOMEM);
+
+	MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS);
+	match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria);
+
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_47_16);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_15_0);
+
+	MLX5_SET(create_flow_group_in, in, start_flow_index,
+		 MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM);
+	MLX5_SET(create_flow_group_in, in, end_flow_index,
+		 MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO);
+
+	fg = mlx5_create_flow_group(egress_ft, in);
+	if (IS_ERR(fg))
+		esw_warn(esw->dev,
+			 "Failed to create bridge egress table MAC flow group (err=%ld)\n",
+			 PTR_ERR(fg));
+	kvfree(in);
+	return fg;
+}
+
+static int
+mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_flow_table *ingress_ft;
+	struct mlx5_flow_group *mac_fg;
+	int err;
+
+	ingress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE,
+						  MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE,
+						  br_offloads->esw);
+	if (IS_ERR(ingress_ft))
+		return PTR_ERR(ingress_ft);
+
+	mac_fg = mlx5_esw_bridge_ingress_mac_fg_create(br_offloads->esw, ingress_ft);
+	if (IS_ERR(mac_fg)) {
+		err = PTR_ERR(mac_fg);
+		goto err_mac_fg;
+	}
+
+	br_offloads->ingress_ft = ingress_ft;
+	br_offloads->ingress_mac_fg = mac_fg;
+	return 0;
+
+err_mac_fg:
+	mlx5_destroy_flow_table(ingress_ft);
+	return err;
+}
+
+static void
+mlx5_esw_bridge_ingress_table_cleanup(struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	mlx5_destroy_flow_group(br_offloads->ingress_mac_fg);
+	br_offloads->ingress_mac_fg = NULL;
+	mlx5_destroy_flow_table(br_offloads->ingress_ft);
+	br_offloads->ingress_ft = NULL;
+}
+
+static int
+mlx5_esw_bridge_egress_table_init(struct mlx5_esw_bridge_offloads *br_offloads,
+				  struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_flow_table *egress_ft;
+	struct mlx5_flow_group *mac_fg;
+	int err;
+
+	egress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE,
+						 MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE,
+						 br_offloads->esw);
+	if (IS_ERR(egress_ft))
+		return PTR_ERR(egress_ft);
+
+	mac_fg = mlx5_esw_bridge_egress_mac_fg_create(br_offloads->esw, egress_ft);
+	if (IS_ERR(mac_fg)) {
+		err = PTR_ERR(mac_fg);
+		goto err_mac_fg;
+	}
+
+	bridge->egress_ft = egress_ft;
+	bridge->egress_mac_fg = mac_fg;
+	return 0;
+
+err_mac_fg:
+	mlx5_destroy_flow_table(egress_ft);
+	return err;
+}
+
+static void
+mlx5_esw_bridge_egress_table_cleanup(struct mlx5_esw_bridge *bridge)
+{
+	mlx5_destroy_flow_group(bridge->egress_mac_fg);
+	mlx5_destroy_flow_table(bridge->egress_ft);
+}
+
+static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
+						      struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge *bridge;
+	int err;
+
+	bridge = kvzalloc(sizeof(*bridge), GFP_KERNEL);
+	if (!bridge)
+		return ERR_PTR(-ENOMEM);
+
+	err = mlx5_esw_bridge_egress_table_init(br_offloads, bridge);
+	if (err)
+		goto err_egress_tbl;
+
+	bridge->ifindex = ifindex;
+	bridge->refcnt = 1;
+	list_add(&bridge->list, &br_offloads->bridges);
+
+	return bridge;
+
+err_egress_tbl:
+	kvfree(bridge);
+	return ERR_PTR(err);
+}
+
+static void mlx5_esw_bridge_get(struct mlx5_esw_bridge *bridge)
+{
+	bridge->refcnt++;
+}
+
+static void mlx5_esw_bridge_put(struct mlx5_esw_bridge_offloads *br_offloads,
+				struct mlx5_esw_bridge *bridge)
+{
+	if (--bridge->refcnt)
+		return;
+
+	mlx5_esw_bridge_egress_table_cleanup(bridge);
+	list_del(&bridge->list);
+	kvfree(bridge);
+
+	if (list_empty(&br_offloads->bridges))
+		mlx5_esw_bridge_ingress_table_cleanup(br_offloads);
+}
+
+static struct mlx5_esw_bridge *
+mlx5_esw_bridge_lookup(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge *bridge;
+
+	ASSERT_RTNL();
+
+	list_for_each_entry(bridge, &br_offloads->bridges, list) {
+		if (bridge->ifindex == ifindex) {
+			mlx5_esw_bridge_get(bridge);
+			return bridge;
+		}
+	}
+
+	if (!br_offloads->ingress_ft) {
+		int err = mlx5_esw_bridge_ingress_table_init(br_offloads);
+
+		if (err)
+			return ERR_PTR(err);
+	}
+
+	bridge = mlx5_esw_bridge_create(ifindex, br_offloads);
+	if (IS_ERR(bridge) && list_empty(&br_offloads->bridges))
+		mlx5_esw_bridge_ingress_table_cleanup(br_offloads);
+	return bridge;
+}
+
+static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge *bridge,
+				      struct mlx5_vport *vport)
+{
+	vport->bridge = bridge;
+	return 0;
+}
+
+static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_offloads,
+					 struct mlx5_vport *vport)
+{
+	mlx5_esw_bridge_put(br_offloads, vport->bridge);
+	vport->bridge = NULL;
+	return 0;
+}
+
+int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+{
+	struct mlx5_esw_bridge *bridge;
+
+	WARN_ON(vport->bridge);
+
+	bridge = mlx5_esw_bridge_lookup(ifindex, br_offloads);
+	if (IS_ERR(bridge)) {
+		NL_SET_ERR_MSG_MOD(extack, "Error checking for existing bridge with same ifindex");
+		return PTR_ERR(bridge);
+	}
+
+	return mlx5_esw_bridge_vport_init(bridge, vport);
+}
+
+int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+{
+	if (!vport->bridge) {
+		NL_SET_ERR_MSG_MOD(extack, "Port is not attached to any bridge");
+		return -EINVAL;
+	}
+	if (vport->bridge->ifindex != ifindex) {
+		NL_SET_ERR_MSG_MOD(extack, "Port is attached to another bridge");
+		return -EINVAL;
+	}
+
+	return mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+}
+
+static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_eswitch *esw = br_offloads->esw;
+	struct mlx5_vport *vport;
+	unsigned long i;
+
+	mlx5_esw_for_each_vport(esw, i, vport)
+		if (vport->bridge)
+			mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+
+	WARN_ONCE(!list_empty(&br_offloads->bridges),
+		  "Cleaning up bridge offloads while still having bridges attached\n");
+}
+
+struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads;
+
+	br_offloads = kvzalloc(sizeof(*br_offloads), GFP_KERNEL);
+	if (!br_offloads)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&br_offloads->bridges);
+	br_offloads->esw = esw;
+	esw->br_offloads = br_offloads;
+
+	return br_offloads;
+}
+
+void mlx5_esw_bridge_cleanup(struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = esw->br_offloads;
+
+	if (!br_offloads)
+		return;
+
+	mlx5_esw_bridge_flush(br_offloads);
+
+	esw->br_offloads = NULL;
+	kvfree(br_offloads);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
new file mode 100644
index 000000000000..319b6f1db0ba
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#ifndef __MLX5_ESW_BRIDGE_H__
+#define __MLX5_ESW_BRIDGE_H__
+
+#include <linux/notifier.h>
+#include <linux/list.h>
+#include "eswitch.h"
+
+struct mlx5_flow_table;
+struct mlx5_flow_group;
+
+struct mlx5_esw_bridge_offloads {
+	struct mlx5_eswitch *esw;
+	struct list_head bridges;
+	struct notifier_block netdev_nb;
+
+	struct mlx5_flow_table *ingress_ft;
+	struct mlx5_flow_group *ingress_mac_fg;
+};
+
+struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw);
+void mlx5_esw_bridge_cleanup(struct mlx5_eswitch *esw);
+int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+
+#endif /* __MLX5_ESW_BRIDGE_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 55404eabff39..48cac5bf606d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -150,6 +150,8 @@ enum mlx5_eswitch_vport_event {
 	MLX5_VPORT_PROMISC_CHANGE = BIT(3),
 };
 
+struct mlx5_esw_bridge;
+
 struct mlx5_vport {
 	struct mlx5_core_dev    *dev;
 	struct hlist_head       uc_list[MLX5_L2_ADDR_HASH_SIZE];
@@ -178,6 +180,7 @@ struct mlx5_vport {
 	enum mlx5_eswitch_vport_event enabled_events;
 	int index;
 	struct devlink_port *dl_port;
+	struct mlx5_esw_bridge *bridge;
 };
 
 struct mlx5_esw_indir_table;
@@ -271,6 +274,8 @@ enum {
 	MLX5_ESWITCH_REG_C1_LOOPBACK_ENABLED = BIT(1),
 };
 
+struct mlx5_esw_bridge_offloads;
+
 struct mlx5_eswitch {
 	struct mlx5_core_dev    *dev;
 	struct mlx5_nb          nb;
@@ -300,6 +305,7 @@ struct mlx5_eswitch {
 		u32             root_tsar_id;
 	} qos;
 
+	struct mlx5_esw_bridge_offloads *br_offloads;
 	struct mlx5_esw_offload offloads;
 	int                     mode;
 	u16                     manager_vport;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index fc70c4ed8469..fc37ac9eab12 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2786,6 +2786,12 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering)
 		goto out_err;
 	}
 
+	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_BR_OFFLOAD, 2);
+	if (IS_ERR(maj_prio)) {
+		err = PTR_ERR(maj_prio);
+		goto out_err;
+	}
+
 	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_SLOW_PATH, 1);
 	if (IS_ERR(maj_prio)) {
 		err = PTR_ERR(maj_prio);
diff --git a/include/linux/mlx5/fs.h b/include/linux/mlx5/fs.h
index 271f2f4d6b60..77746f7e35b8 100644
--- a/include/linux/mlx5/fs.h
+++ b/include/linux/mlx5/fs.h
@@ -88,6 +88,7 @@ enum {
 	FDB_TC_OFFLOAD,
 	FDB_FT_OFFLOAD,
 	FDB_TC_MISS,
+	FDB_BR_OFFLOAD,
 	FDB_SLOW_PATH,
 	FDB_PER_VPORT,
 };
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 10/16] net/mlx5: Bridge, handle FDB events
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 09/16] net/mlx5: Bridge, add offload infrastructure Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 11/16] net/mlx5: Bridge, dynamic entry ageing Saeed Mahameed
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Hardware supported by mlx5 driver doesn't provide learning and requires the
driver to emulate all switch-like behavior in software. As such, all
packets by default go through miss path, appear on representor and get to
software bridge, if it is the upper device of the representor. This causes
bridge to process packet in software, learn the MAC address to FDB and send
SWITCHDEV_FDB_ADD_TO_DEVICE event to all subscribers.

In order to offload FDB entries in mlx5, register switchdev notifier
callback and implement support for both 'added_by_user' and dynamic FDB
entry SWITCHDEV_FDB_ADD_TO_DEVICE events asynchronously using new
mlx5_esw_bridge_offloads->wq ordered workqueue. In workqueue callback
offload the ingress rule (matching FDB entry MAC as packet source MAC) and
egress table rule (matching FDB entry MAC as destination MAC). For ingress
table rule also match source vport to ensure that only traffic coming from
expected bridge port is matched by offloaded rule. Save all the relevant
FDB entry data in struct mlx5_esw_bridge_fdb_entry instance and insert the
instance in new mlx5_esw_bridge->fdb_list list (for traversing all entries
by software ageing implementation in following patch) and in new
mlx5_esw_bridge->fdb_ht hash table for fast retrieval. Notify the bridge
that FDB entry has been offloaded by sending SWITCHDEV_FDB_OFFLOADED
notification.

Delete FDB entry on reception of SWITCHDEV_FDB_DEL_TO_DEVICE event.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst |  15 ++
 .../mellanox/mlx5/core/en/rep/bridge.c        | 150 ++++++++++-
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 254 +++++++++++++++++-
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |   9 +
 4 files changed, 424 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index 936a10f1942c..ea32136b30e7 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -12,6 +12,7 @@ Contents
 - `Enabling the driver and kconfig options`_
 - `Devlink info`_
 - `Devlink parameters`_
+- `Bridge offload`_
 - `mlx5 subfunction`_
 - `mlx5 function attributes`_
 - `Devlink health reporters`_
@@ -217,6 +218,20 @@ users try to enable them.
 
     $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
 
+Bridge offload
+==============
+The mlx5 driver implements support for offloading bridge rules when in switchdev
+mode. Linux bridge FDBs are automatically offloaded when mlx5 switchdev
+representor is attached to bridge.
+
+- Change device to switchdev mode::
+
+    $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
+
+- Attach mlx5 switchdev representor 'enp8s0f0' to bridge netdev 'bridge1'::
+
+    $ ip link set enp8s0f0 master bridge1
+
 mlx5 subfunction
 ================
 mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index de7a68488a9d..b34e9cb686e3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -8,6 +8,13 @@
 #include "esw/bridge.h"
 #include "en_rep.h"
 
+struct mlx5_bridge_switchdev_fdb_work {
+	struct work_struct work;
+	struct switchdev_notifier_fdb_info fdb_info;
+	struct net_device *dev;
+	bool add;
+};
+
 static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb,
@@ -65,6 +72,124 @@ static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
 	return notifier_from_errno(err);
 }
 
+static void
+mlx5_esw_bridge_cleanup_switchdev_fdb_work(struct mlx5_bridge_switchdev_fdb_work *fdb_work)
+{
+	dev_put(fdb_work->dev);
+	kfree(fdb_work->fdb_info.addr);
+	kfree(fdb_work);
+}
+
+static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work)
+{
+	struct mlx5_bridge_switchdev_fdb_work *fdb_work =
+		container_of(work, struct mlx5_bridge_switchdev_fdb_work, work);
+	struct switchdev_notifier_fdb_info *fdb_info =
+		&fdb_work->fdb_info;
+	struct net_device *dev = fdb_work->dev;
+	struct mlx5e_rep_priv *rpriv;
+	struct mlx5_eswitch *esw;
+	struct mlx5_vport *vport;
+	struct mlx5e_priv *priv;
+	u16 vport_num;
+
+	rtnl_lock();
+
+	priv = netdev_priv(dev);
+	rpriv = priv->ppriv;
+	vport_num = rpriv->rep->vport;
+	esw = priv->mdev->priv.eswitch;
+	vport = mlx5_eswitch_get_vport(esw, vport_num);
+	if (IS_ERR(vport))
+		goto out;
+
+	if (fdb_work->add)
+		mlx5_esw_bridge_fdb_create(dev, esw, vport, fdb_info);
+	else
+		mlx5_esw_bridge_fdb_remove(dev, esw, vport, fdb_info);
+
+out:
+	rtnl_unlock();
+	mlx5_esw_bridge_cleanup_switchdev_fdb_work(fdb_work);
+}
+
+static struct mlx5_bridge_switchdev_fdb_work *
+mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add,
+					struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct mlx5_bridge_switchdev_fdb_work *work;
+	u8 *addr;
+
+	work = kzalloc(sizeof(*work), GFP_ATOMIC);
+	if (!work)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_WORK(&work->work, mlx5_esw_bridge_switchdev_fdb_event_work);
+	memcpy(&work->fdb_info, fdb_info, sizeof(work->fdb_info));
+
+	addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
+	if (!addr) {
+		kfree(work);
+		return ERR_PTR(-ENOMEM);
+	}
+	ether_addr_copy(addr, fdb_info->addr);
+	work->fdb_info.addr = addr;
+
+	dev_hold(dev);
+	work->dev = dev;
+	work->add = add;
+	return work;
+}
+
+static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
+					   unsigned long event, void *ptr)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb,
+								    struct mlx5_esw_bridge_offloads,
+								    nb);
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	struct switchdev_notifier_fdb_info *fdb_info;
+	struct mlx5_bridge_switchdev_fdb_work *work;
+	struct switchdev_notifier_info *info = ptr;
+	struct net_device *upper;
+	struct mlx5e_priv *priv;
+
+	if (!mlx5e_eswitch_rep(dev))
+		return NOTIFY_DONE;
+	priv = netdev_priv(dev);
+	if (priv->mdev->priv.eswitch != br_offloads->esw)
+		return NOTIFY_DONE;
+
+	upper = netdev_master_upper_dev_get_rcu(dev);
+	if (!upper)
+		return NOTIFY_DONE;
+	if (!netif_is_bridge_master(upper))
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		fdb_info = container_of(info,
+					struct switchdev_notifier_fdb_info,
+					info);
+
+		work = mlx5_esw_bridge_init_switchdev_fdb_work(dev,
+							       event == SWITCHDEV_FDB_ADD_TO_DEVICE,
+							       fdb_info);
+		if (IS_ERR(work)) {
+			WARN_ONCE(1, "Failed to init switchdev work, err=%ld",
+				  PTR_ERR(work));
+			return notifier_from_errno(PTR_ERR(work));
+		}
+
+		queue_work(br_offloads->wq, &work->work);
+		break;
+	default:
+		break;
+	}
+	return NOTIFY_DONE;
+}
+
 void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads;
@@ -81,13 +206,34 @@ void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
 		return;
 	}
 
+	br_offloads->wq = alloc_ordered_workqueue("mlx5_bridge_wq", 0);
+	if (!br_offloads->wq) {
+		esw_warn(mdev, "Failed to allocate bridge offloads workqueue\n");
+		goto err_alloc_wq;
+	}
+
+	br_offloads->nb.notifier_call = mlx5_esw_bridge_switchdev_event;
+	err = register_switchdev_notifier(&br_offloads->nb);
+	if (err) {
+		esw_warn(mdev, "Failed to register switchdev notifier (err=%d)\n", err);
+		goto err_register_swdev;
+	}
+
 	br_offloads->netdev_nb.notifier_call = mlx5_esw_bridge_switchdev_port_event;
 	err = register_netdevice_notifier(&br_offloads->netdev_nb);
 	if (err) {
 		esw_warn(mdev, "Failed to register bridge offloads netdevice notifier (err=%d)\n",
 			 err);
-		mlx5_esw_bridge_cleanup(esw);
+		goto err_register_netdev;
 	}
+	return;
+
+err_register_netdev:
+	unregister_switchdev_notifier(&br_offloads->nb);
+err_register_swdev:
+	destroy_workqueue(br_offloads->wq);
+err_alloc_wq:
+	mlx5_esw_bridge_cleanup(esw);
 }
 
 void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv)
@@ -102,6 +248,8 @@ void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv)
 		return;
 
 	unregister_netdevice_notifier(&br_offloads->netdev_nb);
+	unregister_switchdev_notifier(&br_offloads->nb);
+	destroy_workqueue(br_offloads->wq);
 	rtnl_lock();
 	mlx5_esw_bridge_cleanup(esw);
 	rtnl_unlock();
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index b503562f97d0..6dd47891189c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -3,6 +3,7 @@
 
 #include <linux/netdevice.h>
 #include <linux/list.h>
+#include <linux/rhashtable.h>
 #include <net/switchdev.h>
 #include "bridge.h"
 #include "eswitch.h"
@@ -21,15 +22,53 @@ enum {
 	MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE,
 };
 
+struct mlx5_esw_bridge_fdb_key {
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+};
+
+struct mlx5_esw_bridge_fdb_entry {
+	struct mlx5_esw_bridge_fdb_key key;
+	struct rhash_head ht_node;
+	struct list_head list;
+	u16 vport_num;
+
+	struct mlx5_flow_handle *ingress_handle;
+	struct mlx5_flow_handle *egress_handle;
+};
+
+static const struct rhashtable_params fdb_ht_params = {
+	.key_offset = offsetof(struct mlx5_esw_bridge_fdb_entry, key),
+	.key_len = sizeof(struct mlx5_esw_bridge_fdb_key),
+	.head_offset = offsetof(struct mlx5_esw_bridge_fdb_entry, ht_node),
+	.automatic_shrinking = true,
+};
+
 struct mlx5_esw_bridge {
 	int ifindex;
 	int refcnt;
 	struct list_head list;
+	struct mlx5_esw_bridge_offloads *br_offloads;
+
+	struct list_head fdb_list;
+	struct rhashtable fdb_ht;
 
 	struct mlx5_flow_table *egress_ft;
 	struct mlx5_flow_group *egress_mac_fg;
 };
 
+static void
+mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *addr, u16 vid,
+				   unsigned long val)
+{
+	struct switchdev_notifier_fdb_info send_info;
+
+	send_info.addr = addr;
+	send_info.vid = vid;
+	send_info.offloaded = true;
+	call_switchdev_notifiers(val, dev, &send_info.info, NULL);
+}
+
 static struct mlx5_flow_table *
 mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw)
 {
@@ -128,6 +167,9 @@ mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 	struct mlx5_flow_group *mac_fg;
 	int err;
 
+	if (!mlx5_eswitch_vport_match_metadata_enabled(br_offloads->esw))
+		return -EOPNOTSUPP;
+
 	ingress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE,
 						  MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE,
 						  br_offloads->esw);
@@ -194,6 +236,82 @@ mlx5_esw_bridge_egress_table_cleanup(struct mlx5_esw_bridge *bridge)
 	mlx5_destroy_flow_table(bridge->egress_ft);
 }
 
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, u16 vid,
+				    struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads;
+	struct mlx5_flow_destination dest = {
+		.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
+		.ft = bridge->egress_ft,
+	};
+	struct mlx5_flow_act flow_act = {
+		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
+		.flags = FLOW_ACT_NO_APPEND,
+	};
+	struct mlx5_flow_spec *rule_spec;
+	struct mlx5_flow_handle *handle;
+	u8 *smac_v, *smac_c;
+
+	rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL);
+	if (!rule_spec)
+		return ERR_PTR(-ENOMEM);
+
+	rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2;
+
+	smac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value,
+			      outer_headers.smac_47_16);
+	ether_addr_copy(smac_v, addr);
+	smac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria,
+			      outer_headers.smac_47_16);
+	eth_broadcast_addr(smac_c);
+
+	MLX5_SET(fte_match_param, rule_spec->match_criteria,
+		 misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_mask());
+	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
+		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
+
+	handle = mlx5_add_flow_rules(br_offloads->ingress_ft, rule_spec, &flow_act, &dest, 1);
+
+	kvfree(rule_spec);
+	return handle;
+}
+
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr, u16 vid,
+				   struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_flow_destination dest = {
+		.type = MLX5_FLOW_DESTINATION_TYPE_VPORT,
+		.vport.num = vport_num,
+	};
+	struct mlx5_flow_act flow_act = {
+		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
+		.flags = FLOW_ACT_NO_APPEND,
+	};
+	struct mlx5_flow_spec *rule_spec;
+	struct mlx5_flow_handle *handle;
+	u8 *dmac_v, *dmac_c;
+
+	rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL);
+	if (!rule_spec)
+		return ERR_PTR(-ENOMEM);
+
+	rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+
+	dmac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value,
+			      outer_headers.dmac_47_16);
+	ether_addr_copy(dmac_v, addr);
+	dmac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria,
+			      outer_headers.dmac_47_16);
+	eth_broadcast_addr(dmac_c);
+
+	handle = mlx5_add_flow_rules(bridge->egress_ft, rule_spec, &flow_act, &dest, 1);
+
+	kvfree(rule_spec);
+	return handle;
+}
+
 static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
 						      struct mlx5_esw_bridge_offloads *br_offloads)
 {
@@ -204,16 +322,24 @@ static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
 	if (!bridge)
 		return ERR_PTR(-ENOMEM);
 
+	bridge->br_offloads = br_offloads;
 	err = mlx5_esw_bridge_egress_table_init(br_offloads, bridge);
 	if (err)
 		goto err_egress_tbl;
 
+	err = rhashtable_init(&bridge->fdb_ht, &fdb_ht_params);
+	if (err)
+		goto err_fdb_ht;
+
+	INIT_LIST_HEAD(&bridge->fdb_list);
 	bridge->ifindex = ifindex;
 	bridge->refcnt = 1;
 	list_add(&bridge->list, &br_offloads->bridges);
 
 	return bridge;
 
+err_fdb_ht:
+	mlx5_esw_bridge_egress_table_cleanup(bridge);
 err_egress_tbl:
 	kvfree(bridge);
 	return ERR_PTR(err);
@@ -232,6 +358,7 @@ static void mlx5_esw_bridge_put(struct mlx5_esw_bridge_offloads *br_offloads,
 
 	mlx5_esw_bridge_egress_table_cleanup(bridge);
 	list_del(&bridge->list);
+	rhashtable_destroy(&bridge->fdb_ht);
 	kvfree(bridge);
 
 	if (list_empty(&br_offloads->bridges))
@@ -265,6 +392,69 @@ mlx5_esw_bridge_lookup(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads
 	return bridge;
 }
 
+static void
+mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
+				  struct mlx5_esw_bridge *bridge)
+{
+	rhashtable_remove_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
+	mlx5_del_flow_rules(entry->egress_handle);
+	mlx5_del_flow_rules(entry->ingress_handle);
+	list_del(&entry->list);
+	kvfree(entry);
+}
+
+static struct mlx5_esw_bridge_fdb_entry *
+mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsigned char *addr,
+			       u16 vid, struct mlx5_eswitch *esw, struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	struct mlx5_flow_handle *handle;
+	int err;
+
+	entry = kvzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return ERR_PTR(-ENOMEM);
+
+	ether_addr_copy(entry->key.addr, addr);
+	entry->key.vid = vid;
+	entry->vport_num = vport_num;
+
+	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vid, bridge);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		esw_warn(esw->dev, "Failed to create ingress flow(vport=%u,err=%d)\n",
+			 vport_num, err);
+		goto err_ingress_flow_create;
+	}
+	entry->ingress_handle = handle;
+
+	handle = mlx5_esw_bridge_egress_flow_create(vport_num, addr, vid, bridge);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		esw_warn(esw->dev, "Failed to create egress flow(vport=%u,err=%d)\n",
+			 vport_num, err);
+		goto err_egress_flow_create;
+	}
+	entry->egress_handle = handle;
+
+	err = rhashtable_insert_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
+	if (err) {
+		esw_warn(esw->dev, "Failed to insert FDB flow(vport=%u,err=%d)\n", vport_num, err);
+		goto err_ht_init;
+	}
+
+	list_add(&entry->list, &bridge->fdb_list);
+	return entry;
+
+err_ht_init:
+	mlx5_del_flow_rules(entry->egress_handle);
+err_egress_flow_create:
+	mlx5_del_flow_rules(entry->ingress_handle);
+err_ingress_flow_create:
+	kvfree(entry);
+	return ERR_PTR(err);
+}
+
 static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge *bridge,
 				      struct mlx5_vport *vport)
 {
@@ -275,7 +465,14 @@ static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge *bridge,
 static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_offloads,
 					 struct mlx5_vport *vport)
 {
-	mlx5_esw_bridge_put(br_offloads, vport->bridge);
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list)
+		if (entry->vport_num == vport->vport)
+			mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+
+	mlx5_esw_bridge_put(br_offloads, bridge);
 	vport->bridge = NULL;
 	return 0;
 }
@@ -299,11 +496,13 @@ int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_
 int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
 				 struct mlx5_vport *vport, struct netlink_ext_ack *extack)
 {
-	if (!vport->bridge) {
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+
+	if (!bridge) {
 		NL_SET_ERR_MSG_MOD(extack, "Port is not attached to any bridge");
 		return -EINVAL;
 	}
-	if (vport->bridge->ifindex != ifindex) {
+	if (bridge->ifindex != ifindex) {
 		NL_SET_ERR_MSG_MOD(extack, "Port is attached to another bridge");
 		return -EINVAL;
 	}
@@ -311,6 +510,55 @@ int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *b
 	return mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
 }
 
+void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	u16 vport_num = vport->vport;
+
+	if (!bridge) {
+		esw_info(esw->dev, "Vport is not assigned to bridge (vport=%u)\n", vport_num);
+		return;
+	}
+
+	entry = mlx5_esw_bridge_fdb_entry_init(dev, vport_num, fdb_info->addr, fdb_info->vid,
+					       esw, bridge);
+	if (IS_ERR(entry))
+		return;
+
+	mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+					   SWITCHDEV_FDB_OFFLOADED);
+}
+
+void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	struct mlx5_esw_bridge_fdb_key key;
+	u16 vport_num = vport->vport;
+
+	if (!bridge) {
+		esw_warn(esw->dev, "Vport is not assigned to bridge (vport=%u)\n", vport_num);
+		return;
+	}
+
+	ether_addr_copy(key.addr, fdb_info->addr);
+	key.vid = fdb_info->vid;
+	entry = rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
+	if (!entry) {
+		esw_warn(esw->dev,
+			 "FDB entry with specified key not found (MAC=%pM,vid=%u,vport=%u)\n",
+			 key.addr, key.vid, vport_num);
+		return;
+	}
+
+	mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+}
+
 static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_eswitch *esw = br_offloads->esw;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index 319b6f1db0ba..cec118c0b733 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -10,11 +10,14 @@
 
 struct mlx5_flow_table;
 struct mlx5_flow_group;
+struct workqueue_struct;
 
 struct mlx5_esw_bridge_offloads {
 	struct mlx5_eswitch *esw;
 	struct list_head bridges;
 	struct notifier_block netdev_nb;
+	struct notifier_block nb;
+	struct workqueue_struct *wq;
 
 	struct mlx5_flow_table *ingress_ft;
 	struct mlx5_flow_group *ingress_mac_fg;
@@ -26,5 +29,11 @@ int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_
 			       struct mlx5_vport *vport, struct netlink_ext_ack *extack);
 int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
 				 struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info);
+void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info);
 
 #endif /* __MLX5_ESW_BRIDGE_H__ */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 11/16] net/mlx5: Bridge, dynamic entry ageing
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 10/16] net/mlx5: Bridge, handle FDB events Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 12/16] net/mlx5: Bridge, implement infrastructure for vlans Saeed Mahameed
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Dynamic FDB entries require capability to age out unused entries. Such
entries are either aged out by kernel software bridge implementation or by
hardware switch that offloaded them (and notified the kernel to mark them
as SWITCHDEV_FDB_ADD_TO_BRIDGE). Leaving ageing to kernel bridge would
result it deleting offloaded dynamic FDB entries every ageing_time period
due to packets being processed by hardware and, consecutively, 'used'
timestamp for FDB entry not being updated. However, since hardware doesn't
support ageing, software solution inside the driver is required.

In order to emulate hardware ageing in driver, extend bridge FDB ingress
flows with counter and create delayed br_offloads->update_work task on
bridge offloads workqueue. Run the task every second, update 'used'
timestamp in software bridge dynamic entry by sending
SWITCHDEV_FDB_ADD_TO_BRIDGE for the entry, if it flow hardware counter
lastuse field was changed since last update. If lastuse wasn't changed for
ageing_time period, then delete the FDB entry and notify kernel bridge by
sending SWITCHDEV_FDB_DEL_TO_BRIDGE notification.

Register blocking switchdev notifier callback and handle attribute set
SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME event to allow user to dynamically
configure bridge FDB entry ageing timeout. Save the value per-bridge in
struct mlx5_esw_bridge. Silently ignore
SWITCHDEV_ATTR_ID_PORT_{PRE_}BRIDGE_FLAGS switchdev event since mlx5 bridge
implementation relies on software bridge for implementing necessary
behavior for all of these flags.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/rep/bridge.c        |  95 ++++++++++++++++
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 104 ++++++++++++++++--
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |   7 +-
 3 files changed, 193 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index b34e9cb686e3..14645f24671f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -2,12 +2,15 @@
 /* Copyright (c) 2021 Mellanox Technologies. */
 
 #include <linux/netdevice.h>
+#include <linux/if_bridge.h>
 #include <net/netevent.h>
 #include <net/switchdev.h>
 #include "bridge.h"
 #include "esw/bridge.h"
 #include "en_rep.h"
 
+#define MLX5_ESW_BRIDGE_UPDATE_INTERVAL 1000
+
 struct mlx5_bridge_switchdev_fdb_work {
 	struct work_struct work;
 	struct switchdev_notifier_fdb_info fdb_info;
@@ -72,6 +75,63 @@ static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
 	return notifier_from_errno(err);
 }
 
+static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
+					     const struct switchdev_attr *attr,
+					     struct netlink_ext_ack *extack)
+{
+	struct mlx5e_rep_priv *rpriv;
+	struct mlx5_eswitch *esw;
+	struct mlx5_vport *vport;
+	struct mlx5e_priv *priv;
+	u16 vport_num;
+	int err = 0;
+
+	priv = netdev_priv(dev);
+	rpriv = priv->ppriv;
+	vport_num = rpriv->rep->vport;
+	esw = priv->mdev->priv.eswitch;
+	vport = mlx5_eswitch_get_vport(esw, vport_num);
+	if (IS_ERR(vport))
+		return PTR_ERR(vport);
+
+	switch (attr->id) {
+	case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS:
+		if (attr->u.brport_flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)) {
+			NL_SET_ERR_MSG_MOD(extack, "Flag is not supported");
+			err = -EINVAL;
+		}
+		break;
+	case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
+		break;
+	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
+		err = mlx5_esw_bridge_ageing_time_set(attr->u.ageing_time, esw, vport);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+static int mlx5_esw_bridge_event_blocking(struct notifier_block *unused,
+					  unsigned long event, void *ptr)
+{
+	struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+	int err;
+
+	switch (event) {
+	case SWITCHDEV_PORT_ATTR_SET:
+		err = switchdev_handle_port_attr_set(dev, ptr,
+						     mlx5e_eswitch_rep,
+						     mlx5_esw_bridge_port_obj_attr_set);
+		break;
+	default:
+		err = 0;
+	}
+
+	return notifier_from_errno(err);
+}
+
 static void
 mlx5_esw_bridge_cleanup_switchdev_fdb_work(struct mlx5_bridge_switchdev_fdb_work *fdb_work)
 {
@@ -160,6 +220,13 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	if (priv->mdev->priv.eswitch != br_offloads->esw)
 		return NOTIFY_DONE;
 
+	if (event == SWITCHDEV_PORT_ATTR_SET) {
+		int err = switchdev_handle_port_attr_set(dev, ptr,
+							 mlx5e_eswitch_rep,
+							 mlx5_esw_bridge_port_obj_attr_set);
+		return notifier_from_errno(err);
+	}
+
 	upper = netdev_master_upper_dev_get_rcu(dev);
 	if (!upper)
 		return NOTIFY_DONE;
@@ -190,6 +257,20 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+static void mlx5_esw_bridge_update_work(struct work_struct *work)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = container_of(work,
+								    struct mlx5_esw_bridge_offloads,
+								    update_work.work);
+
+	rtnl_lock();
+	mlx5_esw_bridge_update(br_offloads);
+	rtnl_unlock();
+
+	queue_delayed_work(br_offloads->wq, &br_offloads->update_work,
+			   msecs_to_jiffies(MLX5_ESW_BRIDGE_UPDATE_INTERVAL));
+}
+
 void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads;
@@ -211,6 +292,9 @@ void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
 		esw_warn(mdev, "Failed to allocate bridge offloads workqueue\n");
 		goto err_alloc_wq;
 	}
+	INIT_DELAYED_WORK(&br_offloads->update_work, mlx5_esw_bridge_update_work);
+	queue_delayed_work(br_offloads->wq, &br_offloads->update_work,
+			   msecs_to_jiffies(MLX5_ESW_BRIDGE_UPDATE_INTERVAL));
 
 	br_offloads->nb.notifier_call = mlx5_esw_bridge_switchdev_event;
 	err = register_switchdev_notifier(&br_offloads->nb);
@@ -219,6 +303,13 @@ void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
 		goto err_register_swdev;
 	}
 
+	br_offloads->nb_blk.notifier_call = mlx5_esw_bridge_event_blocking;
+	err = register_switchdev_blocking_notifier(&br_offloads->nb_blk);
+	if (err) {
+		esw_warn(mdev, "Failed to register blocking switchdev notifier (err=%d)\n", err);
+		goto err_register_swdev_blk;
+	}
+
 	br_offloads->netdev_nb.notifier_call = mlx5_esw_bridge_switchdev_port_event;
 	err = register_netdevice_notifier(&br_offloads->netdev_nb);
 	if (err) {
@@ -229,6 +320,8 @@ void mlx5e_rep_bridge_init(struct mlx5e_priv *priv)
 	return;
 
 err_register_netdev:
+	unregister_switchdev_blocking_notifier(&br_offloads->nb_blk);
+err_register_swdev_blk:
 	unregister_switchdev_notifier(&br_offloads->nb);
 err_register_swdev:
 	destroy_workqueue(br_offloads->wq);
@@ -248,7 +341,9 @@ void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv)
 		return;
 
 	unregister_netdevice_notifier(&br_offloads->netdev_nb);
+	unregister_switchdev_blocking_notifier(&br_offloads->nb_blk);
 	unregister_switchdev_notifier(&br_offloads->nb);
+	cancel_delayed_work(&br_offloads->update_work);
 	destroy_workqueue(br_offloads->wq);
 	rtnl_lock();
 	mlx5_esw_bridge_cleanup(esw);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 6dd47891189c..557dac5e9745 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -4,6 +4,7 @@
 #include <linux/netdevice.h>
 #include <linux/list.h>
 #include <linux/rhashtable.h>
+#include <linux/if_bridge.h>
 #include <net/switchdev.h>
 #include "bridge.h"
 #include "eswitch.h"
@@ -27,13 +28,21 @@ struct mlx5_esw_bridge_fdb_key {
 	u16 vid;
 };
 
+enum {
+	MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER = BIT(0),
+};
+
 struct mlx5_esw_bridge_fdb_entry {
 	struct mlx5_esw_bridge_fdb_key key;
 	struct rhash_head ht_node;
+	struct net_device *dev;
 	struct list_head list;
 	u16 vport_num;
+	u16 flags;
 
 	struct mlx5_flow_handle *ingress_handle;
+	struct mlx5_fc *ingress_counter;
+	unsigned long lastuse;
 	struct mlx5_flow_handle *egress_handle;
 };
 
@@ -55,6 +64,7 @@ struct mlx5_esw_bridge {
 
 	struct mlx5_flow_table *egress_ft;
 	struct mlx5_flow_group *egress_mac_fg;
+	unsigned long ageing_time;
 };
 
 static void
@@ -238,17 +248,14 @@ mlx5_esw_bridge_egress_table_cleanup(struct mlx5_esw_bridge *bridge)
 
 static struct mlx5_flow_handle *
 mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, u16 vid,
-				    struct mlx5_esw_bridge *bridge)
+				    u32 counter_id, struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads;
-	struct mlx5_flow_destination dest = {
-		.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
-		.ft = bridge->egress_ft,
-	};
 	struct mlx5_flow_act flow_act = {
-		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
+		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_COUNT,
 		.flags = FLOW_ACT_NO_APPEND,
 	};
+	struct mlx5_flow_destination dests[2] = {};
 	struct mlx5_flow_spec *rule_spec;
 	struct mlx5_flow_handle *handle;
 	u8 *smac_v, *smac_c;
@@ -271,7 +278,13 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, u1
 	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
 		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
 
-	handle = mlx5_add_flow_rules(br_offloads->ingress_ft, rule_spec, &flow_act, &dest, 1);
+	dests[0].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+	dests[0].ft = bridge->egress_ft;
+	dests[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+	dests[1].counter_id = counter_id;
+
+	handle = mlx5_add_flow_rules(br_offloads->ingress_ft, rule_spec, &flow_act, dests,
+				     ARRAY_SIZE(dests));
 
 	kvfree(rule_spec);
 	return handle;
@@ -334,6 +347,7 @@ static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
 	INIT_LIST_HEAD(&bridge->fdb_list);
 	bridge->ifindex = ifindex;
 	bridge->refcnt = 1;
+	bridge->ageing_time = BR_DEFAULT_AGEING_TIME;
 	list_add(&bridge->list, &br_offloads->bridges);
 
 	return bridge;
@@ -399,27 +413,44 @@ mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
 	rhashtable_remove_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
 	mlx5_del_flow_rules(entry->egress_handle);
 	mlx5_del_flow_rules(entry->ingress_handle);
+	mlx5_fc_destroy(bridge->br_offloads->esw->dev, entry->ingress_counter);
 	list_del(&entry->list);
 	kvfree(entry);
 }
 
 static struct mlx5_esw_bridge_fdb_entry *
 mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsigned char *addr,
-			       u16 vid, struct mlx5_eswitch *esw, struct mlx5_esw_bridge *bridge)
+			       u16 vid, bool added_by_user, struct mlx5_eswitch *esw,
+			       struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_fdb_entry *entry;
 	struct mlx5_flow_handle *handle;
+	struct mlx5_fc *counter;
+	struct mlx5e_priv *priv;
 	int err;
 
+	priv = netdev_priv(dev);
 	entry = kvzalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
 		return ERR_PTR(-ENOMEM);
 
 	ether_addr_copy(entry->key.addr, addr);
 	entry->key.vid = vid;
+	entry->dev = dev;
 	entry->vport_num = vport_num;
+	entry->lastuse = jiffies;
+	if (added_by_user)
+		entry->flags |= MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER;
+
+	counter = mlx5_fc_create(priv->mdev, true);
+	if (IS_ERR(counter)) {
+		err = PTR_ERR(counter);
+		goto err_ingress_fc_create;
+	}
+	entry->ingress_counter = counter;
 
-	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vid, bridge);
+	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vid, mlx5_fc_id(counter),
+						     bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
 		esw_warn(esw->dev, "Failed to create ingress flow(vport=%u,err=%d)\n",
@@ -451,10 +482,22 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 err_egress_flow_create:
 	mlx5_del_flow_rules(entry->ingress_handle);
 err_ingress_flow_create:
+	mlx5_fc_destroy(priv->mdev, entry->ingress_counter);
+err_ingress_fc_create:
 	kvfree(entry);
 	return ERR_PTR(err);
 }
 
+int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
+				    struct mlx5_vport *vport)
+{
+	if (!vport->bridge)
+		return -EINVAL;
+
+	vport->bridge->ageing_time = ageing_time;
+	return 0;
+}
+
 static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge *bridge,
 				      struct mlx5_vport *vport)
 {
@@ -524,12 +567,17 @@ void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw
 	}
 
 	entry = mlx5_esw_bridge_fdb_entry_init(dev, vport_num, fdb_info->addr, fdb_info->vid,
-					       esw, bridge);
+					       fdb_info->added_by_user, esw, bridge);
 	if (IS_ERR(entry))
 		return;
 
-	mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
-					   SWITCHDEV_FDB_OFFLOADED);
+	if (entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER)
+		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+						   SWITCHDEV_FDB_OFFLOADED);
+	else
+		/* Take over dynamic entries to prevent kernel bridge from aging them out. */
+		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+						   SWITCHDEV_FDB_ADD_TO_BRIDGE);
 }
 
 void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
@@ -556,9 +604,41 @@ void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw
 		return;
 	}
 
+	if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+						   SWITCHDEV_FDB_DEL_TO_BRIDGE);
 	mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 }
 
+void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+	struct mlx5_esw_bridge *bridge;
+
+	list_for_each_entry(bridge, &br_offloads->bridges, list) {
+		list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
+			unsigned long lastuse =
+				(unsigned long)mlx5_fc_query_lastuse(entry->ingress_counter);
+
+			if (entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER)
+				continue;
+
+			if (time_after(lastuse, entry->lastuse)) {
+				entry->lastuse = lastuse;
+				/* refresh existing bridge entry */
+				mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+								   entry->key.vid,
+								   SWITCHDEV_FDB_ADD_TO_BRIDGE);
+			} else if (time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
+				mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+								   entry->key.vid,
+								   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+				mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+			}
+		}
+	}
+}
+
 static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_eswitch *esw = br_offloads->esw;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index cec118c0b733..07726ae55b2b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -6,18 +6,20 @@
 
 #include <linux/notifier.h>
 #include <linux/list.h>
+#include <linux/workqueue.h>
 #include "eswitch.h"
 
 struct mlx5_flow_table;
 struct mlx5_flow_group;
-struct workqueue_struct;
 
 struct mlx5_esw_bridge_offloads {
 	struct mlx5_eswitch *esw;
 	struct list_head bridges;
 	struct notifier_block netdev_nb;
+	struct notifier_block nb_blk;
 	struct notifier_block nb;
 	struct workqueue_struct *wq;
+	struct delayed_work update_work;
 
 	struct mlx5_flow_table *ingress_ft;
 	struct mlx5_flow_group *ingress_mac_fg;
@@ -35,5 +37,8 @@ void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw
 void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
 				struct mlx5_vport *vport,
 				struct switchdev_notifier_fdb_info *fdb_info);
+void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads);
+int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
+				    struct mlx5_vport *vport);
 
 #endif /* __MLX5_ESW_BRIDGE_H__ */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 12/16] net/mlx5: Bridge, implement infrastructure for vlans
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 11/16] net/mlx5: Bridge, dynamic entry ageing Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 13/16] net/mlx5: Bridge, match FDB entry vlan tag Saeed Mahameed
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Establish all the necessary infrastructure for implementing vlan matching
and vlan push/pop in following patches:

- Add new per-vport struct mlx5_esw_bridge_port that is used to store
metadata for all port vlans. Initialize and cleanup the instance of the
structure when port representor is linked/unliked to bridge. Use xarray to
allow quick vport metadata lookup by vport number.

- Add new per-port-vlan struct mlx5_esw_bridge_vlan that is used to store
vlan-specific data (vid, flags). Handle SWITCHDEV_PORT_OBJ_{ADD|DEL}
switchdev blocking event for SWITCHDEV_OBJ_ID_PORT_VLAN object by
creating/deleting the vlan structure and saving it in per-vport xarray for
quick lookup.

- Implement support for SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING object
attribute that is used to toggle vlan filtering. Remove all FDB entries
from hardware when vlan filtering state is changed.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/en/rep/bridge.c        |  73 ++++++
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 211 +++++++++++++++++-
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |   5 +
 3 files changed, 286 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 14645f24671f..7f5efc1b4392 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -75,6 +75,66 @@ static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb,
 	return notifier_from_errno(err);
 }
 
+static int mlx5_esw_bridge_port_obj_add(struct net_device *dev,
+					const struct switchdev_obj *obj,
+					struct netlink_ext_ack *extack)
+{
+	const struct switchdev_obj_port_vlan *vlan;
+	struct mlx5e_rep_priv *rpriv;
+	struct mlx5_eswitch *esw;
+	struct mlx5_vport *vport;
+	struct mlx5e_priv *priv;
+	u16 vport_num;
+	int err = 0;
+
+	priv = netdev_priv(dev);
+	rpriv = priv->ppriv;
+	vport_num = rpriv->rep->vport;
+	esw = priv->mdev->priv.eswitch;
+	vport = mlx5_eswitch_get_vport(esw, vport_num);
+	if (IS_ERR(vport))
+		return PTR_ERR(vport);
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_ID_PORT_VLAN:
+		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+		err = mlx5_esw_bridge_port_vlan_add(vlan->vid, vlan->flags, esw, vport, extack);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+	return err;
+}
+
+static int mlx5_esw_bridge_port_obj_del(struct net_device *dev,
+					const struct switchdev_obj *obj)
+{
+	const struct switchdev_obj_port_vlan *vlan;
+	struct mlx5e_rep_priv *rpriv;
+	struct mlx5_eswitch *esw;
+	struct mlx5_vport *vport;
+	struct mlx5e_priv *priv;
+	u16 vport_num;
+
+	priv = netdev_priv(dev);
+	rpriv = priv->ppriv;
+	vport_num = rpriv->rep->vport;
+	esw = priv->mdev->priv.eswitch;
+	vport = mlx5_eswitch_get_vport(esw, vport_num);
+	if (IS_ERR(vport))
+		return PTR_ERR(vport);
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_ID_PORT_VLAN:
+		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+		mlx5_esw_bridge_port_vlan_del(vlan->vid, esw, vport);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+	return 0;
+}
+
 static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
 					     const struct switchdev_attr *attr,
 					     struct netlink_ext_ack *extack)
@@ -106,6 +166,9 @@ static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev,
 	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
 		err = mlx5_esw_bridge_ageing_time_set(attr->u.ageing_time, esw, vport);
 		break;
+	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
+		err = mlx5_esw_bridge_vlan_filtering_set(attr->u.vlan_filtering, esw, vport);
+		break;
 	default:
 		err = -EOPNOTSUPP;
 	}
@@ -120,6 +183,16 @@ static int mlx5_esw_bridge_event_blocking(struct notifier_block *unused,
 	int err;
 
 	switch (event) {
+	case SWITCHDEV_PORT_OBJ_ADD:
+		err = switchdev_handle_port_obj_add(dev, ptr,
+						    mlx5e_eswitch_rep,
+						    mlx5_esw_bridge_port_obj_add);
+		break;
+	case SWITCHDEV_PORT_OBJ_DEL:
+		err = switchdev_handle_port_obj_del(dev, ptr,
+						    mlx5e_eswitch_rep,
+						    mlx5_esw_bridge_port_obj_del);
+		break;
 	case SWITCHDEV_PORT_ATTR_SET:
 		err = switchdev_handle_port_attr_set(dev, ptr,
 						     mlx5e_eswitch_rep,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 557dac5e9745..eec5897c6b79 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -4,6 +4,7 @@
 #include <linux/netdevice.h>
 #include <linux/list.h>
 #include <linux/rhashtable.h>
+#include <linux/xarray.h>
 #include <linux/if_bridge.h>
 #include <net/switchdev.h>
 #include "bridge.h"
@@ -53,6 +54,20 @@ static const struct rhashtable_params fdb_ht_params = {
 	.automatic_shrinking = true,
 };
 
+struct mlx5_esw_bridge_vlan {
+	u16 vid;
+	u16 flags;
+};
+
+struct mlx5_esw_bridge_port {
+	u16 vport_num;
+	struct xarray vlans;
+};
+
+enum {
+	MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG = BIT(0),
+};
+
 struct mlx5_esw_bridge {
 	int ifindex;
 	int refcnt;
@@ -61,10 +76,12 @@ struct mlx5_esw_bridge {
 
 	struct list_head fdb_list;
 	struct rhashtable fdb_ht;
+	struct xarray vports;
 
 	struct mlx5_flow_table *egress_ft;
 	struct mlx5_flow_group *egress_mac_fg;
 	unsigned long ageing_time;
+	u32 flags;
 };
 
 static void
@@ -345,6 +362,7 @@ static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
 		goto err_fdb_ht;
 
 	INIT_LIST_HEAD(&bridge->fdb_list);
+	xa_init(&bridge->vports);
 	bridge->ifindex = ifindex;
 	bridge->refcnt = 1;
 	bridge->ageing_time = BR_DEFAULT_AGEING_TIME;
@@ -371,6 +389,7 @@ static void mlx5_esw_bridge_put(struct mlx5_esw_bridge_offloads *br_offloads,
 		return;
 
 	mlx5_esw_bridge_egress_table_cleanup(bridge);
+	WARN_ON(!xa_empty(&bridge->vports));
 	list_del(&bridge->list);
 	rhashtable_destroy(&bridge->fdb_ht);
 	kvfree(bridge);
@@ -406,6 +425,24 @@ mlx5_esw_bridge_lookup(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads
 	return bridge;
 }
 
+static int mlx5_esw_bridge_port_insert(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge *bridge)
+{
+	return xa_insert(&bridge->vports, port->vport_num, port, GFP_KERNEL);
+}
+
+static struct mlx5_esw_bridge_port *
+mlx5_esw_bridge_port_lookup(u16 vport_num, struct mlx5_esw_bridge *bridge)
+{
+	return xa_load(&bridge->vports, vport_num);
+}
+
+static void mlx5_esw_bridge_port_erase(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge *bridge)
+{
+	xa_erase(&bridge->vports, port->vport_num);
+}
+
 static void
 mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
 				  struct mlx5_esw_bridge *bridge)
@@ -418,6 +455,68 @@ mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
 	kvfree(entry);
 }
 
+static void mlx5_esw_bridge_fdb_flush(struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
+		if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+			mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+							   entry->key.vid,
+							   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+	}
+}
+
+static struct mlx5_esw_bridge_vlan *
+mlx5_esw_bridge_vlan_lookup(u16 vid, struct mlx5_esw_bridge_port *port)
+{
+	return xa_load(&port->vlans, vid);
+}
+
+static struct mlx5_esw_bridge_vlan *
+mlx5_esw_bridge_vlan_create(u16 vid, u16 flags, struct mlx5_esw_bridge_port *port)
+{
+	struct mlx5_esw_bridge_vlan *vlan;
+	int err;
+
+	vlan = kvzalloc(sizeof(*vlan), GFP_KERNEL);
+	if (!vlan)
+		return ERR_PTR(-ENOMEM);
+
+	vlan->vid = vid;
+	vlan->flags = flags;
+	err = xa_insert(&port->vlans, vid, vlan, GFP_KERNEL);
+	if (err) {
+		kvfree(vlan);
+		return ERR_PTR(err);
+	}
+
+	return vlan;
+}
+
+static void mlx5_esw_bridge_vlan_erase(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge_vlan *vlan)
+{
+	xa_erase(&port->vlans, vlan->vid);
+}
+
+static void mlx5_esw_bridge_vlan_cleanup(struct mlx5_esw_bridge_port *port,
+					 struct mlx5_esw_bridge_vlan *vlan)
+{
+	mlx5_esw_bridge_vlan_erase(port, vlan);
+	kvfree(vlan);
+}
+
+static void mlx5_esw_bridge_port_vlans_flush(struct mlx5_esw_bridge_port *port)
+{
+	struct mlx5_esw_bridge_vlan *vlan;
+	unsigned long index;
+
+	xa_for_each(&port->vlans, index, vlan)
+		mlx5_esw_bridge_vlan_cleanup(port, vlan);
+}
+
 static struct mlx5_esw_bridge_fdb_entry *
 mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsigned char *addr,
 			       u16 vid, bool added_by_user, struct mlx5_eswitch *esw,
@@ -498,11 +597,60 @@ int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswit
 	return 0;
 }
 
-static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge *bridge,
+int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
+				       struct mlx5_vport *vport)
+{
+	struct mlx5_esw_bridge *bridge;
+	bool filtering;
+
+	if (!vport->bridge)
+		return -EINVAL;
+
+	bridge = vport->bridge;
+	filtering = bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
+	if (filtering == enable)
+		return 0;
+
+	mlx5_esw_bridge_fdb_flush(bridge);
+	if (enable)
+		bridge->flags |= MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
+	else
+		bridge->flags &= ~MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
+
+	return 0;
+}
+
+static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloads,
+				      struct mlx5_esw_bridge *bridge,
 				      struct mlx5_vport *vport)
 {
+	struct mlx5_eswitch *esw = br_offloads->esw;
+	struct mlx5_esw_bridge_port *port;
+	int err;
+
+	port = kvzalloc(sizeof(*port), GFP_KERNEL);
+	if (!port) {
+		err = -ENOMEM;
+		goto err_port_alloc;
+	}
+
+	port->vport_num = vport->vport;
+	xa_init(&port->vlans);
+	err = mlx5_esw_bridge_port_insert(port, bridge);
+	if (err) {
+		esw_warn(esw->dev, "Failed to insert port metadata (vport=%u,err=%d)\n",
+			 vport->vport, err);
+		goto err_port_insert;
+	}
+
 	vport->bridge = bridge;
 	return 0;
+
+err_port_insert:
+	kvfree(port);
+err_port_alloc:
+	mlx5_esw_bridge_put(br_offloads, bridge);
+	return err;
 }
 
 static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_offloads,
@@ -510,11 +658,21 @@ static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_off
 {
 	struct mlx5_esw_bridge *bridge = vport->bridge;
 	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+	struct mlx5_esw_bridge_port *port;
 
 	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list)
 		if (entry->vport_num == vport->vport)
 			mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
 
+	port = mlx5_esw_bridge_port_lookup(vport->vport, bridge);
+	if (!port) {
+		WARN(1, "Vport %u metadata not found on bridge", vport->vport);
+		return -EINVAL;
+	}
+
+	mlx5_esw_bridge_port_vlans_flush(port);
+	mlx5_esw_bridge_port_erase(port, bridge);
+	kvfree(port);
 	mlx5_esw_bridge_put(br_offloads, bridge);
 	vport->bridge = NULL;
 	return 0;
@@ -524,6 +682,7 @@ int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_
 			       struct mlx5_vport *vport, struct netlink_ext_ack *extack)
 {
 	struct mlx5_esw_bridge *bridge;
+	int err;
 
 	WARN_ON(vport->bridge);
 
@@ -533,13 +692,17 @@ int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_
 		return PTR_ERR(bridge);
 	}
 
-	return mlx5_esw_bridge_vport_init(bridge, vport);
+	err = mlx5_esw_bridge_vport_init(br_offloads, bridge, vport);
+	if (err)
+		NL_SET_ERR_MSG_MOD(extack, "Error initializing port");
+	return err;
 }
 
 int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
 				 struct mlx5_vport *vport, struct netlink_ext_ack *extack)
 {
 	struct mlx5_esw_bridge *bridge = vport->bridge;
+	int err;
 
 	if (!bridge) {
 		NL_SET_ERR_MSG_MOD(extack, "Port is not attached to any bridge");
@@ -550,7 +713,49 @@ int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *b
 		return -EINVAL;
 	}
 
-	return mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+	err = mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+	if (err)
+		NL_SET_ERR_MSG_MOD(extack, "Port cleanup failed");
+	return err;
+}
+
+int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
+				  struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+{
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge_vlan *vlan;
+
+	port = mlx5_esw_bridge_port_lookup(vport->vport, vport->bridge);
+	if (!port)
+		return -EINVAL;
+
+	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
+	if (vlan) {
+		vlan->flags = flags;
+		return 0;
+	}
+
+	vlan = mlx5_esw_bridge_vlan_create(vid, flags, port);
+	if (IS_ERR(vlan)) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");
+		return PTR_ERR(vlan);
+	}
+	return 0;
+}
+
+void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx5_vport *vport)
+{
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge_vlan *vlan;
+
+	port = mlx5_esw_bridge_port_lookup(vport->vport, vport->bridge);
+	if (!port)
+		return;
+
+	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
+	if (!vlan)
+		return;
+	mlx5_esw_bridge_vlan_cleanup(port, vlan);
 }
 
 void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index 07726ae55b2b..276ed0392607 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -40,5 +40,10 @@ void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw
 void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads);
 int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
 				    struct mlx5_vport *vport);
+int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
+				       struct mlx5_vport *vport);
+int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
+				  struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx5_vport *vport);
 
 #endif /* __MLX5_ESW_BRIDGE_H__ */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 13/16] net/mlx5: Bridge, match FDB entry vlan tag
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 12/16] net/mlx5: Bridge, implement infrastructure for vlans Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 14/16] net/mlx5: Bridge, support pvid and untagged vlan configurations Saeed Mahameed
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Add support for FDB vlan-tagged entries. Extend ingress and egress flow
tables with flow groups to match packet vlan tag. Modify the flow creation
code to include vlan tag, if vlan is configured on port and vlan
configuration is supported for offload.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst |   9 +
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 181 +++++++++++++++++-
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |   1 +
 3 files changed, 181 insertions(+), 10 deletions(-)

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index ea32136b30e7..a0c91fe5574d 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -232,6 +232,15 @@ representor is attached to bridge.
 
     $ ip link set enp8s0f0 master bridge1
 
+VLANs
+-----
+Following bridge VLAN functions are supported by mlx5:
+
+- VLAN filtering (including multiple VLANs per port)::
+
+    $ ip link set bridge1 type bridge vlan_filtering 1
+    $ bridge vlan add dev enp8s0f0 vid 2-3
+
 mlx5 subfunction
 ================
 mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index eec5897c6b79..e1467dbe80dc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -12,11 +12,17 @@
 #include "fs_core.h"
 
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE 64000
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM 0
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM 0
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE / 2 - 1)
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM \
+	(MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO + 1)
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE - 1)
 
 #define MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE 64000
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM 0
+#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_FROM 0
+#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO (MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE / 2 - 1)
+#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM \
+	(MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO + 1)
 #define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE - 1)
 
 enum {
@@ -79,6 +85,7 @@ struct mlx5_esw_bridge {
 	struct xarray vports;
 
 	struct mlx5_flow_table *egress_ft;
+	struct mlx5_flow_group *egress_vlan_fg;
 	struct mlx5_flow_group *egress_mac_fg;
 	unsigned long ageing_time;
 	u32 flags;
@@ -120,6 +127,44 @@ mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw)
 	return fdb;
 }
 
+static struct mlx5_flow_group *
+mlx5_esw_bridge_ingress_vlan_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *ingress_ft)
+{
+	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+	struct mlx5_flow_group *fg;
+	u32 *in, *match;
+
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return ERR_PTR(-ENOMEM);
+
+	MLX5_SET(create_flow_group_in, in, match_criteria_enable,
+		 MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2);
+	match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria);
+
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_47_16);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_15_0);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.first_vid);
+
+	MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0,
+		 mlx5_eswitch_get_vport_metadata_mask());
+
+	MLX5_SET(create_flow_group_in, in, start_flow_index,
+		 MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM);
+	MLX5_SET(create_flow_group_in, in, end_flow_index,
+		 MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO);
+
+	fg = mlx5_create_flow_group(ingress_ft, in);
+	kvfree(in);
+	if (IS_ERR(fg))
+		esw_warn(esw->dev,
+			 "Failed to create VLAN flow group for bridge ingress table (err=%ld)\n",
+			 PTR_ERR(fg));
+
+	return fg;
+}
+
 static struct mlx5_flow_group *
 mlx5_esw_bridge_ingress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *ingress_ft)
 {
@@ -149,13 +194,46 @@ mlx5_esw_bridge_ingress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow
 	fg = mlx5_create_flow_group(ingress_ft, in);
 	if (IS_ERR(fg))
 		esw_warn(esw->dev,
-			 "Failed to create bridge ingress table MAC flow group (err=%ld)\n",
+			 "Failed to create MAC flow group for bridge ingress table (err=%ld)\n",
 			 PTR_ERR(fg));
 
 	kvfree(in);
 	return fg;
 }
 
+static struct mlx5_flow_group *
+mlx5_esw_bridge_egress_vlan_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *egress_ft)
+{
+	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+	struct mlx5_flow_group *fg;
+	u32 *in, *match;
+
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return ERR_PTR(-ENOMEM);
+
+	MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS);
+	match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria);
+
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_47_16);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_15_0);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.first_vid);
+
+	MLX5_SET(create_flow_group_in, in, start_flow_index,
+		 MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_FROM);
+	MLX5_SET(create_flow_group_in, in, end_flow_index,
+		 MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO);
+
+	fg = mlx5_create_flow_group(egress_ft, in);
+	if (IS_ERR(fg))
+		esw_warn(esw->dev,
+			 "Failed to create VLAN flow group for bridge egress table (err=%ld)\n",
+			 PTR_ERR(fg));
+	kvfree(in);
+	return fg;
+}
+
 static struct mlx5_flow_group *
 mlx5_esw_bridge_egress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *egress_ft)
 {
@@ -190,8 +268,8 @@ mlx5_esw_bridge_egress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_
 static int
 mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 {
+	struct mlx5_flow_group *mac_fg, *vlan_fg;
 	struct mlx5_flow_table *ingress_ft;
-	struct mlx5_flow_group *mac_fg;
 	int err;
 
 	if (!mlx5_eswitch_vport_match_metadata_enabled(br_offloads->esw))
@@ -203,6 +281,12 @@ mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 	if (IS_ERR(ingress_ft))
 		return PTR_ERR(ingress_ft);
 
+	vlan_fg = mlx5_esw_bridge_ingress_vlan_fg_create(br_offloads->esw, ingress_ft);
+	if (IS_ERR(vlan_fg)) {
+		err = PTR_ERR(vlan_fg);
+		goto err_vlan_fg;
+	}
+
 	mac_fg = mlx5_esw_bridge_ingress_mac_fg_create(br_offloads->esw, ingress_ft);
 	if (IS_ERR(mac_fg)) {
 		err = PTR_ERR(mac_fg);
@@ -210,10 +294,13 @@ mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 	}
 
 	br_offloads->ingress_ft = ingress_ft;
+	br_offloads->ingress_vlan_fg = vlan_fg;
 	br_offloads->ingress_mac_fg = mac_fg;
 	return 0;
 
 err_mac_fg:
+	mlx5_destroy_flow_group(vlan_fg);
+err_vlan_fg:
 	mlx5_destroy_flow_table(ingress_ft);
 	return err;
 }
@@ -223,6 +310,8 @@ mlx5_esw_bridge_ingress_table_cleanup(struct mlx5_esw_bridge_offloads *br_offloa
 {
 	mlx5_destroy_flow_group(br_offloads->ingress_mac_fg);
 	br_offloads->ingress_mac_fg = NULL;
+	mlx5_destroy_flow_group(br_offloads->ingress_vlan_fg);
+	br_offloads->ingress_vlan_fg = NULL;
 	mlx5_destroy_flow_table(br_offloads->ingress_ft);
 	br_offloads->ingress_ft = NULL;
 }
@@ -231,8 +320,8 @@ static int
 mlx5_esw_bridge_egress_table_init(struct mlx5_esw_bridge_offloads *br_offloads,
 				  struct mlx5_esw_bridge *bridge)
 {
+	struct mlx5_flow_group *mac_fg, *vlan_fg;
 	struct mlx5_flow_table *egress_ft;
-	struct mlx5_flow_group *mac_fg;
 	int err;
 
 	egress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE,
@@ -241,6 +330,12 @@ mlx5_esw_bridge_egress_table_init(struct mlx5_esw_bridge_offloads *br_offloads,
 	if (IS_ERR(egress_ft))
 		return PTR_ERR(egress_ft);
 
+	vlan_fg = mlx5_esw_bridge_egress_vlan_fg_create(br_offloads->esw, egress_ft);
+	if (IS_ERR(vlan_fg)) {
+		err = PTR_ERR(vlan_fg);
+		goto err_vlan_fg;
+	}
+
 	mac_fg = mlx5_esw_bridge_egress_mac_fg_create(br_offloads->esw, egress_ft);
 	if (IS_ERR(mac_fg)) {
 		err = PTR_ERR(mac_fg);
@@ -248,10 +343,13 @@ mlx5_esw_bridge_egress_table_init(struct mlx5_esw_bridge_offloads *br_offloads,
 	}
 
 	bridge->egress_ft = egress_ft;
+	bridge->egress_vlan_fg = vlan_fg;
 	bridge->egress_mac_fg = mac_fg;
 	return 0;
 
 err_mac_fg:
+	mlx5_destroy_flow_group(vlan_fg);
+err_vlan_fg:
 	mlx5_destroy_flow_table(egress_ft);
 	return err;
 }
@@ -260,12 +358,14 @@ static void
 mlx5_esw_bridge_egress_table_cleanup(struct mlx5_esw_bridge *bridge)
 {
 	mlx5_destroy_flow_group(bridge->egress_mac_fg);
+	mlx5_destroy_flow_group(bridge->egress_vlan_fg);
 	mlx5_destroy_flow_table(bridge->egress_ft);
 }
 
 static struct mlx5_flow_handle *
-mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, u16 vid,
-				    u32 counter_id, struct mlx5_esw_bridge *bridge)
+mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
+				    struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
+				    struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads;
 	struct mlx5_flow_act flow_act = {
@@ -295,6 +395,17 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, u1
 	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
 		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
 
+	if (vlan) {
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.first_vid);
+		MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.first_vid,
+			 vlan->vid);
+	}
+
 	dests[0].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
 	dests[0].ft = bridge->egress_ft;
 	dests[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
@@ -308,7 +419,8 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, u1
 }
 
 static struct mlx5_flow_handle *
-mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr, u16 vid,
+mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr,
+				   struct mlx5_esw_bridge_vlan *vlan,
 				   struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_flow_destination dest = {
@@ -336,6 +448,17 @@ mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr, u16
 			      outer_headers.dmac_47_16);
 	eth_broadcast_addr(dmac_c);
 
+	if (vlan) {
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.first_vid);
+		MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.first_vid,
+			 vlan->vid);
+	}
+
 	handle = mlx5_add_flow_rules(bridge->egress_ft, rule_spec, &flow_act, &dest, 1);
 
 	kvfree(rule_spec);
@@ -517,17 +640,55 @@ static void mlx5_esw_bridge_port_vlans_flush(struct mlx5_esw_bridge_port *port)
 		mlx5_esw_bridge_vlan_cleanup(port, vlan);
 }
 
+static struct mlx5_esw_bridge_vlan *
+mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, struct mlx5_esw_bridge *bridge,
+				 struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge_vlan *vlan;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, bridge);
+	if (!port) {
+		/* FDB is added asynchronously on wq while port might have been deleted
+		 * concurrently. Report on 'info' logging level and skip the FDB offload.
+		 */
+		esw_info(esw->dev, "Failed to lookup bridge port (vport=%u)\n", vport_num);
+		return ERR_PTR(-EINVAL);
+	}
+
+	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
+	if (!vlan) {
+		/* FDB is added asynchronously on wq while vlan might have been deleted
+		 * concurrently. Report on 'info' logging level and skip the FDB offload.
+		 */
+		esw_info(esw->dev, "Failed to lookup bridge port vlan metadata (vport=%u)\n",
+			 vport_num);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return vlan;
+}
+
 static struct mlx5_esw_bridge_fdb_entry *
 mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsigned char *addr,
 			       u16 vid, bool added_by_user, struct mlx5_eswitch *esw,
 			       struct mlx5_esw_bridge *bridge)
 {
+	struct mlx5_esw_bridge_vlan *vlan = NULL;
 	struct mlx5_esw_bridge_fdb_entry *entry;
 	struct mlx5_flow_handle *handle;
 	struct mlx5_fc *counter;
 	struct mlx5e_priv *priv;
 	int err;
 
+	if (bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG && vid) {
+		vlan = mlx5_esw_bridge_port_vlan_lookup(vid, vport_num, bridge, esw);
+		if (IS_ERR(vlan))
+			return ERR_CAST(vlan);
+		if (vlan->flags & (BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED))
+			return ERR_PTR(-EOPNOTSUPP); /* can't offload vlan push/pop */
+	}
+
 	priv = netdev_priv(dev);
 	entry = kvzalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
@@ -548,7 +709,7 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	}
 	entry->ingress_counter = counter;
 
-	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vid, mlx5_fc_id(counter),
+	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vlan, mlx5_fc_id(counter),
 						     bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
@@ -558,7 +719,7 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	}
 	entry->ingress_handle = handle;
 
-	handle = mlx5_esw_bridge_egress_flow_create(vport_num, addr, vid, bridge);
+	handle = mlx5_esw_bridge_egress_flow_create(vport_num, addr, vlan, bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
 		esw_warn(esw->dev, "Failed to create egress flow(vport=%u,err=%d)\n",
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index 276ed0392607..bedbda57cdb3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -22,6 +22,7 @@ struct mlx5_esw_bridge_offloads {
 	struct delayed_work update_work;
 
 	struct mlx5_flow_table *ingress_ft;
+	struct mlx5_flow_group *ingress_vlan_fg;
 	struct mlx5_flow_group *ingress_mac_fg;
 };
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 14/16] net/mlx5: Bridge, support pvid and untagged vlan configurations
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 13/16] net/mlx5: Bridge, match FDB entry vlan tag Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 15/16] net/mlx5: Bridge, filter tagged packets that didn't match tagged fg Saeed Mahameed
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Implement support for pushing vlan header into untagged packet on ingress
of port that has pvid configured and support for popping vlan on egress of
port that has the matching vlan configured as untagged. To support such
configurations packet reformat contexts of {INSERT|REMOVE}_HEADER types are
created per such vlan and saved to struct mlx5_esw_bridge_vlan which allows
all FDB entries on particular vlan to share single packet reformat
instance. When initializing FDB entries with pvid or untagged vlan type set
its mlx5_flow_act->pkt_reformat action accordingly.

Flush all flows when removing vlan from port. This is necessary because
even though software bridge removes all FDB entries before removing their
vlan, mlx5 bridge implementation deletes their corresponding flow entries
from hardware in asynchronous workqueue task, which will cause firmware
error if vlan packet reformat context is deleted before all flows that
point to it.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst |   8 +
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 175 ++++++++++++++++--
 2 files changed, 167 insertions(+), 16 deletions(-)

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index a0c91fe5574d..058882dca17b 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -241,6 +241,14 @@ Following bridge VLAN functions are supported by mlx5:
     $ ip link set bridge1 type bridge vlan_filtering 1
     $ bridge vlan add dev enp8s0f0 vid 2-3
 
+- VLAN push on bridge ingress::
+
+    $ bridge vlan add dev enp8s0f0 vid 3 pvid
+
+- VLAN pop on bridge egress::
+
+    $ bridge vlan add dev enp8s0f0 vid 3 untagged
+
 mlx5 subfunction
 ================
 mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index e1467dbe80dc..442a62ff7b43 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -6,6 +6,8 @@
 #include <linux/rhashtable.h>
 #include <linux/xarray.h>
 #include <linux/if_bridge.h>
+#include <linux/if_vlan.h>
+#include <linux/if_ether.h>
 #include <net/switchdev.h>
 #include "bridge.h"
 #include "eswitch.h"
@@ -44,6 +46,7 @@ struct mlx5_esw_bridge_fdb_entry {
 	struct rhash_head ht_node;
 	struct net_device *dev;
 	struct list_head list;
+	struct list_head vlan_list;
 	u16 vport_num;
 	u16 flags;
 
@@ -63,6 +66,9 @@ static const struct rhashtable_params fdb_ht_params = {
 struct mlx5_esw_bridge_vlan {
 	u16 vid;
 	u16 flags;
+	struct list_head fdb_list;
+	struct mlx5_pkt_reformat *pkt_reformat_push;
+	struct mlx5_pkt_reformat *pkt_reformat_pop;
 };
 
 struct mlx5_esw_bridge_port {
@@ -117,6 +123,7 @@ mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw)
 		return ERR_PTR(-ENOENT);
 	}
 
+	ft_attr.flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
 	ft_attr.max_fte = max_fte;
 	ft_attr.level = level;
 	ft_attr.prio = FDB_BR_OFFLOAD;
@@ -395,7 +402,10 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
 	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
 		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
 
-	if (vlan) {
+	if (vlan && vlan->pkt_reformat_push) {
+		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+		flow_act.pkt_reformat = vlan->pkt_reformat_push;
+	} else if (vlan) {
 		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
 				 outer_headers.cvlan_tag);
 		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
@@ -449,6 +459,11 @@ mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr,
 	eth_broadcast_addr(dmac_c);
 
 	if (vlan) {
+		if (vlan->pkt_reformat_pop) {
+			flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+			flow_act.pkt_reformat = vlan->pkt_reformat_pop;
+		}
+
 		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
 				 outer_headers.cvlan_tag);
 		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
@@ -597,8 +612,90 @@ mlx5_esw_bridge_vlan_lookup(u16 vid, struct mlx5_esw_bridge_port *port)
 	return xa_load(&port->vlans, vid);
 }
 
+static int
+mlx5_esw_bridge_vlan_push_create(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	struct {
+		__be16	h_vlan_proto;
+		__be16	h_vlan_TCI;
+	} vlan_hdr = { htons(ETH_P_8021Q), htons(vlan->vid) };
+	struct mlx5_pkt_reformat_params reformat_params = {};
+	struct mlx5_pkt_reformat *pkt_reformat;
+
+	if (!BIT(MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat_insert)) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_insert_size) < sizeof(vlan_hdr) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_insert_offset) <
+	    offsetof(struct vlan_ethhdr, h_vlan_proto)) {
+		esw_warn(esw->dev, "Packet reformat INSERT_HEADER is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	reformat_params.type = MLX5_REFORMAT_TYPE_INSERT_HDR;
+	reformat_params.param_0 = MLX5_REFORMAT_CONTEXT_ANCHOR_MAC_START;
+	reformat_params.param_1 = offsetof(struct vlan_ethhdr, h_vlan_proto);
+	reformat_params.size = sizeof(vlan_hdr);
+	reformat_params.data = &vlan_hdr;
+	pkt_reformat = mlx5_packet_reformat_alloc(esw->dev,
+						  &reformat_params,
+						  MLX5_FLOW_NAMESPACE_FDB);
+	if (IS_ERR(pkt_reformat)) {
+		esw_warn(esw->dev, "Failed to alloc packet reformat INSERT_HEADER (err=%ld)\n",
+			 PTR_ERR(pkt_reformat));
+		return PTR_ERR(pkt_reformat);
+	}
+
+	vlan->pkt_reformat_push = pkt_reformat;
+	return 0;
+}
+
+static void
+mlx5_esw_bridge_vlan_push_cleanup(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	mlx5_packet_reformat_dealloc(esw->dev, vlan->pkt_reformat_push);
+	vlan->pkt_reformat_push = NULL;
+}
+
+static int
+mlx5_esw_bridge_vlan_pop_create(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	struct mlx5_pkt_reformat_params reformat_params = {};
+	struct mlx5_pkt_reformat *pkt_reformat;
+
+	if (!BIT(MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat_remove)) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_remove_size) < sizeof(struct vlan_hdr) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_remove_offset) <
+	    offsetof(struct vlan_ethhdr, h_vlan_proto)) {
+		esw_warn(esw->dev, "Packet reformat REMOVE_HEADER is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	reformat_params.type = MLX5_REFORMAT_TYPE_REMOVE_HDR;
+	reformat_params.param_0 = MLX5_REFORMAT_CONTEXT_ANCHOR_MAC_START;
+	reformat_params.param_1 = offsetof(struct vlan_ethhdr, h_vlan_proto);
+	reformat_params.size = sizeof(struct vlan_hdr);
+	pkt_reformat = mlx5_packet_reformat_alloc(esw->dev,
+						  &reformat_params,
+						  MLX5_FLOW_NAMESPACE_FDB);
+	if (IS_ERR(pkt_reformat)) {
+		esw_warn(esw->dev, "Failed to alloc packet reformat REMOVE_HEADER (err=%ld)\n",
+			 PTR_ERR(pkt_reformat));
+		return PTR_ERR(pkt_reformat);
+	}
+
+	vlan->pkt_reformat_pop = pkt_reformat;
+	return 0;
+}
+
+static void
+mlx5_esw_bridge_vlan_pop_cleanup(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	mlx5_packet_reformat_dealloc(esw->dev, vlan->pkt_reformat_pop);
+	vlan->pkt_reformat_pop = NULL;
+}
+
 static struct mlx5_esw_bridge_vlan *
-mlx5_esw_bridge_vlan_create(u16 vid, u16 flags, struct mlx5_esw_bridge_port *port)
+mlx5_esw_bridge_vlan_create(u16 vid, u16 flags, struct mlx5_esw_bridge_port *port,
+			    struct mlx5_eswitch *esw)
 {
 	struct mlx5_esw_bridge_vlan *vlan;
 	int err;
@@ -609,13 +706,34 @@ mlx5_esw_bridge_vlan_create(u16 vid, u16 flags, struct mlx5_esw_bridge_port *por
 
 	vlan->vid = vid;
 	vlan->flags = flags;
-	err = xa_insert(&port->vlans, vid, vlan, GFP_KERNEL);
-	if (err) {
-		kvfree(vlan);
-		return ERR_PTR(err);
+	INIT_LIST_HEAD(&vlan->fdb_list);
+
+	if (flags & BRIDGE_VLAN_INFO_PVID) {
+		err = mlx5_esw_bridge_vlan_push_create(vlan, esw);
+		if (err)
+			goto err_vlan_push;
+	}
+	if (flags & BRIDGE_VLAN_INFO_UNTAGGED) {
+		err = mlx5_esw_bridge_vlan_pop_create(vlan, esw);
+		if (err)
+			goto err_vlan_pop;
 	}
 
+	err = xa_insert(&port->vlans, vid, vlan, GFP_KERNEL);
+	if (err)
+		goto err_xa_insert;
+
 	return vlan;
+
+err_xa_insert:
+	if (vlan->pkt_reformat_pop)
+		mlx5_esw_bridge_vlan_pop_cleanup(vlan, esw);
+err_vlan_pop:
+	if (vlan->pkt_reformat_push)
+		mlx5_esw_bridge_vlan_push_cleanup(vlan, esw);
+err_vlan_push:
+	kvfree(vlan);
+	return ERR_PTR(err);
 }
 
 static void mlx5_esw_bridge_vlan_erase(struct mlx5_esw_bridge_port *port,
@@ -624,20 +742,42 @@ static void mlx5_esw_bridge_vlan_erase(struct mlx5_esw_bridge_port *port,
 	xa_erase(&port->vlans, vlan->vid);
 }
 
+static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_vlan *vlan,
+				       struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &vlan->fdb_list, vlan_list) {
+		if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+			mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+							   entry->key.vid,
+							   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+	}
+
+	if (vlan->pkt_reformat_pop)
+		mlx5_esw_bridge_vlan_pop_cleanup(vlan, bridge->br_offloads->esw);
+	if (vlan->pkt_reformat_push)
+		mlx5_esw_bridge_vlan_push_cleanup(vlan, bridge->br_offloads->esw);
+}
+
 static void mlx5_esw_bridge_vlan_cleanup(struct mlx5_esw_bridge_port *port,
-					 struct mlx5_esw_bridge_vlan *vlan)
+					 struct mlx5_esw_bridge_vlan *vlan,
+					 struct mlx5_esw_bridge *bridge)
 {
+	mlx5_esw_bridge_vlan_flush(vlan, bridge);
 	mlx5_esw_bridge_vlan_erase(port, vlan);
 	kvfree(vlan);
 }
 
-static void mlx5_esw_bridge_port_vlans_flush(struct mlx5_esw_bridge_port *port)
+static void mlx5_esw_bridge_port_vlans_flush(struct mlx5_esw_bridge_port *port,
+					     struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_vlan *vlan;
 	unsigned long index;
 
 	xa_for_each(&port->vlans, index, vlan)
-		mlx5_esw_bridge_vlan_cleanup(port, vlan);
+		mlx5_esw_bridge_vlan_cleanup(port, vlan, bridge);
 }
 
 static struct mlx5_esw_bridge_vlan *
@@ -685,8 +825,6 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 		vlan = mlx5_esw_bridge_port_vlan_lookup(vid, vport_num, bridge, esw);
 		if (IS_ERR(vlan))
 			return ERR_CAST(vlan);
-		if (vlan->flags & (BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED))
-			return ERR_PTR(-EOPNOTSUPP); /* can't offload vlan push/pop */
 	}
 
 	priv = netdev_priv(dev);
@@ -734,6 +872,10 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 		goto err_ht_init;
 	}
 
+	if (vlan)
+		list_add(&entry->vlan_list, &vlan->fdb_list);
+	else
+		INIT_LIST_HEAD(&entry->vlan_list);
 	list_add(&entry->list, &bridge->fdb_list);
 	return entry;
 
@@ -831,7 +973,7 @@ static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_off
 		return -EINVAL;
 	}
 
-	mlx5_esw_bridge_port_vlans_flush(port);
+	mlx5_esw_bridge_port_vlans_flush(port, bridge);
 	mlx5_esw_bridge_port_erase(port, bridge);
 	kvfree(port);
 	mlx5_esw_bridge_put(br_offloads, bridge);
@@ -892,11 +1034,12 @@ int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
 
 	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
 	if (vlan) {
-		vlan->flags = flags;
-		return 0;
+		if (vlan->flags == flags)
+			return 0;
+		mlx5_esw_bridge_vlan_cleanup(port, vlan, vport->bridge);
 	}
 
-	vlan = mlx5_esw_bridge_vlan_create(vid, flags, port);
+	vlan = mlx5_esw_bridge_vlan_create(vid, flags, port, esw);
 	if (IS_ERR(vlan)) {
 		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");
 		return PTR_ERR(vlan);
@@ -916,7 +1059,7 @@ void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx
 	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
 	if (!vlan)
 		return;
-	mlx5_esw_bridge_vlan_cleanup(port, vlan);
+	mlx5_esw_bridge_vlan_cleanup(port, vlan, vport->bridge);
 }
 
 void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 15/16] net/mlx5: Bridge, filter tagged packets that didn't match tagged fg
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (13 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 14/16] net/mlx5: Bridge, support pvid and untagged vlan configurations Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10  2:58 ` [net-next 16/16] net/mlx5: Bridge, add tracepoints Saeed Mahameed
  2021-06-10 22:36 ` [pull request][net-next 00/16] mlx5 updates 2021-06-09 Jakub Kicinski
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

With support for pvid vlans in mlx5 bridge it is possible to have rules in
untagged flow group when vlan filtering is enabled. However, such rules can
also match tagged packets that didn't match anything in tagged flow group.
Filter such packets by introducing additional flow group between tagged and
untagged groups. When filtering is enabled on the bridge create additional
flow in vlan filtering flow group and matches tagged packets with specified
source MAC address and redirects them to new "skip" table. The skip table
is new lowest-level empty table that is used to skip all further processing
on packet in bridge priority.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 141 +++++++++++++++++-
 .../ethernet/mellanox/mlx5/core/esw/bridge.h  |   3 +
 .../net/ethernet/mellanox/mlx5/core/fs_core.c |   2 +-
 3 files changed, 141 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 442a62ff7b43..b6345619cbfe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -15,9 +15,13 @@
 
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE 64000
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM 0
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE / 2 - 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM \
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE / 4 - 1)
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_FROM \
 	(MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO + 1)
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_TO \
+	(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE / 2 - 1)
+#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM \
+	(MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_TO + 1)
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE - 1)
 
 #define MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE 64000
@@ -27,9 +31,12 @@
 	(MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO + 1)
 #define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE - 1)
 
+#define MLX5_ESW_BRIDGE_SKIP_TABLE_SIZE 0
+
 enum {
 	MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE,
 	MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE,
+	MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE,
 };
 
 struct mlx5_esw_bridge_fdb_key {
@@ -54,6 +61,7 @@ struct mlx5_esw_bridge_fdb_entry {
 	struct mlx5_fc *ingress_counter;
 	unsigned long lastuse;
 	struct mlx5_flow_handle *egress_handle;
+	struct mlx5_flow_handle *filter_handle;
 };
 
 static const struct rhashtable_params fdb_ht_params = {
@@ -172,6 +180,44 @@ mlx5_esw_bridge_ingress_vlan_fg_create(struct mlx5_eswitch *esw, struct mlx5_flo
 	return fg;
 }
 
+static struct mlx5_flow_group *
+mlx5_esw_bridge_ingress_filter_fg_create(struct mlx5_eswitch *esw,
+					 struct mlx5_flow_table *ingress_ft)
+{
+	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+	struct mlx5_flow_group *fg;
+	u32 *in, *match;
+
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return ERR_PTR(-ENOMEM);
+
+	MLX5_SET(create_flow_group_in, in, match_criteria_enable,
+		 MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2);
+	match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria);
+
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_47_16);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_15_0);
+	MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag);
+
+	MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0,
+		 mlx5_eswitch_get_vport_metadata_mask());
+
+	MLX5_SET(create_flow_group_in, in, start_flow_index,
+		 MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_FROM);
+	MLX5_SET(create_flow_group_in, in, end_flow_index,
+		 MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_TO);
+
+	fg = mlx5_create_flow_group(ingress_ft, in);
+	if (IS_ERR(fg))
+		esw_warn(esw->dev,
+			 "Failed to create bridge ingress table VLAN filter flow group (err=%ld)\n",
+			 PTR_ERR(fg));
+
+	kvfree(in);
+	return fg;
+}
+
 static struct mlx5_flow_group *
 mlx5_esw_bridge_ingress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *ingress_ft)
 {
@@ -275,8 +321,8 @@ mlx5_esw_bridge_egress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_
 static int
 mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	struct mlx5_flow_group *mac_fg, *vlan_fg;
-	struct mlx5_flow_table *ingress_ft;
+	struct mlx5_flow_group *mac_fg, *filter_fg, *vlan_fg;
+	struct mlx5_flow_table *ingress_ft, *skip_ft;
 	int err;
 
 	if (!mlx5_eswitch_vport_match_metadata_enabled(br_offloads->esw))
@@ -288,12 +334,26 @@ mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 	if (IS_ERR(ingress_ft))
 		return PTR_ERR(ingress_ft);
 
+	skip_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_SKIP_TABLE_SIZE,
+					       MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE,
+					       br_offloads->esw);
+	if (IS_ERR(skip_ft)) {
+		err = PTR_ERR(skip_ft);
+		goto err_skip_tbl;
+	}
+
 	vlan_fg = mlx5_esw_bridge_ingress_vlan_fg_create(br_offloads->esw, ingress_ft);
 	if (IS_ERR(vlan_fg)) {
 		err = PTR_ERR(vlan_fg);
 		goto err_vlan_fg;
 	}
 
+	filter_fg = mlx5_esw_bridge_ingress_filter_fg_create(br_offloads->esw, ingress_ft);
+	if (IS_ERR(filter_fg)) {
+		err = PTR_ERR(filter_fg);
+		goto err_filter_fg;
+	}
+
 	mac_fg = mlx5_esw_bridge_ingress_mac_fg_create(br_offloads->esw, ingress_ft);
 	if (IS_ERR(mac_fg)) {
 		err = PTR_ERR(mac_fg);
@@ -301,13 +361,19 @@ mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads)
 	}
 
 	br_offloads->ingress_ft = ingress_ft;
+	br_offloads->skip_ft = skip_ft;
 	br_offloads->ingress_vlan_fg = vlan_fg;
+	br_offloads->ingress_filter_fg = filter_fg;
 	br_offloads->ingress_mac_fg = mac_fg;
 	return 0;
 
 err_mac_fg:
+	mlx5_destroy_flow_group(filter_fg);
+err_filter_fg:
 	mlx5_destroy_flow_group(vlan_fg);
 err_vlan_fg:
+	mlx5_destroy_flow_table(skip_ft);
+err_skip_tbl:
 	mlx5_destroy_flow_table(ingress_ft);
 	return err;
 }
@@ -317,8 +383,12 @@ mlx5_esw_bridge_ingress_table_cleanup(struct mlx5_esw_bridge_offloads *br_offloa
 {
 	mlx5_destroy_flow_group(br_offloads->ingress_mac_fg);
 	br_offloads->ingress_mac_fg = NULL;
+	mlx5_destroy_flow_group(br_offloads->ingress_filter_fg);
+	br_offloads->ingress_filter_fg = NULL;
 	mlx5_destroy_flow_group(br_offloads->ingress_vlan_fg);
 	br_offloads->ingress_vlan_fg = NULL;
+	mlx5_destroy_flow_table(br_offloads->skip_ft);
+	br_offloads->skip_ft = NULL;
 	mlx5_destroy_flow_table(br_offloads->ingress_ft);
 	br_offloads->ingress_ft = NULL;
 }
@@ -428,6 +498,52 @@ mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr,
 	return handle;
 }
 
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_ingress_filter_flow_create(u16 vport_num, const unsigned char *addr,
+					   struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads;
+	struct mlx5_flow_destination dest = {
+		.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
+		.ft = br_offloads->skip_ft,
+	};
+	struct mlx5_flow_act flow_act = {
+		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
+		.flags = FLOW_ACT_NO_APPEND,
+	};
+	struct mlx5_flow_spec *rule_spec;
+	struct mlx5_flow_handle *handle;
+	u8 *smac_v, *smac_c;
+
+	rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL);
+	if (!rule_spec)
+		return ERR_PTR(-ENOMEM);
+
+	rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2;
+
+	smac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value,
+			      outer_headers.smac_47_16);
+	ether_addr_copy(smac_v, addr);
+	smac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria,
+			      outer_headers.smac_47_16);
+	eth_broadcast_addr(smac_c);
+
+	MLX5_SET(fte_match_param, rule_spec->match_criteria,
+		 misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_mask());
+	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
+		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
+
+	MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+			 outer_headers.cvlan_tag);
+	MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
+			 outer_headers.cvlan_tag);
+
+	handle = mlx5_add_flow_rules(br_offloads->ingress_ft, rule_spec, &flow_act, &dest, 1);
+
+	kvfree(rule_spec);
+	return handle;
+}
+
 static struct mlx5_flow_handle *
 mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr,
 				   struct mlx5_esw_bridge_vlan *vlan,
@@ -587,8 +703,11 @@ mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
 {
 	rhashtable_remove_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
 	mlx5_del_flow_rules(entry->egress_handle);
+	if (entry->filter_handle)
+		mlx5_del_flow_rules(entry->filter_handle);
 	mlx5_del_flow_rules(entry->ingress_handle);
 	mlx5_fc_destroy(bridge->br_offloads->esw->dev, entry->ingress_counter);
+	list_del(&entry->vlan_list);
 	list_del(&entry->list);
 	kvfree(entry);
 }
@@ -857,6 +976,17 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	}
 	entry->ingress_handle = handle;
 
+	if (bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG) {
+		handle = mlx5_esw_bridge_ingress_filter_flow_create(vport_num, addr, bridge);
+		if (IS_ERR(handle)) {
+			err = PTR_ERR(handle);
+			esw_warn(esw->dev, "Failed to create ingress filter(vport=%u,err=%d)\n",
+				 vport_num, err);
+			goto err_ingress_filter_flow_create;
+		}
+		entry->filter_handle = handle;
+	}
+
 	handle = mlx5_esw_bridge_egress_flow_create(vport_num, addr, vlan, bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
@@ -882,6 +1012,9 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 err_ht_init:
 	mlx5_del_flow_rules(entry->egress_handle);
 err_egress_flow_create:
+	if (entry->filter_handle)
+		mlx5_del_flow_rules(entry->filter_handle);
+err_ingress_filter_flow_create:
 	mlx5_del_flow_rules(entry->ingress_handle);
 err_ingress_flow_create:
 	mlx5_fc_destroy(priv->mdev, entry->ingress_counter);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index bedbda57cdb3..d826942b27fc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -23,7 +23,10 @@ struct mlx5_esw_bridge_offloads {
 
 	struct mlx5_flow_table *ingress_ft;
 	struct mlx5_flow_group *ingress_vlan_fg;
+	struct mlx5_flow_group *ingress_filter_fg;
 	struct mlx5_flow_group *ingress_mac_fg;
+
+	struct mlx5_flow_table *skip_ft;
 };
 
 struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index fc37ac9eab12..2cd7aea5d329 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2786,7 +2786,7 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering)
 		goto out_err;
 	}
 
-	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_BR_OFFLOAD, 2);
+	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_BR_OFFLOAD, 3);
 	if (IS_ERR(maj_prio)) {
 		err = PTR_ERR(maj_prio);
 		goto out_err;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [net-next 16/16] net/mlx5: Bridge, add tracepoints
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (14 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 15/16] net/mlx5: Bridge, filter tagged packets that didn't match tagged fg Saeed Mahameed
@ 2021-06-10  2:58 ` Saeed Mahameed
  2021-06-10 22:36 ` [pull request][net-next 00/16] mlx5 updates 2021-06-09 Jakub Kicinski
  16 siblings, 0 replies; 19+ messages in thread
From: Saeed Mahameed @ 2021-06-10  2:58 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Jianbo Liu, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Move private bridge structures to dedicated headers that is accessible to
bridge tracepoint header. Implemented following tracepoints:

- Initialize FDB entry.
- Refresh FDB entry.
- Cleanup FDB entry.
- Create VLAN.
- Cleanup VLAN.
- Attach port to bridge.
- Detach port from bridge.

Usage example:

># cd /sys/kernel/debug/tracing
># echo mlx5:mlx5_esw_bridge_fdb_entry_init >> set_event
># cat trace
...
   kworker/u20:1-96      [001] ....   231.892503: mlx5_esw_bridge_fdb_entry_init: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=3 flags=0 lastuse=4294895695

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst |  56 +++++++++
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |  75 ++++--------
 .../mellanox/mlx5/core/esw/bridge_priv.h      |  53 ++++++++
 .../mlx5/core/esw/diag/bridge_tracepoint.h    | 113 ++++++++++++++++++
 4 files changed, 247 insertions(+), 50 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index 058882dca17b..ef8cb62e82a1 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -600,3 +600,59 @@ tc and eswitch offloads tracepoints:
     $ cat /sys/kernel/debug/tracing/trace
     ...
     kworker/u48:7-2221  [009] ...1  1475.387435: mlx5e_rep_neigh_update: netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_connected=1
+
+Bridge offloads tracepoints:
+
+- mlx5_esw_bridge_fdb_entry_init: trace bridge FDB entry offloaded to mlx5::
+
+    $ echo mlx5:mlx5_esw_bridge_fdb_entry_init >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    kworker/u20:9-2217    [003] ...1   318.582243: mlx5_esw_bridge_fdb_entry_init: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=0 flags=0 used=0
+
+- mlx5_esw_bridge_fdb_entry_cleanup: trace bridge FDB entry deleted from mlx5::
+
+    $ echo mlx5:mlx5_esw_bridge_fdb_entry_cleanup >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    ip-2581    [005] ...1   318.629871: mlx5_esw_bridge_fdb_entry_cleanup: net_device=enp8s0f0_1 addr=e4:fd:05:08:00:03 vid=0 flags=0 used=16
+
+- mlx5_esw_bridge_fdb_entry_refresh: trace bridge FDB entry offload refreshed in
+  mlx5::
+
+    $ echo mlx5:mlx5_esw_bridge_fdb_entry_refresh >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    kworker/u20:8-3849    [003] ...1       466716: mlx5_esw_bridge_fdb_entry_refresh: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=3 flags=0 used=0
+
+- mlx5_esw_bridge_vlan_create: trace bridge VLAN object add on mlx5
+  representor::
+
+    $ echo mlx5:mlx5_esw_bridge_vlan_create >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    ip-2560    [007] ...1   318.460258: mlx5_esw_bridge_vlan_create: vid=1 flags=6
+
+- mlx5_esw_bridge_vlan_cleanup: trace bridge VLAN object delete from mlx5
+  representor::
+
+    $ echo mlx5:mlx5_esw_bridge_vlan_cleanup >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    bridge-2582    [007] ...1   318.653496: mlx5_esw_bridge_vlan_cleanup: vid=2 flags=8
+
+- mlx5_esw_bridge_vport_init: trace mlx5 vport assigned with bridge upper
+  device::
+
+    $ echo mlx5:mlx5_esw_bridge_vport_init >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    ip-2560    [007] ...1   318.458915: mlx5_esw_bridge_vport_init: vport_num=1
+
+- mlx5_esw_bridge_vport_cleanup: trace mlx5 vport removed from bridge upper
+  device::
+
+    $ echo mlx5:mlx5_esw_bridge_vport_cleanup >> set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    ip-5387    [000] ...1       573713: mlx5_esw_bridge_vport_cleanup: vport_num=1
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index b6345619cbfe..a6e1d4f78268 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -1,17 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /* Copyright (c) 2021 Mellanox Technologies. */
 
-#include <linux/netdevice.h>
 #include <linux/list.h>
-#include <linux/rhashtable.h>
-#include <linux/xarray.h>
-#include <linux/if_bridge.h>
-#include <linux/if_vlan.h>
-#include <linux/if_ether.h>
+#include <linux/notifier.h>
+#include <net/netevent.h>
 #include <net/switchdev.h>
 #include "bridge.h"
 #include "eswitch.h"
-#include "fs_core.h"
+#include "bridge_priv.h"
+#define CREATE_TRACE_POINTS
+#include "diag/bridge_tracepoint.h"
 
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE 64000
 #define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM 0
@@ -39,31 +37,6 @@ enum {
 	MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE,
 };
 
-struct mlx5_esw_bridge_fdb_key {
-	unsigned char addr[ETH_ALEN];
-	u16 vid;
-};
-
-enum {
-	MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER = BIT(0),
-};
-
-struct mlx5_esw_bridge_fdb_entry {
-	struct mlx5_esw_bridge_fdb_key key;
-	struct rhash_head ht_node;
-	struct net_device *dev;
-	struct list_head list;
-	struct list_head vlan_list;
-	u16 vport_num;
-	u16 flags;
-
-	struct mlx5_flow_handle *ingress_handle;
-	struct mlx5_fc *ingress_counter;
-	unsigned long lastuse;
-	struct mlx5_flow_handle *egress_handle;
-	struct mlx5_flow_handle *filter_handle;
-};
-
 static const struct rhashtable_params fdb_ht_params = {
 	.key_offset = offsetof(struct mlx5_esw_bridge_fdb_entry, key),
 	.key_len = sizeof(struct mlx5_esw_bridge_fdb_key),
@@ -71,19 +44,6 @@ static const struct rhashtable_params fdb_ht_params = {
 	.automatic_shrinking = true,
 };
 
-struct mlx5_esw_bridge_vlan {
-	u16 vid;
-	u16 flags;
-	struct list_head fdb_list;
-	struct mlx5_pkt_reformat *pkt_reformat_push;
-	struct mlx5_pkt_reformat *pkt_reformat_pop;
-};
-
-struct mlx5_esw_bridge_port {
-	u16 vport_num;
-	struct xarray vlans;
-};
-
 enum {
 	MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG = BIT(0),
 };
@@ -697,10 +657,23 @@ static void mlx5_esw_bridge_port_erase(struct mlx5_esw_bridge_port *port,
 	xa_erase(&bridge->vports, port->vport_num);
 }
 
+static void mlx5_esw_bridge_fdb_entry_refresh(unsigned long lastuse,
+					      struct mlx5_esw_bridge_fdb_entry *entry)
+{
+	trace_mlx5_esw_bridge_fdb_entry_refresh(entry);
+
+	entry->lastuse = lastuse;
+	mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+					   entry->key.vid,
+					   SWITCHDEV_FDB_ADD_TO_BRIDGE);
+}
+
 static void
 mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
 				  struct mlx5_esw_bridge *bridge)
 {
+	trace_mlx5_esw_bridge_fdb_entry_cleanup(entry);
+
 	rhashtable_remove_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
 	mlx5_del_flow_rules(entry->egress_handle);
 	if (entry->filter_handle)
@@ -842,6 +815,7 @@ mlx5_esw_bridge_vlan_create(u16 vid, u16 flags, struct mlx5_esw_bridge_port *por
 	if (err)
 		goto err_xa_insert;
 
+	trace_mlx5_esw_bridge_vlan_create(vlan);
 	return vlan;
 
 err_xa_insert:
@@ -884,6 +858,7 @@ static void mlx5_esw_bridge_vlan_cleanup(struct mlx5_esw_bridge_port *port,
 					 struct mlx5_esw_bridge_vlan *vlan,
 					 struct mlx5_esw_bridge *bridge)
 {
+	trace_mlx5_esw_bridge_vlan_cleanup(vlan);
 	mlx5_esw_bridge_vlan_flush(vlan, bridge);
 	mlx5_esw_bridge_vlan_erase(port, vlan);
 	kvfree(vlan);
@@ -1007,6 +982,8 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsi
 	else
 		INIT_LIST_HEAD(&entry->vlan_list);
 	list_add(&entry->list, &bridge->fdb_list);
+
+	trace_mlx5_esw_bridge_fdb_entry_init(entry);
 	return entry;
 
 err_ht_init:
@@ -1078,6 +1055,7 @@ static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloa
 			 vport->vport, err);
 		goto err_port_insert;
 	}
+	trace_mlx5_esw_bridge_vport_init(port);
 
 	vport->bridge = bridge;
 	return 0;
@@ -1106,6 +1084,7 @@ static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_off
 		return -EINVAL;
 	}
 
+	trace_mlx5_esw_bridge_vport_cleanup(port);
 	mlx5_esw_bridge_port_vlans_flush(port, bridge);
 	mlx5_esw_bridge_port_erase(port, bridge);
 	kvfree(port);
@@ -1266,11 +1245,7 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 				continue;
 
 			if (time_after(lastuse, entry->lastuse)) {
-				entry->lastuse = lastuse;
-				/* refresh existing bridge entry */
-				mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
-								   entry->key.vid,
-								   SWITCHDEV_FDB_ADD_TO_BRIDGE);
+				mlx5_esw_bridge_fdb_entry_refresh(lastuse, entry);
 			} else if (time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
 				mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
 								   entry->key.vid,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
new file mode 100644
index 000000000000..d9ab2e8bc2cb
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#ifndef _MLX5_ESW_BRIDGE_PRIVATE_
+#define _MLX5_ESW_BRIDGE_PRIVATE_
+
+#include <linux/netdevice.h>
+#include <linux/if_bridge.h>
+#include <linux/if_vlan.h>
+#include <linux/if_ether.h>
+#include <linux/rhashtable.h>
+#include <linux/xarray.h>
+#include "fs_core.h"
+
+struct mlx5_esw_bridge_fdb_key {
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+};
+
+enum {
+	MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER = BIT(0),
+};
+
+struct mlx5_esw_bridge_fdb_entry {
+	struct mlx5_esw_bridge_fdb_key key;
+	struct rhash_head ht_node;
+	struct net_device *dev;
+	struct list_head list;
+	struct list_head vlan_list;
+	u16 vport_num;
+	u16 flags;
+
+	struct mlx5_flow_handle *ingress_handle;
+	struct mlx5_fc *ingress_counter;
+	unsigned long lastuse;
+	struct mlx5_flow_handle *egress_handle;
+	struct mlx5_flow_handle *filter_handle;
+};
+
+struct mlx5_esw_bridge_vlan {
+	u16 vid;
+	u16 flags;
+	struct list_head fdb_list;
+	struct mlx5_pkt_reformat *pkt_reformat_push;
+	struct mlx5_pkt_reformat *pkt_reformat_pop;
+};
+
+struct mlx5_esw_bridge_port {
+	u16 vport_num;
+	struct xarray vlans;
+};
+
+#endif /* _MLX5_ESW_BRIDGE_PRIVATE_ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
new file mode 100644
index 000000000000..227964b7d3b9
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mlx5
+
+#if !defined(_MLX5_ESW_BRIDGE_TRACEPOINT_) || defined(TRACE_HEADER_MULTI_READ)
+#define _MLX5_ESW_BRIDGE_TRACEPOINT_
+
+#include <linux/tracepoint.h>
+#include "../bridge_priv.h"
+
+DECLARE_EVENT_CLASS(mlx5_esw_bridge_fdb_template,
+		    TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+		    TP_ARGS(fdb),
+		    TP_STRUCT__entry(
+			    __array(char, dev_name, IFNAMSIZ)
+			    __array(unsigned char, addr, ETH_ALEN)
+			    __field(u16, vid)
+			    __field(u16, flags)
+			    __field(unsigned int, used)
+			    ),
+		    TP_fast_assign(
+			    strncpy(__entry->dev_name,
+				    netdev_name(fdb->dev),
+				    IFNAMSIZ);
+			    memcpy(__entry->addr, fdb->key.addr, ETH_ALEN);
+			    __entry->vid = fdb->key.vid;
+			    __entry->flags = fdb->flags;
+			    __entry->used = jiffies_to_msecs(jiffies - fdb->lastuse)
+			    ),
+		    TP_printk("net_device=%s addr=%pM vid=%hu flags=%hx used=%u",
+			      __entry->dev_name,
+			      __entry->addr,
+			      __entry->vid,
+			      __entry->flags,
+			      __entry->used / 1000)
+	);
+
+DEFINE_EVENT(mlx5_esw_bridge_fdb_template,
+	     mlx5_esw_bridge_fdb_entry_init,
+	     TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_fdb_template,
+	     mlx5_esw_bridge_fdb_entry_refresh,
+	     TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_fdb_template,
+	     mlx5_esw_bridge_fdb_entry_cleanup,
+	     TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+	);
+
+DECLARE_EVENT_CLASS(mlx5_esw_bridge_vlan_template,
+		    TP_PROTO(const struct mlx5_esw_bridge_vlan *vlan),
+		    TP_ARGS(vlan),
+		    TP_STRUCT__entry(
+			    __field(u16, vid)
+			    __field(u16, flags)
+			    ),
+		    TP_fast_assign(
+			    __entry->vid = vlan->vid;
+			    __entry->flags = vlan->flags;
+			    ),
+		    TP_printk("vid=%hu flags=%hx",
+			      __entry->vid,
+			      __entry->flags)
+	);
+
+DEFINE_EVENT(mlx5_esw_bridge_vlan_template,
+	     mlx5_esw_bridge_vlan_create,
+	     TP_PROTO(const struct mlx5_esw_bridge_vlan *vlan),
+	     TP_ARGS(vlan)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_vlan_template,
+	     mlx5_esw_bridge_vlan_cleanup,
+	     TP_PROTO(const struct mlx5_esw_bridge_vlan *vlan),
+	     TP_ARGS(vlan)
+	);
+
+DECLARE_EVENT_CLASS(mlx5_esw_bridge_port_template,
+		    TP_PROTO(const struct mlx5_esw_bridge_port *port),
+		    TP_ARGS(port),
+		    TP_STRUCT__entry(
+			    __field(u16, vport_num)
+			    ),
+		    TP_fast_assign(
+			    __entry->vport_num = port->vport_num;
+			    ),
+		    TP_printk("vport_num=%hu", __entry->vport_num)
+	);
+
+DEFINE_EVENT(mlx5_esw_bridge_port_template,
+	     mlx5_esw_bridge_vport_init,
+	     TP_PROTO(const struct mlx5_esw_bridge_port *port),
+	     TP_ARGS(port)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_port_template,
+	     mlx5_esw_bridge_vport_cleanup,
+	     TP_PROTO(const struct mlx5_esw_bridge_port *port),
+	     TP_ARGS(port)
+	);
+
+#endif
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH esw/diag
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE bridge_tracepoint
+#include <trace/define_trace.h>
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove
  2021-06-10  2:57 ` [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove Saeed Mahameed
@ 2021-06-10 20:40   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 19+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-06-10 20:40 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: davem, kuba, netdev, kliteyn, vladbu, jianbol, saeedm

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Wed,  9 Jun 2021 19:57:59 -0700 you wrote:
> From: Yevgeny Kliteynik <kliteyn@nvidia.com>
> 
> Add support for HCA caps 2 that contains capabilities for the new
> insert/remove header actions.
> 
> Added the required definitions for supporting the new reformat type:
> added packet reformat parameters, reformat anchors and definitions
> to allow copy/set into the inserted EMD (Embedded MetaData) tag.
> 
> [...]

Here is the summary with links:
  - [net-next,01/16] net/mlx5: mlx5_ifc support for header insert/remove
    https://git.kernel.org/netdev/net-next/c/67133eaa93e8
  - [net-next,02/16] net/mlx5: DR, Split reformat state to Encap and Decap
    https://git.kernel.org/netdev/net-next/c/28de41a4ba7b
  - [net-next,03/16] net/mlx5: DR, Allow encap action for RX for supporting devices
    https://git.kernel.org/netdev/net-next/c/d7418b4efa3b
  - [net-next,04/16] net/mlx5: Added new parameters to reformat context
    https://git.kernel.org/netdev/net-next/c/3f3f05ab8872
  - [net-next,05/16] net/mlx5: DR, Added support for INSERT_HEADER reformat type
    https://git.kernel.org/netdev/net-next/c/7ea9b39852fa
  - [net-next,06/16] net/mlx5: DR, Support EMD tag in modify header for STEv1
    https://git.kernel.org/netdev/net-next/c/ded6a877a3fc
  - [net-next,07/16] net/mlx5: Create TC-miss priority and table
    https://git.kernel.org/netdev/net-next/c/ec3be8873df3
  - [net-next,08/16] net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers
    https://git.kernel.org/netdev/net-next/c/0781015288ec
  - [net-next,09/16] net/mlx5: Bridge, add offload infrastructure
    https://git.kernel.org/netdev/net-next/c/19e9bfa044f3
  - [net-next,10/16] net/mlx5: Bridge, handle FDB events
    https://git.kernel.org/netdev/net-next/c/7cd6a54a8285
  - [net-next,11/16] net/mlx5: Bridge, dynamic entry ageing
    https://git.kernel.org/netdev/net-next/c/c636a0f0f3f0
  - [net-next,12/16] net/mlx5: Bridge, implement infrastructure for vlans
    https://git.kernel.org/netdev/net-next/c/d75b9e804858
  - [net-next,13/16] net/mlx5: Bridge, match FDB entry vlan tag
    https://git.kernel.org/netdev/net-next/c/ffc89ee5e5e8
  - [net-next,14/16] net/mlx5: Bridge, support pvid and untagged vlan configurations
    https://git.kernel.org/netdev/net-next/c/36e55079e549
  - [net-next,15/16] net/mlx5: Bridge, filter tagged packets that didn't match tagged fg
    https://git.kernel.org/netdev/net-next/c/cc2987c44be5
  - [net-next,16/16] net/mlx5: Bridge, add tracepoints
    https://git.kernel.org/netdev/net-next/c/9724fd5d9c2a

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [pull request][net-next 00/16] mlx5 updates 2021-06-09
  2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
                   ` (15 preceding siblings ...)
  2021-06-10  2:58 ` [net-next 16/16] net/mlx5: Bridge, add tracepoints Saeed Mahameed
@ 2021-06-10 22:36 ` Jakub Kicinski
  16 siblings, 0 replies; 19+ messages in thread
From: Jakub Kicinski @ 2021-06-10 22:36 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: David S. Miller, netdev, Saeed Mahameed

On Wed,  9 Jun 2021 19:57:58 -0700 Saeed Mahameed wrote:
> This series introduces insert/remove header support
> for sw steering and switchdev bridge offloads support.

Great to see bridge offload for NICs! :)

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-06-10 22:36 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-10  2:57 [pull request][net-next 00/16] mlx5 updates 2021-06-09 Saeed Mahameed
2021-06-10  2:57 ` [net-next 01/16] net/mlx5: mlx5_ifc support for header insert/remove Saeed Mahameed
2021-06-10 20:40   ` patchwork-bot+netdevbpf
2021-06-10  2:58 ` [net-next 02/16] net/mlx5: DR, Split reformat state to Encap and Decap Saeed Mahameed
2021-06-10  2:58 ` [net-next 03/16] net/mlx5: DR, Allow encap action for RX for supporting devices Saeed Mahameed
2021-06-10  2:58 ` [net-next 04/16] net/mlx5: Added new parameters to reformat context Saeed Mahameed
2021-06-10  2:58 ` [net-next 05/16] net/mlx5: DR, Added support for INSERT_HEADER reformat type Saeed Mahameed
2021-06-10  2:58 ` [net-next 06/16] net/mlx5: DR, Support EMD tag in modify header for STEv1 Saeed Mahameed
2021-06-10  2:58 ` [net-next 07/16] net/mlx5: Create TC-miss priority and table Saeed Mahameed
2021-06-10  2:58 ` [net-next 08/16] net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers Saeed Mahameed
2021-06-10  2:58 ` [net-next 09/16] net/mlx5: Bridge, add offload infrastructure Saeed Mahameed
2021-06-10  2:58 ` [net-next 10/16] net/mlx5: Bridge, handle FDB events Saeed Mahameed
2021-06-10  2:58 ` [net-next 11/16] net/mlx5: Bridge, dynamic entry ageing Saeed Mahameed
2021-06-10  2:58 ` [net-next 12/16] net/mlx5: Bridge, implement infrastructure for vlans Saeed Mahameed
2021-06-10  2:58 ` [net-next 13/16] net/mlx5: Bridge, match FDB entry vlan tag Saeed Mahameed
2021-06-10  2:58 ` [net-next 14/16] net/mlx5: Bridge, support pvid and untagged vlan configurations Saeed Mahameed
2021-06-10  2:58 ` [net-next 15/16] net/mlx5: Bridge, filter tagged packets that didn't match tagged fg Saeed Mahameed
2021-06-10  2:58 ` [net-next 16/16] net/mlx5: Bridge, add tracepoints Saeed Mahameed
2021-06-10 22:36 ` [pull request][net-next 00/16] mlx5 updates 2021-06-09 Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.