netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21
@ 2019-02-22 21:44 Saeed Mahameed
  2019-02-22 21:44 ` [net-next 1/9] net/mlx5: Use read-modify-write when changing PCMR register values Saeed Mahameed
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

This series adds misc updates to mlx5 driver.
For more information please see tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.

---
The following changes since commit 2fce40a592daa92f1565152cb68d4c4ca7e97d52:

  Merge branch 'dsa-vlan' (2019-02-22 11:53:32 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-02-21

for you to fetch changes up to 4b89251de024fb85329e4cbd8fbea551ae6c665c:

  net/mlx5: Support ndo bridge_setlink and getlink (2019-02-22 13:38:25 -0800)

----------------------------------------------------------------
mlx5-updates-2019-02-21

This series adds some misc updates to mlx5 driver,

1) Eli Britstein, Introduces tunnel entropy control from PCMR register
and fixes GRE key by controlling port tunnel entropy calculation.

2) Eran Ben Elisha, provides some mlx5 fixes to the latest tx devlink health
reporting mechanism.

3) Huy Nguyen, Added the support for ndo bridge_setlink to allow
   VEPA/VEB E-Switch legacy mode configurations.

----------------------------------------------------------------
Eli Britstein (3):
      net/mlx5: Use read-modify-write when changing PCMR register values
      net/mlx5: Introduce tunnel entropy control in PCMR register
      net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation

Eran Ben Elisha (4):
      net/mlx5e: Fix warn print in case of TX reporter creation failure
      net/mlx5e: Re-add support for TX timeout when TX reporter is not valid
      net/mlx5e: Fix return status of TX reporter timeout recover
      net/mlx5e: Fix mlx5e_tx_reporter_create return value

Huy Nguyen (2):
      net/mlx5: E-Switch, Add support for VEPA in legacy mode.
      net/mlx5: Support ndo bridge_setlink and getlink

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 .../ethernet/mellanox/mlx5/core/en/reporter_tx.c   |  22 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  65 +++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  18 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.h   |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    |   5 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  | 223 +++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |   5 +
 .../net/ethernet/mellanox/mlx5/core/lib/port_tun.c | 205 +++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/lib/port_tun.h |  24 +++
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |   9 +-
 include/linux/mlx5/mlx5_ifc.h                      |  12 +-
 include/linux/mlx5/port.h                          |   2 +
 13 files changed, 553 insertions(+), 42 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [net-next 1/9] net/mlx5: Use read-modify-write when changing PCMR register values
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 2/9] net/mlx5: Introduce tunnel entropy control in PCMR register Saeed Mahameed
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eli Britstein, Oz Shlomo, Roi Dayan, Saeed Mahameed

From: Eli Britstein <elibr@mellanox.com>

Currently changing a PCMR field is done by setting the field in a
zeroed buffer, zeroing other unrelated fields.
Fix this behaviour by modifying only the required field after first
reading the current register values, as a pre-step towards using more
fields in PCMR register.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index b81542820528..55b30d21a73a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -785,7 +785,11 @@ static int mlx5_set_ports_check(struct mlx5_core_dev *mdev, u32 *in, int inlen)
 int mlx5_set_port_fcs(struct mlx5_core_dev *mdev, u8 enable)
 {
 	u32 in[MLX5_ST_SZ_DW(pcmr_reg)] = {0};
+	int err;
 
+	err = mlx5_query_ports_check(mdev, in, sizeof(in));
+	if (err)
+		return err;
 	MLX5_SET(pcmr_reg, in, local_port, 1);
 	MLX5_SET(pcmr_reg, in, fcs_chk, enable);
 	return mlx5_set_ports_check(mdev, in, sizeof(in));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 2/9] net/mlx5: Introduce tunnel entropy control in PCMR register
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
  2019-02-22 21:44 ` [net-next 1/9] net/mlx5: Use read-modify-write when changing PCMR register values Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 3/9] net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation Saeed Mahameed
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eli Britstein, Oz Shlomo, Roi Dayan, Saeed Mahameed

From: Eli Britstein <elibr@mellanox.com>

When using the device packet encapsulation offload, the device
calculates an entropy value, representing the inner packet headers. The
entropy field is placed inside the outer packet headers. For UDP-type
encapsulations, the entropy is placed in the source port field of the
UDP header. For GRE-type encapsulations, the entropy is placed in the 8
LSB of the key field in the GRE header. If the device does not recognize
the encapsulation type, the entropy is not placed in the packet.

Entropy setting can be controlled using PCMR register. if encapsulation
offload is not used force_entropy_cap should be set to 0x0. Entropy
setting is enabled/disabled using entropy_calc, and could be
additionally enabled/disabled for GRE encapsulation by entropy_gre_calc.

As a pre-step to automatically control the tunnel entropy, introduce
the entropy fields in the PCMR register with no functional change.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 include/linux/mlx5/mlx5_ifc.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index b7bb774b57b0..3b83288749c6 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -8473,9 +8473,17 @@ struct mlx5_ifc_pamp_reg_bits {
 struct mlx5_ifc_pcmr_reg_bits {
 	u8         reserved_at_0[0x8];
 	u8         local_port[0x8];
-	u8         reserved_at_10[0x2e];
+	u8         reserved_at_10[0x10];
+	u8         entropy_force_cap[0x1];
+	u8         entropy_calc_cap[0x1];
+	u8         entropy_gre_calc_cap[0x1];
+	u8         reserved_at_23[0x1b];
 	u8         fcs_cap[0x1];
-	u8         reserved_at_3f[0x1f];
+	u8         reserved_at_3f[0x1];
+	u8         entropy_force[0x1];
+	u8         entropy_calc[0x1];
+	u8         entropy_gre_calc[0x1];
+	u8         reserved_at_43[0x1b];
 	u8         fcs_chk[0x1];
 	u8         reserved_at_5f[0x1];
 };
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 3/9] net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
  2019-02-22 21:44 ` [net-next 1/9] net/mlx5: Use read-modify-write when changing PCMR register values Saeed Mahameed
  2019-02-22 21:44 ` [net-next 2/9] net/mlx5: Introduce tunnel entropy control in PCMR register Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 4/9] net/mlx5e: Fix warn print in case of TX reporter creation failure Saeed Mahameed
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Eli Britstein, Oz Shlomo, Roi Dayan, Saeed Mahameed

From: Eli Britstein <elibr@mellanox.com>

Flow entropy is calculated on the inner packet headers and used for
flow distribution in processing, routing etc. For GRE-type
encapsulations the entropy value is placed in the eight LSB of the key
field in the GRE header as defined in NVGRE RFC 7637. For UDP based
encapsulations the entropy value is placed in the source port of the
UDP header.
The hardware may support entropy calculation specifically for GRE and
for all tunneling protocols. With commit df2ef3bff193 ("net/mlx5e: Add
GRE protocol offloading") GRE is offloaded, but the hardware is
configured by default to calculate flow entropy so packets transmitted
on the wire have a wrong key. To support UDP based tunnels (i.e VXLAN),
GRE (i.e. no flow entropy) and NVGRE (i.e. with flow entropy) the
hardware behaviour must be controlled by the driver.

Ensure port entropy calculation is enabled for offloaded VXLAN tunnels
and disable port entropy calculation in the presence of offloaded GRE
tunnels by monitoring the presence of entropy enabling tunnels (i.e
VXLAN) and entropy disabing tunnels (i.e GRE).

Fixes: df2ef3bff193 ("net/mlx5e: Add GRE protocol offloading")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  18 +-
 .../net/ethernet/mellanox/mlx5/core/en_rep.h  |   3 +
 .../mellanox/mlx5/core/lib/port_tun.c         | 205 ++++++++++++++++++
 .../mellanox/mlx5/core/lib/port_tun.h         |  24 ++
 .../net/ethernet/mellanox/mlx5/core/port.c    |   5 +-
 include/linux/mlx5/port.h                     |   2 +
 7 files changed, 254 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 82d636baaa4e..17f1a8b28c0a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -30,7 +30,7 @@ mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
 mlx5_core-$(CONFIG_MLX5_EN_ARFS)     += en_arfs.o
 mlx5_core-$(CONFIG_MLX5_EN_RXNFC)    += en_fs_ethtool.o
 mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) += en_dcbnl.o en/port_buffer.o
-mlx5_core-$(CONFIG_MLX5_ESWITCH)     += en_rep.o en_tc.o en/tc_tun.o
+mlx5_core-$(CONFIG_MLX5_ESWITCH)     += en_rep.o en_tc.o en/tc_tun.o lib/port_tun.o
 
 #
 # Core extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 287d48e5b073..4d033e01f6ab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -44,6 +44,7 @@
 #include "en_tc.h"
 #include "en/tc_tun.h"
 #include "fs_core.h"
+#include "lib/port_tun.h"
 
 #define MLX5E_REP_PARAMS_DEF_LOG_SQ_SIZE \
         max(0x7, MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE)
@@ -1044,14 +1045,23 @@ static void mlx5e_rep_neigh_entry_destroy(struct mlx5e_priv *priv,
 int mlx5e_rep_encap_entry_attach(struct mlx5e_priv *priv,
 				 struct mlx5e_encap_entry *e)
 {
+	struct mlx5e_rep_priv *rpriv = priv->ppriv;
+	struct mlx5_rep_uplink_priv *uplink_priv = &rpriv->uplink_priv;
+	struct mlx5_tun_entropy *tun_entropy = &uplink_priv->tun_entropy;
 	struct mlx5e_neigh_hash_entry *nhe;
 	int err;
 
+	err = mlx5_tun_entropy_refcount_inc(tun_entropy, e->reformat_type);
+	if (err)
+		return err;
 	nhe = mlx5e_rep_neigh_entry_lookup(priv, &e->m_neigh);
 	if (!nhe) {
 		err = mlx5e_rep_neigh_entry_create(priv, e, &nhe);
-		if (err)
+		if (err) {
+			mlx5_tun_entropy_refcount_dec(tun_entropy,
+						      e->reformat_type);
 			return err;
+		}
 	}
 	list_add(&e->encap_list, &nhe->encap_list);
 	return 0;
@@ -1060,6 +1070,9 @@ int mlx5e_rep_encap_entry_attach(struct mlx5e_priv *priv,
 void mlx5e_rep_encap_entry_detach(struct mlx5e_priv *priv,
 				  struct mlx5e_encap_entry *e)
 {
+	struct mlx5e_rep_priv *rpriv = priv->ppriv;
+	struct mlx5_rep_uplink_priv *uplink_priv = &rpriv->uplink_priv;
+	struct mlx5_tun_entropy *tun_entropy = &uplink_priv->tun_entropy;
 	struct mlx5e_neigh_hash_entry *nhe;
 
 	list_del(&e->encap_list);
@@ -1067,6 +1080,7 @@ void mlx5e_rep_encap_entry_detach(struct mlx5e_priv *priv,
 
 	if (list_empty(&nhe->encap_list))
 		mlx5e_rep_neigh_entry_destroy(priv, nhe);
+	mlx5_tun_entropy_refcount_dec(tun_entropy, e->reformat_type);
 }
 
 static int mlx5e_vf_rep_open(struct net_device *dev)
@@ -1564,6 +1578,8 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
 		if (err)
 			goto destroy_tises;
 
+		mlx5_init_port_tun_entropy(&uplink_priv->tun_entropy, priv->mdev);
+
 		/* init indirect block notifications */
 		INIT_LIST_HEAD(&uplink_priv->tc_indr_block_priv_list);
 		uplink_priv->netdevice_nb.notifier_call = mlx5e_nic_rep_netdevice_event;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
index 36eafc877e6b..1aa3e110bb97 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
@@ -37,6 +37,7 @@
 #include <linux/rhashtable.h>
 #include "eswitch.h"
 #include "en.h"
+#include "lib/port_tun.h"
 
 #ifdef CONFIG_MLX5_ESWITCH
 struct mlx5e_neigh_update_table {
@@ -71,6 +72,8 @@ struct mlx5_rep_uplink_priv {
 	 */
 	struct list_head	    tc_indr_block_priv_list;
 	struct notifier_block	    netdevice_nb;
+
+	struct mlx5_tun_entropy tun_entropy;
 };
 
 struct mlx5e_rep_priv {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
new file mode 100644
index 000000000000..40f4a19b1ce1
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
@@ -0,0 +1,205 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2019 Mellanox Technologies. */
+
+#include <linux/module.h>
+#include <linux/mlx5/driver.h>
+#include <linux/mlx5/port.h>
+#include <linux/mlx5/cmd.h>
+#include "mlx5_core.h"
+#include "lib/port_tun.h"
+
+struct mlx5_port_tun_entropy_flags {
+	bool force_supported, force_enabled;
+	bool calc_supported, calc_enabled;
+	bool gre_calc_supported, gre_calc_enabled;
+};
+
+static void mlx5_query_port_tun_entropy(struct mlx5_core_dev *mdev,
+					struct mlx5_port_tun_entropy_flags *entropy_flags)
+{
+	u32 out[MLX5_ST_SZ_DW(pcmr_reg)];
+	/* Default values for FW which do not support MLX5_REG_PCMR */
+	entropy_flags->force_supported = false;
+	entropy_flags->calc_supported = false;
+	entropy_flags->gre_calc_supported = false;
+	entropy_flags->force_enabled = false;
+	entropy_flags->calc_enabled = true;
+	entropy_flags->gre_calc_enabled = true;
+
+	if (!MLX5_CAP_GEN(mdev, ports_check))
+		return;
+
+	if (mlx5_query_ports_check(mdev, out, sizeof(out)))
+		return;
+
+	entropy_flags->force_supported = !!(MLX5_GET(pcmr_reg, out, entropy_force_cap));
+	entropy_flags->calc_supported = !!(MLX5_GET(pcmr_reg, out, entropy_calc_cap));
+	entropy_flags->gre_calc_supported = !!(MLX5_GET(pcmr_reg, out, entropy_gre_calc_cap));
+	entropy_flags->force_enabled = !!(MLX5_GET(pcmr_reg, out, entropy_force));
+	entropy_flags->calc_enabled = !!(MLX5_GET(pcmr_reg, out, entropy_calc));
+	entropy_flags->gre_calc_enabled = !!(MLX5_GET(pcmr_reg, out, entropy_gre_calc));
+}
+
+static int mlx5_set_port_tun_entropy_calc(struct mlx5_core_dev *mdev, u8 enable,
+					  u8 force)
+{
+	u32 in[MLX5_ST_SZ_DW(pcmr_reg)] = {0};
+	int err;
+
+	err = mlx5_query_ports_check(mdev, in, sizeof(in));
+	if (err)
+		return err;
+	MLX5_SET(pcmr_reg, in, local_port, 1);
+	MLX5_SET(pcmr_reg, in, entropy_force, force);
+	MLX5_SET(pcmr_reg, in, entropy_calc, enable);
+	return mlx5_set_ports_check(mdev, in, sizeof(in));
+}
+
+static int mlx5_set_port_gre_tun_entropy_calc(struct mlx5_core_dev *mdev,
+					      u8 enable, u8 force)
+{
+	u32 in[MLX5_ST_SZ_DW(pcmr_reg)] = {0};
+	int err;
+
+	err = mlx5_query_ports_check(mdev, in, sizeof(in));
+	if (err)
+		return err;
+	MLX5_SET(pcmr_reg, in, local_port, 1);
+	MLX5_SET(pcmr_reg, in, entropy_force, force);
+	MLX5_SET(pcmr_reg, in, entropy_gre_calc, enable);
+	return mlx5_set_ports_check(mdev, in, sizeof(in));
+}
+
+void mlx5_init_port_tun_entropy(struct mlx5_tun_entropy *tun_entropy,
+				struct mlx5_core_dev *mdev)
+{
+	struct mlx5_port_tun_entropy_flags entropy_flags;
+
+	tun_entropy->mdev = mdev;
+	mutex_init(&tun_entropy->lock);
+	mlx5_query_port_tun_entropy(mdev, &entropy_flags);
+	tun_entropy->num_enabling_entries = 0;
+	tun_entropy->num_disabling_entries = 0;
+	tun_entropy->enabled = entropy_flags.calc_enabled;
+	tun_entropy->enabled =
+		(entropy_flags.calc_supported) ?
+		entropy_flags.calc_enabled : true;
+}
+
+static int mlx5_set_entropy(struct mlx5_tun_entropy *tun_entropy,
+			    int reformat_type, bool enable)
+{
+	struct mlx5_port_tun_entropy_flags entropy_flags;
+	int err;
+
+	mlx5_query_port_tun_entropy(tun_entropy->mdev, &entropy_flags);
+	/* Tunnel entropy calculation may be controlled either on port basis
+	 * for all tunneling protocols or specifically for GRE protocol.
+	 * Prioritize GRE protocol control (if capable) over global port
+	 * configuration.
+	 */
+	if (entropy_flags.gre_calc_supported &&
+	    reformat_type == MLX5_REFORMAT_TYPE_L2_TO_NVGRE) {
+		/* Other applications may change the global FW entropy
+		 * calculations settings. Check that the current entropy value
+		 * is the negative of the updated value.
+		 */
+		if (entropy_flags.force_enabled &&
+		    enable == entropy_flags.gre_calc_enabled) {
+			mlx5_core_warn(tun_entropy->mdev,
+				       "Unexpected GRE entropy calc setting - expected %d",
+				       !entropy_flags.gre_calc_enabled);
+			return -EOPNOTSUPP;
+		}
+		err = mlx5_set_port_gre_tun_entropy_calc(tun_entropy->mdev, enable,
+							 entropy_flags.force_supported);
+		if (err)
+			return err;
+		/* if we turn on the entropy we don't need to force it anymore */
+		if (entropy_flags.force_supported && enable) {
+			err = mlx5_set_port_gre_tun_entropy_calc(tun_entropy->mdev, 1, 0);
+			if (err)
+				return err;
+		}
+	} else if (entropy_flags.calc_supported) {
+		/* Other applications may change the global FW entropy
+		 * calculations settings. Check that the current entropy value
+		 * is the negative of the updated value.
+		 */
+		if (entropy_flags.force_enabled &&
+		    enable == entropy_flags.calc_enabled) {
+			mlx5_core_warn(tun_entropy->mdev,
+				       "Unexpected entropy calc setting - expected %d",
+				       !entropy_flags.calc_enabled);
+			return -EOPNOTSUPP;
+		}
+		/* GRE requires disabling entropy calculation. if there are
+		 * enabling entries (i.e VXLAN) we cannot turn it off for them,
+		 * thus fail.
+		 */
+		if (tun_entropy->num_enabling_entries)
+			return -EOPNOTSUPP;
+		err = mlx5_set_port_tun_entropy_calc(tun_entropy->mdev, enable,
+						     entropy_flags.force_supported);
+		if (err)
+			return err;
+		tun_entropy->enabled = enable;
+		/* if we turn on the entropy we don't need to force it anymore */
+		if (entropy_flags.force_supported && enable) {
+			err = mlx5_set_port_tun_entropy_calc(tun_entropy->mdev, 1, 0);
+			if (err)
+				return err;
+		}
+	}
+
+	return 0;
+}
+
+/* the function manages the refcount for enabling/disabling tunnel types.
+ * the return value indicates if the inc is successful or not, depending on
+ * entropy capabilities and configuration.
+ */
+int mlx5_tun_entropy_refcount_inc(struct mlx5_tun_entropy *tun_entropy,
+				  int reformat_type)
+{
+	/* the default is error for unknown (non VXLAN/GRE tunnel types) */
+	int err = -EOPNOTSUPP;
+
+	mutex_lock(&tun_entropy->lock);
+	if (reformat_type == MLX5_REFORMAT_TYPE_L2_TO_VXLAN &&
+	    tun_entropy->enabled) {
+		/* in case entropy calculation is enabled for all tunneling
+		 * types, it is ok for VXLAN, so approve.
+		 * otherwise keep the error default.
+		 */
+		tun_entropy->num_enabling_entries++;
+		err = 0;
+	} else if (reformat_type == MLX5_REFORMAT_TYPE_L2_TO_NVGRE) {
+		/* turn off the entropy only for the first GRE rule.
+		 * for the next rules the entropy was already disabled
+		 * successfully.
+		 */
+		if (tun_entropy->num_disabling_entries == 0)
+			err = mlx5_set_entropy(tun_entropy, reformat_type, 0);
+		else
+			err = 0;
+		if (!err)
+			tun_entropy->num_disabling_entries++;
+	}
+	mutex_unlock(&tun_entropy->lock);
+
+	return err;
+}
+
+void mlx5_tun_entropy_refcount_dec(struct mlx5_tun_entropy *tun_entropy,
+				   int reformat_type)
+{
+	mutex_lock(&tun_entropy->lock);
+	if (reformat_type == MLX5_REFORMAT_TYPE_L2_TO_VXLAN)
+		tun_entropy->num_enabling_entries--;
+	else if (reformat_type == MLX5_REFORMAT_TYPE_L2_TO_NVGRE &&
+		 --tun_entropy->num_disabling_entries == 0)
+		mlx5_set_entropy(tun_entropy, reformat_type, 1);
+	mutex_unlock(&tun_entropy->lock);
+}
+
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.h
new file mode 100644
index 000000000000..54c42a88705e
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2019 Mellanox Technologies. */
+
+#ifndef __MLX5_PORT_TUN_H__
+#define __MLX5_PORT_TUN_H__
+
+#include <linux/mlx5/driver.h>
+
+struct mlx5_tun_entropy {
+	struct mlx5_core_dev *mdev;
+	u32 num_enabling_entries;
+	u32 num_disabling_entries;
+	u8  enabled;
+	struct mutex lock;	/* lock the entropy fields */
+};
+
+void mlx5_init_port_tun_entropy(struct mlx5_tun_entropy *tun_entropy,
+				struct mlx5_core_dev *mdev);
+int mlx5_tun_entropy_refcount_inc(struct mlx5_tun_entropy *tun_entropy,
+				  int reformat_type);
+void mlx5_tun_entropy_refcount_dec(struct mlx5_tun_entropy *tun_entropy,
+				   int reformat_type);
+
+#endif /* __MLX5_PORT_TUN_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 55b30d21a73a..21b7f05b16a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -764,8 +764,7 @@ int mlx5_query_port_wol(struct mlx5_core_dev *mdev, u8 *wol_mode)
 }
 EXPORT_SYMBOL_GPL(mlx5_query_port_wol);
 
-static int mlx5_query_ports_check(struct mlx5_core_dev *mdev, u32 *out,
-				  int outlen)
+int mlx5_query_ports_check(struct mlx5_core_dev *mdev, u32 *out, int outlen)
 {
 	u32 in[MLX5_ST_SZ_DW(pcmr_reg)] = {0};
 
@@ -774,7 +773,7 @@ static int mlx5_query_ports_check(struct mlx5_core_dev *mdev, u32 *out,
 				    outlen, MLX5_REG_PCMR, 0, 0);
 }
 
-static int mlx5_set_ports_check(struct mlx5_core_dev *mdev, u32 *in, int inlen)
+int mlx5_set_ports_check(struct mlx5_core_dev *mdev, u32 *in, int inlen)
 {
 	u32 out[MLX5_ST_SZ_DW(pcmr_reg)];
 
diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h
index 814fa194663b..64e78394fc9c 100644
--- a/include/linux/mlx5/port.h
+++ b/include/linux/mlx5/port.h
@@ -182,6 +182,8 @@ int mlx5_query_port_ets_rate_limit(struct mlx5_core_dev *mdev,
 int mlx5_set_port_wol(struct mlx5_core_dev *mdev, u8 wol_mode);
 int mlx5_query_port_wol(struct mlx5_core_dev *mdev, u8 *wol_mode);
 
+int mlx5_query_ports_check(struct mlx5_core_dev *mdev, u32 *out, int outlen);
+int mlx5_set_ports_check(struct mlx5_core_dev *mdev, u32 *in, int inlen);
 int mlx5_set_port_fcs(struct mlx5_core_dev *mdev, u8 enable);
 void mlx5_query_port_fcs(struct mlx5_core_dev *mdev, bool *supported,
 			 bool *enabled);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 4/9] net/mlx5e: Fix warn print in case of TX reporter creation failure
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 3/9] net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 5/9] net/mlx5e: Re-add support for TX timeout when TX reporter is not valid Saeed Mahameed
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

Print warning message in case of TX reporter creation failure, only if the
return value is ERR_PTR type. NULL pointer return indicates that
NET_DEVLINK is not set, and the warning print can be skipped.

Fixes: de8650a82071 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 0aebfb377cf0..e05b8fce8dbb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -281,7 +281,7 @@ int mlx5e_tx_reporter_create(struct mlx5e_priv *priv)
 		devlink_health_reporter_create(devlink, &mlx5_tx_reporter_ops,
 					       MLX5_REPORTER_TX_GRACEFUL_PERIOD,
 					       true, priv);
-	if (IS_ERR_OR_NULL(priv->tx_reporter))
+	if (IS_ERR(priv->tx_reporter))
 		netdev_warn(priv->netdev,
 			    "Failed to create tx reporter, err = %ld\n",
 			    PTR_ERR(priv->tx_reporter));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 5/9] net/mlx5e: Re-add support for TX timeout when TX reporter is not valid
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 4/9] net/mlx5e: Fix warn print in case of TX reporter creation failure Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 6/9] net/mlx5e: Fix return status of TX reporter timeout recover Saeed Mahameed
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

When TX reporter was introduced, it took ownership over TX timeout error
handling. this introduced a regression in case TX reporter is not valid
(NET_DEVLINK is not set, or devlink_health_reporter_create failure).

Fix mlx5e_tx_reporter_timeout function so it can be called at all times.

In addition, remove a warning print that indicates that a TX timeout won't
be handled in case of no valid TX reporter.

Fixes: 7d91126b1aea ("net/mlx5e: Add tx timeout support for mlx5e tx reporter")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/en/reporter_tx.c | 16 ++++++++++++++--
 .../net/ethernet/mellanox/mlx5/core/en_main.c    |  6 ------
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c  |  5 ++---
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index e05b8fce8dbb..201ea73e3021 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -113,6 +113,18 @@ static int mlx5e_tx_reporter_err_cqe_recover(struct mlx5e_txqsq *sq)
 	return 0;
 }
 
+static int mlx5_tx_health_report(struct devlink_health_reporter *tx_reporter,
+				 char *err_str,
+				 struct mlx5e_tx_err_ctx *err_ctx)
+{
+	if (IS_ERR_OR_NULL(tx_reporter)) {
+		netdev_err(err_ctx->sq->channel->netdev, err_str);
+		return err_ctx->recover(err_ctx->sq);
+	}
+
+	return devlink_health_report(tx_reporter, err_str, err_ctx);
+}
+
 void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq)
 {
 	char err_str[MLX5E_TX_REPORTER_PER_SQ_MAX_LEN];
@@ -122,7 +134,7 @@ void mlx5e_tx_reporter_err_cqe(struct mlx5e_txqsq *sq)
 	err_ctx.recover  = mlx5e_tx_reporter_err_cqe_recover;
 	sprintf(err_str, "ERR CQE on SQ: 0x%x", sq->sqn);
 
-	devlink_health_report(sq->channel->priv->tx_reporter, err_str,
+	mlx5_tx_health_report(sq->channel->priv->tx_reporter, err_str,
 			      &err_ctx);
 }
 
@@ -160,7 +172,7 @@ int mlx5e_tx_reporter_timeout(struct mlx5e_txqsq *sq)
 		sq->channel->ix, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,
 		jiffies_to_usecs(jiffies - sq->txq->trans_start));
 
-	return devlink_health_report(sq->channel->priv->tx_reporter, err_str,
+	return mlx5_tx_health_report(sq->channel->priv->tx_reporter, err_str,
 				     &err_ctx);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 878b3467e459..4f971f3d8ce7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4173,12 +4173,6 @@ static void mlx5e_tx_timeout(struct net_device *dev)
 	struct mlx5e_priv *priv = netdev_priv(dev);
 
 	netdev_err(dev, "TX timeout detected\n");
-
-	if (IS_ERR_OR_NULL(priv->tx_reporter)) {
-		netdev_err_once(priv->netdev, "tx timeout will not be handled, no valid tx reporter\n");
-		return;
-	}
-
 	queue_work(priv->wq, &priv->tx_timeout_work);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index c1334a8ac8f3..d5fadbd6577e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -519,9 +519,8 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 					      &sq->state)) {
 				mlx5e_dump_error_cqe(sq,
 						     (struct mlx5_err_cqe *)cqe);
-				if (!IS_ERR_OR_NULL(cq->channel->priv->tx_reporter))
-					queue_work(cq->channel->priv->wq,
-						   &sq->recover_work);
+				queue_work(cq->channel->priv->wq,
+					   &sq->recover_work);
 			}
 			stats->cqe_err++;
 		}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 6/9] net/mlx5e: Fix return status of TX reporter timeout recover
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 5/9] net/mlx5e: Re-add support for TX timeout when TX reporter is not valid Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 7/9] net/mlx5e: Fix mlx5e_tx_reporter_create return value Saeed Mahameed
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Maria Pasechnik, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

In case of lost interrupt recover, we shall return success. Fix that.

Fixes: 7d91126b1aea ("net/mlx5e: Add tx timeout support for mlx5e tx reporter")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Maria Pasechnik <mariap@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 201ea73e3021..627392bead3b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -148,7 +148,7 @@ static int mlx5e_tx_reporter_timeout_recover(struct mlx5e_txqsq *sq)
 		   eq->core.eqn, eq->core.cons_index, eq->core.irqn);
 
 	eqe_count = mlx5_eq_poll_irq_disabled(eq);
-	ret = eqe_count ? true : false;
+	ret = eqe_count ? false : true;
 	if (!eqe_count) {
 		clear_bit(MLX5E_SQ_STATE_ENABLED, &sq->state);
 		return ret;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 7/9] net/mlx5e: Fix mlx5e_tx_reporter_create return value
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 6/9] net/mlx5e: Fix return status of TX reporter timeout recover Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 8/9] net/mlx5: E-Switch, Add support for VEPA in legacy mode Saeed Mahameed
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

If reporter is ERR_PTR or NULL, error code shall be returned. At all other
cases it shall return success. Fix that.

Fixes: de8650a82071 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 627392bead3b..75dad8797720 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -297,7 +297,7 @@ int mlx5e_tx_reporter_create(struct mlx5e_priv *priv)
 		netdev_warn(priv->netdev,
 			    "Failed to create tx reporter, err = %ld\n",
 			    PTR_ERR(priv->tx_reporter));
-	return PTR_ERR_OR_ZERO(priv->tx_reporter);
+	return IS_ERR_OR_NULL(priv->tx_reporter);
 }
 
 void mlx5e_tx_reporter_destroy(struct mlx5e_priv *priv)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 8/9] net/mlx5: E-Switch, Add support for VEPA in legacy mode.
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 7/9] net/mlx5e: Fix mlx5e_tx_reporter_create return value Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-22 21:44 ` [net-next 9/9] net/mlx5: Support ndo bridge_setlink and getlink Saeed Mahameed
  2019-02-23 21:56 ` [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 David Miller
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed

From: Huy Nguyen <huyn@mellanox.com>

In Virtual Ethernet Port Aggregator (VEPA) mode, the packet skips
the system internal virtual switch and forwards to external network
switch. In Mellanox HCA case, the virtual switch is the HCA's Eswitch.

To support this, an new FDB flow table are created with level 0 and
linked to the existing FDB flow table in legacy mode. By default,
VEPA is turned off and this FDB flow table is empty. When VEPA is
turned on, two rules are created. One rule to forward on uplink vport
traffic to the legacy FDB. The other rule forward all other traffic
to uplink vport.

Other design alternatives were not chosen as explained below:
1. Create a forward rule in ACL flow table (most efficient design).
This approach is the not chosen because firmware does not support
forward rule to uplink vport (0xffff) for ACL flow table.
2. Add additional source port criteria in all the FDB rules to make the
FDB rules to be received rules only. This approach is not chosen because
it is not efficient as there can many rules in the FDB and VEPA mode
cannot be controlled per vport.
3. Add a highest prioirty flow group in the existing legacy FDB Flow
Table instead of a new flow table. This approoach does not work because the
new flow group has the same match criteria as the promiscuous flow group
and mlx5_add_flow_rules does not allow specifying flow group.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/eswitch.c | 223 ++++++++++++++++--
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |   5 +
 2 files changed, 207 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 6cb9710f76d5..84200d93cf9d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -64,6 +64,9 @@ enum {
 	PROMISC_CHANGE = BIT(3),
 };
 
+static void esw_destroy_legacy_fdb_table(struct mlx5_eswitch *esw);
+static void esw_cleanup_vepa_rules(struct mlx5_eswitch *esw);
+
 /* Vport context events */
 #define SRIOV_VPORT_EVENTS (UC_ADDR_CHANGE | \
 			    MC_ADDR_CHANGE | \
@@ -268,6 +271,37 @@ esw_fdb_set_vport_promisc_rule(struct mlx5_eswitch *esw, u16 vport)
 	return __esw_fdb_set_vport_rule(esw, vport, true, mac_c, mac_v);
 }
 
+enum {
+	LEGACY_VEPA_PRIO = 0,
+	LEGACY_FDB_PRIO,
+};
+
+static int esw_create_legacy_vepa_table(struct mlx5_eswitch *esw)
+{
+	struct mlx5_core_dev *dev = esw->dev;
+	struct mlx5_flow_namespace *root_ns;
+	struct mlx5_flow_table *fdb;
+	int err;
+
+	root_ns = mlx5_get_fdb_sub_ns(dev, 0);
+	if (!root_ns) {
+		esw_warn(dev, "Failed to get FDB flow namespace\n");
+		return -EOPNOTSUPP;
+	}
+
+	/* num FTE 2, num FG 2 */
+	fdb = mlx5_create_auto_grouped_flow_table(root_ns, LEGACY_VEPA_PRIO,
+						  2, 2, 0, 0);
+	if (IS_ERR(fdb)) {
+		err = PTR_ERR(fdb);
+		esw_warn(dev, "Failed to create VEPA FDB err %d\n", err);
+		return err;
+	}
+	esw->fdb_table.legacy.vepa_fdb = fdb;
+
+	return 0;
+}
+
 static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw)
 {
 	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
@@ -296,8 +330,8 @@ static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw)
 		return -ENOMEM;
 
 	table_size = BIT(MLX5_CAP_ESW_FLOWTABLE_FDB(dev, log_max_ft_size));
-
 	ft_attr.max_fte = table_size;
+	ft_attr.prio = LEGACY_FDB_PRIO;
 	fdb = mlx5_create_flow_table(root_ns, &ft_attr);
 	if (IS_ERR(fdb)) {
 		err = PTR_ERR(fdb);
@@ -356,41 +390,65 @@ static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw)
 	esw->fdb_table.legacy.promisc_grp = g;
 
 out:
-	if (err) {
-		if (!IS_ERR_OR_NULL(esw->fdb_table.legacy.allmulti_grp)) {
-			mlx5_destroy_flow_group(esw->fdb_table.legacy.allmulti_grp);
-			esw->fdb_table.legacy.allmulti_grp = NULL;
-		}
-		if (!IS_ERR_OR_NULL(esw->fdb_table.legacy.addr_grp)) {
-			mlx5_destroy_flow_group(esw->fdb_table.legacy.addr_grp);
-			esw->fdb_table.legacy.addr_grp = NULL;
-		}
-		if (!IS_ERR_OR_NULL(esw->fdb_table.legacy.fdb)) {
-			mlx5_destroy_flow_table(esw->fdb_table.legacy.fdb);
-			esw->fdb_table.legacy.fdb = NULL;
-		}
-	}
+	if (err)
+		esw_destroy_legacy_fdb_table(esw);
 
 	kvfree(flow_group_in);
 	return err;
 }
 
+static void esw_destroy_legacy_vepa_table(struct mlx5_eswitch *esw)
+{
+	esw_debug(esw->dev, "Destroy VEPA Table\n");
+	if (!esw->fdb_table.legacy.vepa_fdb)
+		return;
+
+	mlx5_destroy_flow_table(esw->fdb_table.legacy.vepa_fdb);
+	esw->fdb_table.legacy.vepa_fdb = NULL;
+}
+
 static void esw_destroy_legacy_fdb_table(struct mlx5_eswitch *esw)
 {
+	esw_debug(esw->dev, "Destroy FDB Table\n");
 	if (!esw->fdb_table.legacy.fdb)
 		return;
 
-	esw_debug(esw->dev, "Destroy FDB Table\n");
-	mlx5_destroy_flow_group(esw->fdb_table.legacy.promisc_grp);
-	mlx5_destroy_flow_group(esw->fdb_table.legacy.allmulti_grp);
-	mlx5_destroy_flow_group(esw->fdb_table.legacy.addr_grp);
+	if (esw->fdb_table.legacy.promisc_grp)
+		mlx5_destroy_flow_group(esw->fdb_table.legacy.promisc_grp);
+	if (esw->fdb_table.legacy.allmulti_grp)
+		mlx5_destroy_flow_group(esw->fdb_table.legacy.allmulti_grp);
+	if (esw->fdb_table.legacy.addr_grp)
+		mlx5_destroy_flow_group(esw->fdb_table.legacy.addr_grp);
 	mlx5_destroy_flow_table(esw->fdb_table.legacy.fdb);
+
 	esw->fdb_table.legacy.fdb = NULL;
 	esw->fdb_table.legacy.addr_grp = NULL;
 	esw->fdb_table.legacy.allmulti_grp = NULL;
 	esw->fdb_table.legacy.promisc_grp = NULL;
 }
 
+static int esw_create_legacy_table(struct mlx5_eswitch *esw)
+{
+	int err;
+
+	err = esw_create_legacy_vepa_table(esw);
+	if (err)
+		return err;
+
+	err = esw_create_legacy_fdb_table(esw);
+	if (err)
+		esw_destroy_legacy_vepa_table(esw);
+
+	return err;
+}
+
+static void esw_destroy_legacy_table(struct mlx5_eswitch *esw)
+{
+	esw_cleanup_vepa_rules(esw);
+	esw_destroy_legacy_fdb_table(esw);
+	esw_destroy_legacy_vepa_table(esw);
+}
+
 /* E-Switch vport UC/MC lists management */
 typedef int (*vport_addr_action)(struct mlx5_eswitch *esw,
 				 struct vport_addr *vaddr);
@@ -1677,7 +1735,9 @@ int mlx5_eswitch_enable_sriov(struct mlx5_eswitch *esw, int nvfs, int mode)
 	mlx5_lag_update(esw->dev);
 
 	if (mode == SRIOV_LEGACY) {
-		err = esw_create_legacy_fdb_table(esw);
+		err = esw_create_legacy_table(esw);
+		if (err)
+			goto abort;
 	} else {
 		mlx5_reload_interface(esw->dev, MLX5_INTERFACE_PROTOCOL_ETH);
 		mlx5_reload_interface(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
@@ -1758,7 +1818,7 @@ void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw)
 	esw_destroy_tsar(esw);
 
 	if (esw->mode == SRIOV_LEGACY)
-		esw_destroy_legacy_fdb_table(esw);
+		esw_destroy_legacy_table(esw);
 	else if (esw->mode == SRIOV_OFFLOADS)
 		esw_offloads_cleanup(esw);
 
@@ -2041,6 +2101,127 @@ int mlx5_eswitch_set_vport_spoofchk(struct mlx5_eswitch *esw,
 	return err;
 }
 
+static void esw_cleanup_vepa_rules(struct mlx5_eswitch *esw)
+{
+	if (esw->fdb_table.legacy.vepa_uplink_rule)
+		mlx5_del_flow_rules(esw->fdb_table.legacy.vepa_uplink_rule);
+
+	if (esw->fdb_table.legacy.vepa_star_rule)
+		mlx5_del_flow_rules(esw->fdb_table.legacy.vepa_star_rule);
+
+	esw->fdb_table.legacy.vepa_uplink_rule = NULL;
+	esw->fdb_table.legacy.vepa_star_rule = NULL;
+}
+
+static int _mlx5_eswitch_set_vepa_locked(struct mlx5_eswitch *esw,
+					 u8 setting)
+{
+	struct mlx5_flow_destination dest = {};
+	struct mlx5_flow_act flow_act = {};
+	struct mlx5_flow_handle *flow_rule;
+	struct mlx5_flow_spec *spec;
+	int err = 0;
+	void *misc;
+
+	if (!setting) {
+		esw_cleanup_vepa_rules(esw);
+		return 0;
+	}
+
+	if (esw->fdb_table.legacy.vepa_uplink_rule)
+		return 0;
+
+	spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+	if (!spec)
+		return -ENOMEM;
+
+	/* Uplink rule forward uplink traffic to FDB */
+	misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters);
+	MLX5_SET(fte_match_set_misc, misc, source_port, MLX5_VPORT_UPLINK);
+
+	misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters);
+	MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_port);
+
+	spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS;
+	dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+	dest.ft = esw->fdb_table.legacy.fdb;
+	flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+	flow_rule = mlx5_add_flow_rules(esw->fdb_table.legacy.vepa_fdb, spec,
+					&flow_act, &dest, 1);
+	if (IS_ERR(flow_rule)) {
+		err = PTR_ERR(flow_rule);
+		goto out;
+	} else {
+		esw->fdb_table.legacy.vepa_uplink_rule = flow_rule;
+	}
+
+	/* Star rule to forward all traffic to uplink vport */
+	memset(spec, 0, sizeof(*spec));
+	dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
+	dest.vport.num = MLX5_VPORT_UPLINK;
+	flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+	flow_rule = mlx5_add_flow_rules(esw->fdb_table.legacy.vepa_fdb, spec,
+					&flow_act, &dest, 1);
+	if (IS_ERR(flow_rule)) {
+		err = PTR_ERR(flow_rule);
+		goto out;
+	} else {
+		esw->fdb_table.legacy.vepa_star_rule = flow_rule;
+	}
+
+out:
+	kvfree(spec);
+	if (err)
+		esw_cleanup_vepa_rules(esw);
+	return err;
+}
+
+int mlx5_eswitch_set_vepa(struct mlx5_eswitch *esw, u8 setting)
+{
+	int err = 0;
+
+	if (!esw)
+		return -EOPNOTSUPP;
+
+	if (!ESW_ALLOWED(esw))
+		return -EPERM;
+
+	mutex_lock(&esw->state_lock);
+	if (esw->mode != SRIOV_LEGACY) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	err = _mlx5_eswitch_set_vepa_locked(esw, setting);
+
+out:
+	mutex_unlock(&esw->state_lock);
+	return err;
+}
+
+int mlx5_eswitch_get_vepa(struct mlx5_eswitch *esw, u8 *setting)
+{
+	int err = 0;
+
+	if (!esw)
+		return -EOPNOTSUPP;
+
+	if (!ESW_ALLOWED(esw))
+		return -EPERM;
+
+	mutex_lock(&esw->state_lock);
+	if (esw->mode != SRIOV_LEGACY) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	*setting = esw->fdb_table.legacy.vepa_uplink_rule ? 1 : 0;
+
+out:
+	mutex_unlock(&esw->state_lock);
+	return err;
+}
+
 int mlx5_eswitch_set_vport_trust(struct mlx5_eswitch *esw,
 				 int vport, bool setting)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index af5581a57e56..5f82e637410b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -137,6 +137,9 @@ struct mlx5_eswitch_fdb {
 			struct mlx5_flow_group *addr_grp;
 			struct mlx5_flow_group *allmulti_grp;
 			struct mlx5_flow_group *promisc_grp;
+			struct mlx5_flow_table *vepa_fdb;
+			struct mlx5_flow_handle *vepa_uplink_rule;
+			struct mlx5_flow_handle *vepa_star_rule;
 		} legacy;
 
 		struct offloads_fdb {
@@ -242,6 +245,8 @@ int mlx5_eswitch_set_vport_trust(struct mlx5_eswitch *esw,
 				 int vport_num, bool setting);
 int mlx5_eswitch_set_vport_rate(struct mlx5_eswitch *esw, int vport,
 				u32 max_rate, u32 min_rate);
+int mlx5_eswitch_set_vepa(struct mlx5_eswitch *esw, u8 setting);
+int mlx5_eswitch_get_vepa(struct mlx5_eswitch *esw, u8 *setting);
 int mlx5_eswitch_get_vport_config(struct mlx5_eswitch *esw,
 				  int vport, struct ifla_vf_info *ivi);
 int mlx5_eswitch_get_vport_stats(struct mlx5_eswitch *esw,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [net-next 9/9] net/mlx5: Support ndo bridge_setlink and getlink
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 8/9] net/mlx5: E-Switch, Add support for VEPA in legacy mode Saeed Mahameed
@ 2019-02-22 21:44 ` Saeed Mahameed
  2019-02-23 21:56 ` [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 David Miller
  9 siblings, 0 replies; 11+ messages in thread
From: Saeed Mahameed @ 2019-02-22 21:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed

From: Huy Nguyen <huyn@mellanox.com>

Allow enabling VEPA mode on the HCA's port in legacy devlink mode.

Example:
bridge link set dev ens1f0 hwmode vepa
will turn on VEPA mode on the netdev ens1f0.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 59 +++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 4f971f3d8ce7..e5f74eb986b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -35,6 +35,7 @@
 #include <linux/mlx5/fs.h>
 #include <net/vxlan.h>
 #include <linux/bpf.h>
+#include <linux/if_bridge.h>
 #include <net/page_pool.h>
 #include "eswitch.h"
 #include "en.h"
@@ -4305,6 +4306,61 @@ static int mlx5e_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 	}
 }
 
+#ifdef CONFIG_MLX5_ESWITCH
+static int mlx5e_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
+				struct net_device *dev, u32 filter_mask,
+				int nlflags)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	u8 mode, setting;
+	int err;
+
+	err = mlx5_eswitch_get_vepa(mdev->priv.eswitch, &setting);
+	if (err)
+		return err;
+	mode = setting ? BRIDGE_MODE_VEPA : BRIDGE_MODE_VEB;
+	return ndo_dflt_bridge_getlink(skb, pid, seq, dev,
+				       mode,
+				       0, 0, nlflags, filter_mask, NULL);
+}
+
+static int mlx5e_bridge_setlink(struct net_device *dev, struct nlmsghdr *nlh,
+				u16 flags, struct netlink_ext_ack *extack)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	struct mlx5_core_dev *mdev = priv->mdev;
+	struct nlattr *attr, *br_spec;
+	u16 mode = BRIDGE_MODE_UNDEF;
+	u8 setting;
+	int rem;
+
+	br_spec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
+	if (!br_spec)
+		return -EINVAL;
+
+	nla_for_each_nested(attr, br_spec, rem) {
+		if (nla_type(attr) != IFLA_BRIDGE_MODE)
+			continue;
+
+		if (nla_len(attr) < sizeof(mode))
+			return -EINVAL;
+
+		mode = nla_get_u16(attr);
+		if (mode > BRIDGE_MODE_VEPA)
+			return -EINVAL;
+
+		break;
+	}
+
+	if (mode == BRIDGE_MODE_UNDEF)
+		return -EINVAL;
+
+	setting = (mode == BRIDGE_MODE_VEPA) ?  1 : 0;
+	return mlx5_eswitch_set_vepa(mdev->priv.eswitch, setting);
+}
+#endif
+
 const struct net_device_ops mlx5e_netdev_ops = {
 	.ndo_open                = mlx5e_open,
 	.ndo_stop                = mlx5e_close,
@@ -4331,6 +4387,9 @@ const struct net_device_ops mlx5e_netdev_ops = {
 	.ndo_rx_flow_steer	 = mlx5e_rx_flow_steer,
 #endif
 #ifdef CONFIG_MLX5_ESWITCH
+	.ndo_bridge_setlink      = mlx5e_bridge_setlink,
+	.ndo_bridge_getlink      = mlx5e_bridge_getlink,
+
 	/* SRIOV E-Switch NDOs */
 	.ndo_set_vf_mac          = mlx5e_set_vf_mac,
 	.ndo_set_vf_vlan         = mlx5e_set_vf_vlan,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21
  2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2019-02-22 21:44 ` [net-next 9/9] net/mlx5: Support ndo bridge_setlink and getlink Saeed Mahameed
@ 2019-02-23 21:56 ` David Miller
  9 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2019-02-23 21:56 UTC (permalink / raw)
  To: saeedm; +Cc: netdev

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Fri, 22 Feb 2019 13:44:17 -0800

> This series adds misc updates to mlx5 driver.
> For more information please see tag log below.
> 
> Please pull and let me know if there is any problem.

Pulled, thanks Saeed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-02-23 21:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-22 21:44 [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 Saeed Mahameed
2019-02-22 21:44 ` [net-next 1/9] net/mlx5: Use read-modify-write when changing PCMR register values Saeed Mahameed
2019-02-22 21:44 ` [net-next 2/9] net/mlx5: Introduce tunnel entropy control in PCMR register Saeed Mahameed
2019-02-22 21:44 ` [net-next 3/9] net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation Saeed Mahameed
2019-02-22 21:44 ` [net-next 4/9] net/mlx5e: Fix warn print in case of TX reporter creation failure Saeed Mahameed
2019-02-22 21:44 ` [net-next 5/9] net/mlx5e: Re-add support for TX timeout when TX reporter is not valid Saeed Mahameed
2019-02-22 21:44 ` [net-next 6/9] net/mlx5e: Fix return status of TX reporter timeout recover Saeed Mahameed
2019-02-22 21:44 ` [net-next 7/9] net/mlx5e: Fix mlx5e_tx_reporter_create return value Saeed Mahameed
2019-02-22 21:44 ` [net-next 8/9] net/mlx5: E-Switch, Add support for VEPA in legacy mode Saeed Mahameed
2019-02-22 21:44 ` [net-next 9/9] net/mlx5: Support ndo bridge_setlink and getlink Saeed Mahameed
2019-02-23 21:56 ` [pull request][net-next 0/9] Mellanox, mlx5 updates 2019-02-21 David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).