* [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19
@ 2017-12-19 22:24 Saeed Mahameed
  2017-12-19 22:24 ` [net 01/14] net/mlx5: FPGA, return -EINVAL if size is zero Saeed Mahameed
                   ` (14 more replies)
  0 siblings, 15 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

The following series includes some fixes for the mlx5 core and Ethernet
driver.

Please pull and let me know if there is any problem.

This series doesn't introduce any conflict with the ongoing mlx5 for-next
submission.

For -stable:

kernels >= v4.7.y
    ("net/mlx5e: Fix possible deadlock of VXLAN lock")
    ("net/mlx5e: Add refcount to VXLAN structure")
    ("net/mlx5e: Prevent possible races in VXLAN control flow")
    ("net/mlx5e: Fix features check of IPv6 traffic")

kernels >= v4.9.y
    ("net/mlx5: Fix error flow in CREATE_QP command")
    ("net/mlx5: Fix rate limit packet pacing naming and struct")

kernels >= v4.13.y
    ("net/mlx5: FPGA, return -EINVAL if size is zero")

kernels >= v4.14.y
    ("Revert "mlx5: move affinity hints assignments to generic code"")

All above patches apply and compile with no issues on corresponding -stable.

Thanks,
Saeed.

---

The following changes since commit d03a45572efa068fa64db211d6d45222660e76c5:

  ipv4: fib: Fix metrics match when deleting a route (2017-12-19 14:21:58 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2017-12-19

for you to fetch changes up to a2fba188fd5eadd6061bef4f2f2577a43231ebf3:

  net/mlx5: Stay in polling mode when command EQ destroy fails (2017-12-19 23:24:05 +0200)

----------------------------------------------------------------
mlx5-fixes-2017-12-19

Misc fixes for mlx5 core and mlx5 netdev driver.

----------------------------------------------------------------
Eran Ben Elisha (1):
      net/mlx5: Fix rate limit packet pacing naming and struct

Eugenia Emantayev (2):
      net/mlx5e: Fix defaulting RX ring size when not needed
      net/mlx5: Fix misspelling in the error message and comment

Gal Pressman (4):
      net/mlx5e: Fix features check of IPv6 traffic
      net/mlx5e: Fix possible deadlock of VXLAN lock
      net/mlx5e: Add refcount to VXLAN structure
      net/mlx5e: Prevent possible races in VXLAN control flow

Huy Nguyen (1):
      net/mlx5e: Fix ETS BW check

Kamal Heib (1):
      net/mlx5: FPGA, return -EINVAL if size is zero

Maor Gottlieb (1):
      net/mlx5: Fix steering memory leak

Moni Shoua (1):
      net/mlx5: Fix error flow in CREATE_QP command

Moshe Shemesh (2):
      net/mlx5: Cleanup IRQs in case of unload failure
      net/mlx5: Stay in polling mode when command EQ destroy fails

Saeed Mahameed (1):
      Revert "mlx5: move affinity hints assignments to generic code"

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c      |  4 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  9 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 10 ++-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   | 10 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 63 +++++++++---------
 drivers/net/ethernet/mellanox/mlx5/core/eq.c       | 20 +++---
 drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.c |  6 ++
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  | 16 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/health.c   |  2 +-
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c     | 75 ++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/qp.c       |  4 +-
 drivers/net/ethernet/mellanox/mlx5/core/rl.c       | 22 +++----
 drivers/net/ethernet/mellanox/mlx5/core/vxlan.c    | 64 ++++++++++--------
 drivers/net/ethernet/mellanox/mlx5/core/vxlan.h    |  1 +
 include/linux/mlx5/driver.h                        |  3 +-
 include/linux/mlx5/mlx5_ifc.h                      |  8 ++-
 17 files changed, 215 insertions(+), 104 deletions(-)

* [net 01/14] net/mlx5: FPGA, return -EINVAL if size is zero
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 02/14] Revert "mlx5: move affinity hints assignments to generic code" Saeed Mahameed
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Kamal Heib, Saeed Mahameed

From: Kamal Heib <kamalh@mellanox.com>

Currently, if a size of zero is passed to
mlx5_fpga_mem_{read|write}_i2c()
the "err" return value will not be initialized, which triggers gcc
warnings:

[..]/mlx5/core/fpga/sdk.c:87 mlx5_fpga_mem_read_i2c() error:
uninitialized symbol 'err'.
[..]/mlx5/core/fpga/sdk.c:115 mlx5_fpga_mem_write_i2c() error:
uninitialized symbol 'err'.

Fix that by returning -EINVAL early when size is zero.
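
A minimal userspace sketch of the warning, not the driver code: the
names chunked_read() and read_chunk() and the 8-byte chunk limit are
hypothetical, and the loop shape is an assumption based on the
description above. With a size of zero the loop body never runs, so
without the early check "err" would be returned uninitialized.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one I2C transfer; always succeeds here. */
static int read_chunk(size_t offset, size_t len)
{
	(void)offset;
	(void)len;
	return 0;
}

/*
 * Userspace model of a chunked transfer loop. Without the !size check,
 * a zero size would skip the loop and return err uninitialized.
 */
static int chunked_read(size_t size)
{
	size_t done = 0;
	int err;

	if (!size)
		return -EINVAL;

	while (done < size) {
		size_t len = size - done;

		if (len > 8)	/* arbitrary chunk limit for this sketch */
			len = 8;

		err = read_chunk(done, len);
		if (err)
			break;
		done += len;
	}

	return err;
}
```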

Fixes: a9956d35d199 ('net/mlx5: FPGA, Add SBU infrastructure')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.c
index 3c11d6e2160a..14962969c5ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.c
@@ -66,6 +66,9 @@ static int mlx5_fpga_mem_read_i2c(struct mlx5_fpga_device *fdev, size_t size,
 	u8 actual_size;
 	int err;
 
+	if (!size)
+		return -EINVAL;
+
 	if (!fdev->mdev)
 		return -ENOTCONN;
 
@@ -95,6 +98,9 @@ static int mlx5_fpga_mem_write_i2c(struct mlx5_fpga_device *fdev, size_t size,
 	u8 actual_size;
 	int err;
 
+	if (!size)
+		return -EINVAL;
+
 	if (!fdev->mdev)
 		return -ENOTCONN;
 
-- 
2.13.0

* [net 02/14] Revert "mlx5: move affinity hints assignments to generic code"
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
  2017-12-19 22:24 ` [net 01/14] net/mlx5: FPGA, return -EINVAL if size is zero Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:42   ` Jes Sorensen
  2017-12-25 13:53   ` Sagi Grimberg
  2017-12-19 22:24 ` [net 03/14] net/mlx5: Fix rate limit packet pacing naming and struct Saeed Mahameed
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Saeed Mahameed, Sagi Grimberg, Thomas Gleixner, Jes Sorensen

Before the offending commit, mlx5 core set the IRQ affinity itself.
The new generic code has some drawbacks, one of which is that the user
can no longer modify the IRQ affinity after the initial affinity values
are assigned.

The issue is still being discussed and a solution in the new generic code
is required; until then we need to revert this patch.

This fixes the following issue:
echo <new affinity> > /proc/irq/<x>/smp_affinity
fails with -EIO

This reverts commit a435393acafbf0ecff4deb3e3cb554b34f0d0664.
Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
it is used in mlx5_ib driver.

Fixes: a435393acafb ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jes Sorensen <jsorensen@fb.com>
Reported-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 45 +++++++-------
 drivers/net/ethernet/mellanox/mlx5/core/main.c    | 75 +++++++++++++++++++++--
 include/linux/mlx5/driver.h                       |  1 +
 4 files changed, 93 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index c0872b3284cb..43f9054830e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -590,6 +590,7 @@ struct mlx5e_channel {
 	struct mlx5_core_dev      *mdev;
 	struct hwtstamp_config    *tstamp;
 	int                        ix;
+	int                        cpu;
 };
 
 struct mlx5e_channels {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d2b057a3e512..cbec66bc82f1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -71,11 +71,6 @@ struct mlx5e_channel_param {
 	struct mlx5e_cq_param      icosq_cq;
 };
 
-static int mlx5e_get_node(struct mlx5e_priv *priv, int ix)
-{
-	return pci_irq_get_node(priv->mdev->pdev, MLX5_EQ_VEC_COMP_BASE + ix);
-}
-
 static bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
 {
 	return MLX5_CAP_GEN(mdev, striding_rq) &&
@@ -444,17 +439,16 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
 	int wq_sz = mlx5_wq_ll_get_size(&rq->wq);
 	int mtt_sz = mlx5e_get_wqe_mtt_sz();
 	int mtt_alloc = mtt_sz + MLX5_UMR_ALIGN - 1;
-	int node = mlx5e_get_node(c->priv, c->ix);
 	int i;
 
 	rq->mpwqe.info = kzalloc_node(wq_sz * sizeof(*rq->mpwqe.info),
-					GFP_KERNEL, node);
+				      GFP_KERNEL, cpu_to_node(c->cpu));
 	if (!rq->mpwqe.info)
 		goto err_out;
 
 	/* We allocate more than mtt_sz as we will align the pointer */
-	rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz,
-					GFP_KERNEL, node);
+	rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz, GFP_KERNEL,
+					cpu_to_node(c->cpu));
 	if (unlikely(!rq->mpwqe.mtt_no_align))
 		goto err_free_wqe_info;
 
@@ -562,7 +556,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	int err;
 	int i;
 
-	rqp->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
+	rqp->wq.db_numa_node = cpu_to_node(c->cpu);
 
 	err = mlx5_wq_ll_create(mdev, &rqp->wq, rqc_wq, &rq->wq,
 				&rq->wq_ctrl);
@@ -629,8 +623,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	default: /* MLX5_WQ_TYPE_LINKED_LIST */
 		rq->wqe.frag_info =
 			kzalloc_node(wq_sz * sizeof(*rq->wqe.frag_info),
-				     GFP_KERNEL,
-				     mlx5e_get_node(c->priv, c->ix));
+				     GFP_KERNEL, cpu_to_node(c->cpu));
 		if (!rq->wqe.frag_info) {
 			err = -ENOMEM;
 			goto err_rq_wq_destroy;
@@ -1000,13 +993,13 @@ static int mlx5e_alloc_xdpsq(struct mlx5e_channel *c,
 	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
 	sq->min_inline_mode = params->tx_min_inline_mode;
 
-	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
+	param->wq.db_numa_node = cpu_to_node(c->cpu);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
 	if (err)
 		return err;
 	sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
 
-	err = mlx5e_alloc_xdpsq_db(sq, mlx5e_get_node(c->priv, c->ix));
+	err = mlx5e_alloc_xdpsq_db(sq, cpu_to_node(c->cpu));
 	if (err)
 		goto err_sq_wq_destroy;
 
@@ -1053,13 +1046,13 @@ static int mlx5e_alloc_icosq(struct mlx5e_channel *c,
 	sq->channel   = c;
 	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
 
-	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
+	param->wq.db_numa_node = cpu_to_node(c->cpu);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
 	if (err)
 		return err;
 	sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
 
-	err = mlx5e_alloc_icosq_db(sq, mlx5e_get_node(c->priv, c->ix));
+	err = mlx5e_alloc_icosq_db(sq, cpu_to_node(c->cpu));
 	if (err)
 		goto err_sq_wq_destroy;
 
@@ -1126,13 +1119,13 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 	if (MLX5_IPSEC_DEV(c->priv->mdev))
 		set_bit(MLX5E_SQ_STATE_IPSEC, &sq->state);
 
-	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
+	param->wq.db_numa_node = cpu_to_node(c->cpu);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
 	if (err)
 		return err;
 	sq->wq.db    = &sq->wq.db[MLX5_SND_DBR];
 
-	err = mlx5e_alloc_txqsq_db(sq, mlx5e_get_node(c->priv, c->ix));
+	err = mlx5e_alloc_txqsq_db(sq, cpu_to_node(c->cpu));
 	if (err)
 		goto err_sq_wq_destroy;
 
@@ -1504,8 +1497,8 @@ static int mlx5e_alloc_cq(struct mlx5e_channel *c,
 	struct mlx5_core_dev *mdev = c->priv->mdev;
 	int err;
 
-	param->wq.buf_numa_node = mlx5e_get_node(c->priv, c->ix);
-	param->wq.db_numa_node  = mlx5e_get_node(c->priv, c->ix);
+	param->wq.buf_numa_node = cpu_to_node(c->cpu);
+	param->wq.db_numa_node  = cpu_to_node(c->cpu);
 	param->eq_ix   = c->ix;
 
 	err = mlx5e_alloc_cq_common(mdev, param, cq);
@@ -1604,6 +1597,11 @@ static void mlx5e_close_cq(struct mlx5e_cq *cq)
 	mlx5e_free_cq(cq);
 }
 
+static int mlx5e_get_cpu(struct mlx5e_priv *priv, int ix)
+{
+	return cpumask_first(priv->mdev->priv.irq_info[ix].mask);
+}
+
 static int mlx5e_open_tx_cqs(struct mlx5e_channel *c,
 			     struct mlx5e_params *params,
 			     struct mlx5e_channel_param *cparam)
@@ -1752,12 +1750,13 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 {
 	struct mlx5e_cq_moder icocq_moder = {0, 0};
 	struct net_device *netdev = priv->netdev;
+	int cpu = mlx5e_get_cpu(priv, ix);
 	struct mlx5e_channel *c;
 	unsigned int irq;
 	int err;
 	int eqn;
 
-	c = kzalloc_node(sizeof(*c), GFP_KERNEL, mlx5e_get_node(priv, ix));
+	c = kzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
 	if (!c)
 		return -ENOMEM;
 
@@ -1765,6 +1764,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 	c->mdev     = priv->mdev;
 	c->tstamp   = &priv->tstamp;
 	c->ix       = ix;
+	c->cpu      = cpu;
 	c->pdev     = &priv->mdev->pdev->dev;
 	c->netdev   = priv->netdev;
 	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.mkey.key);
@@ -1853,8 +1853,7 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
 	for (tc = 0; tc < c->num_tc; tc++)
 		mlx5e_activate_txqsq(&c->sq[tc]);
 	mlx5e_activate_rq(&c->rq);
-	netif_set_xps_queue(c->netdev,
-		mlx5_get_vector_affinity(c->priv->mdev, c->ix), c->ix);
+	netif_set_xps_queue(c->netdev, get_cpu_mask(c->cpu), c->ix);
 }
 
 static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 5f323442cc5a..8a89c7e8cd63 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -317,9 +317,6 @@ static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
 {
 	struct mlx5_priv *priv = &dev->priv;
 	struct mlx5_eq_table *table = &priv->eq_table;
-	struct irq_affinity irqdesc = {
-		.pre_vectors = MLX5_EQ_VEC_COMP_BASE,
-	};
 	int num_eqs = 1 << MLX5_CAP_GEN(dev, log_max_eq);
 	int nvec;
 
@@ -333,10 +330,9 @@ static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
 	if (!priv->irq_info)
 		goto err_free_msix;
 
-	nvec = pci_alloc_irq_vectors_affinity(dev->pdev,
+	nvec = pci_alloc_irq_vectors(dev->pdev,
 			MLX5_EQ_VEC_COMP_BASE + 1, nvec,
-			PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
-			&irqdesc);
+			PCI_IRQ_MSIX);
 	if (nvec < 0)
 		return nvec;
 
@@ -622,6 +618,63 @@ u64 mlx5_read_internal_timer(struct mlx5_core_dev *dev)
 	return (u64)timer_l | (u64)timer_h1 << 32;
 }
 
+static int mlx5_irq_set_affinity_hint(struct mlx5_core_dev *mdev, int i)
+{
+	struct mlx5_priv *priv  = &mdev->priv;
+	int irq = pci_irq_vector(mdev->pdev, MLX5_EQ_VEC_COMP_BASE + i);
+
+	if (!zalloc_cpumask_var(&priv->irq_info[i].mask, GFP_KERNEL)) {
+		mlx5_core_warn(mdev, "zalloc_cpumask_var failed");
+		return -ENOMEM;
+	}
+
+	cpumask_set_cpu(cpumask_local_spread(i, priv->numa_node),
+			priv->irq_info[i].mask);
+
+	if (IS_ENABLED(CONFIG_SMP) &&
+	    irq_set_affinity_hint(irq, priv->irq_info[i].mask))
+		mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x", irq);
+
+	return 0;
+}
+
+static void mlx5_irq_clear_affinity_hint(struct mlx5_core_dev *mdev, int i)
+{
+	struct mlx5_priv *priv  = &mdev->priv;
+	int irq = pci_irq_vector(mdev->pdev, MLX5_EQ_VEC_COMP_BASE + i);
+
+	irq_set_affinity_hint(irq, NULL);
+	free_cpumask_var(priv->irq_info[i].mask);
+}
+
+static int mlx5_irq_set_affinity_hints(struct mlx5_core_dev *mdev)
+{
+	int err;
+	int i;
+
+	for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++) {
+		err = mlx5_irq_set_affinity_hint(mdev, i);
+		if (err)
+			goto err_out;
+	}
+
+	return 0;
+
+err_out:
+	for (i--; i >= 0; i--)
+		mlx5_irq_clear_affinity_hint(mdev, i);
+
+	return err;
+}
+
+static void mlx5_irq_clear_affinity_hints(struct mlx5_core_dev *mdev)
+{
+	int i;
+
+	for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++)
+		mlx5_irq_clear_affinity_hint(mdev, i);
+}
+
 int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn,
 		    unsigned int *irqn)
 {
@@ -1097,6 +1150,12 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 		goto err_stop_eqs;
 	}
 
+	err = mlx5_irq_set_affinity_hints(dev);
+	if (err) {
+		dev_err(&pdev->dev, "Failed to alloc affinity hint cpumask\n");
+		goto err_affinity_hints;
+	}
+
 	err = mlx5_init_fs(dev);
 	if (err) {
 		dev_err(&pdev->dev, "Failed to init flow steering\n");
@@ -1154,6 +1213,9 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 	mlx5_cleanup_fs(dev);
 
 err_fs:
+	mlx5_irq_clear_affinity_hints(dev);
+
+err_affinity_hints:
 	free_comp_eqs(dev);
 
 err_stop_eqs:
@@ -1222,6 +1284,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 
 	mlx5_sriov_detach(dev);
 	mlx5_cleanup_fs(dev);
+	mlx5_irq_clear_affinity_hints(dev);
 	free_comp_eqs(dev);
 	mlx5_stop_eqs(dev);
 	mlx5_put_uars_page(dev, priv->uar);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index a886b51511ab..40a6f33c4cde 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -556,6 +556,7 @@ struct mlx5_core_sriov {
 };
 
 struct mlx5_irq_info {
+	cpumask_var_t mask;
 	char name[MLX5_MAX_IRQ_NAME];
 };
 
-- 
2.13.0

* [net 03/14] net/mlx5: Fix rate limit packet pacing naming and struct
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
  2017-12-19 22:24 ` [net 01/14] net/mlx5: FPGA, return -EINVAL if size is zero Saeed Mahameed
  2017-12-19 22:24 ` [net 02/14] Revert "mlx5: move affinity hints assignments to generic code" Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 04/14] net/mlx5e: Fix ETS BW check Saeed Mahameed
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

In mlx5_ifc, the struct size was incomplete, so the driver was sending
garbage after the last defined field. Fix it by adding a reserved field
to complete the struct size.

In addition, rename all set_rate_limit to set_pp_rate_limit to be
compliant with the Firmware <-> Driver definition.
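
A small userspace sketch of the size arithmetic, under stated
assumptions: each u8 array element stands for one bit (as the
reserved_at_<offset> names suggest), the "head" array is a hypothetical
placeholder for the fields below offset 0x60, and the ST_SZ_* macros
are local to this sketch.

```c
typedef unsigned char u8;

/* One array element models one bit, so sizeof() counts bits. */
#define ST_SZ_BITS(s)	(sizeof(struct s))
#define ST_SZ_DW(s)	(sizeof(struct s) / 32)

/* Layout before the fix: ends right after rate_limit. */
struct set_rate_limit_in_old {
	u8 head[0x60];		/* hypothetical placeholder, offsets 0x0-0x5f */
	u8 reserved_at_60[0x20];
	u8 rate_limit[0x20];
};

/* Layout after the fix: padded with a reserved tail. */
struct set_pp_rate_limit_in_new {
	u8 head[0x60];		/* hypothetical placeholder, offsets 0x0-0x5f */
	u8 reserved_at_60[0x20];
	u8 rate_limit[0x20];
	u8 reserved_at_a0[0x160];
};
```

In this model the old layout covers 0xa0 bits (5 dwords) and the new
one covers 0x200 bits (16 dwords).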

Fixes: 7486216b3a0b ("{net,IB}/mlx5: mlx5_ifc updates")
Fixes: 1466cc5b23d1 ("net/mlx5: Rate limit tables support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/rl.c  | 22 +++++++++++-----------
 include/linux/mlx5/mlx5_ifc.h                 |  8 +++++---
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 1fffdebbc9e8..e9a1fbcc4adf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -362,7 +362,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op,
 	case MLX5_CMD_OP_QUERY_VPORT_COUNTER:
 	case MLX5_CMD_OP_ALLOC_Q_COUNTER:
 	case MLX5_CMD_OP_QUERY_Q_COUNTER:
-	case MLX5_CMD_OP_SET_RATE_LIMIT:
+	case MLX5_CMD_OP_SET_PP_RATE_LIMIT:
 	case MLX5_CMD_OP_QUERY_RATE_LIMIT:
 	case MLX5_CMD_OP_CREATE_SCHEDULING_ELEMENT:
 	case MLX5_CMD_OP_QUERY_SCHEDULING_ELEMENT:
@@ -505,7 +505,7 @@ const char *mlx5_command_str(int command)
 	MLX5_COMMAND_STR_CASE(ALLOC_Q_COUNTER);
 	MLX5_COMMAND_STR_CASE(DEALLOC_Q_COUNTER);
 	MLX5_COMMAND_STR_CASE(QUERY_Q_COUNTER);
-	MLX5_COMMAND_STR_CASE(SET_RATE_LIMIT);
+	MLX5_COMMAND_STR_CASE(SET_PP_RATE_LIMIT);
 	MLX5_COMMAND_STR_CASE(QUERY_RATE_LIMIT);
 	MLX5_COMMAND_STR_CASE(CREATE_SCHEDULING_ELEMENT);
 	MLX5_COMMAND_STR_CASE(DESTROY_SCHEDULING_ELEMENT);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rl.c b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
index e651e4c02867..d3c33e9eea72 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/rl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
@@ -125,16 +125,16 @@ static struct mlx5_rl_entry *find_rl_entry(struct mlx5_rl_table *table,
 	return ret_entry;
 }
 
-static int mlx5_set_rate_limit_cmd(struct mlx5_core_dev *dev,
+static int mlx5_set_pp_rate_limit_cmd(struct mlx5_core_dev *dev,
 				   u32 rate, u16 index)
 {
-	u32 in[MLX5_ST_SZ_DW(set_rate_limit_in)]   = {0};
-	u32 out[MLX5_ST_SZ_DW(set_rate_limit_out)] = {0};
+	u32 in[MLX5_ST_SZ_DW(set_pp_rate_limit_in)]   = {0};
+	u32 out[MLX5_ST_SZ_DW(set_pp_rate_limit_out)] = {0};
 
-	MLX5_SET(set_rate_limit_in, in, opcode,
-		 MLX5_CMD_OP_SET_RATE_LIMIT);
-	MLX5_SET(set_rate_limit_in, in, rate_limit_index, index);
-	MLX5_SET(set_rate_limit_in, in, rate_limit, rate);
+	MLX5_SET(set_pp_rate_limit_in, in, opcode,
+		 MLX5_CMD_OP_SET_PP_RATE_LIMIT);
+	MLX5_SET(set_pp_rate_limit_in, in, rate_limit_index, index);
+	MLX5_SET(set_pp_rate_limit_in, in, rate_limit, rate);
 	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
 }
 
@@ -173,7 +173,7 @@ int mlx5_rl_add_rate(struct mlx5_core_dev *dev, u32 rate, u16 *index)
 		entry->refcount++;
 	} else {
 		/* new rate limit */
-		err = mlx5_set_rate_limit_cmd(dev, rate, entry->index);
+		err = mlx5_set_pp_rate_limit_cmd(dev, rate, entry->index);
 		if (err) {
 			mlx5_core_err(dev, "Failed configuring rate: %u (%d)\n",
 				      rate, err);
@@ -209,7 +209,7 @@ void mlx5_rl_remove_rate(struct mlx5_core_dev *dev, u32 rate)
 	entry->refcount--;
 	if (!entry->refcount) {
 		/* need to remove rate */
-		mlx5_set_rate_limit_cmd(dev, 0, entry->index);
+		mlx5_set_pp_rate_limit_cmd(dev, 0, entry->index);
 		entry->rate = 0;
 	}
 
@@ -262,8 +262,8 @@ void mlx5_cleanup_rl_table(struct mlx5_core_dev *dev)
 	/* Clear all configured rates */
 	for (i = 0; i < table->max_size; i++)
 		if (table->rl_entry[i].rate)
-			mlx5_set_rate_limit_cmd(dev, 0,
-						table->rl_entry[i].index);
+			mlx5_set_pp_rate_limit_cmd(dev, 0,
+						   table->rl_entry[i].index);
 
 	kfree(dev->priv.rl_table.rl_entry);
 }
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 38a7577a9ce7..d44ec5f41d4a 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -147,7 +147,7 @@ enum {
 	MLX5_CMD_OP_ALLOC_Q_COUNTER               = 0x771,
 	MLX5_CMD_OP_DEALLOC_Q_COUNTER             = 0x772,
 	MLX5_CMD_OP_QUERY_Q_COUNTER               = 0x773,
-	MLX5_CMD_OP_SET_RATE_LIMIT                = 0x780,
+	MLX5_CMD_OP_SET_PP_RATE_LIMIT             = 0x780,
 	MLX5_CMD_OP_QUERY_RATE_LIMIT              = 0x781,
 	MLX5_CMD_OP_CREATE_SCHEDULING_ELEMENT      = 0x782,
 	MLX5_CMD_OP_DESTROY_SCHEDULING_ELEMENT     = 0x783,
@@ -7239,7 +7239,7 @@ struct mlx5_ifc_add_vxlan_udp_dport_in_bits {
 	u8         vxlan_udp_port[0x10];
 };
 
-struct mlx5_ifc_set_rate_limit_out_bits {
+struct mlx5_ifc_set_pp_rate_limit_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
 
@@ -7248,7 +7248,7 @@ struct mlx5_ifc_set_rate_limit_out_bits {
 	u8         reserved_at_40[0x40];
 };
 
-struct mlx5_ifc_set_rate_limit_in_bits {
+struct mlx5_ifc_set_pp_rate_limit_in_bits {
 	u8         opcode[0x10];
 	u8         reserved_at_10[0x10];
 
@@ -7261,6 +7261,8 @@ struct mlx5_ifc_set_rate_limit_in_bits {
 	u8         reserved_at_60[0x20];
 
 	u8         rate_limit[0x20];
+
+	u8         reserved_at_a0[0x160];
 };
 
 struct mlx5_ifc_access_register_out_bits {
-- 
2.13.0

* [net 04/14] net/mlx5e: Fix ETS BW check
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2017-12-19 22:24 ` [net 03/14] net/mlx5: Fix rate limit packet pacing naming and struct Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-20  7:42   ` Or Gerlitz
  2017-12-19 22:24 ` [net 05/14] net/mlx5e: Fix features check of IPv6 traffic Saeed Mahameed
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Moshe Shemesh, Saeed Mahameed

From: Huy Nguyen <huyn@mellanox.com>

Fix a bug that allows the ETS BW sum to be 0% when an ETS TC type exists.
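
A userspace sketch that contrasts the old and new checks. The struct
ets_cfg type and the validate_bw_old()/validate_bw_new() helpers are
hypothetical, and MAX_TCS and TSA_ETS are local stand-ins for
IEEE_8021QAZ_MAX_TCS and IEEE_8021QAZ_TSA_ETS. It only models the
bandwidth-sum test, not the rest of the validation.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_TCS		8	/* stand-in for IEEE_8021QAZ_MAX_TCS */
#define TSA_STRICT	0
#define TSA_ETS		2	/* stand-in for IEEE_8021QAZ_TSA_ETS */

struct ets_cfg {
	unsigned char tc_tx_bw[MAX_TCS];
	unsigned char tc_tsa[MAX_TCS];
};

/* Old check: a sum of 0 passes even if some TC is of ETS type. */
static int validate_bw_old(const struct ets_cfg *ets)
{
	int bw_sum = 0;
	int i;

	for (i = 0; i < MAX_TCS; i++)
		if (ets->tc_tsa[i] == TSA_ETS)
			bw_sum += ets->tc_tx_bw[i];

	if (bw_sum != 0 && bw_sum != 100)
		return -EINVAL;
	return 0;
}

/* New check: once any ETS TC exists, the sum must be exactly 100. */
static int validate_bw_new(const struct ets_cfg *ets)
{
	bool have_ets_tc = false;
	int bw_sum = 0;
	int i;

	for (i = 0; i < MAX_TCS; i++) {
		if (ets->tc_tsa[i] == TSA_ETS) {
			have_ets_tc = true;
			bw_sum += ets->tc_tx_bw[i];
		}
	}

	if (have_ets_tc && bw_sum != 100)
		return -EINVAL;
	return 0;
}
```

A configuration with two ETS TCs at 0% each passes the old check but is
rejected by the new one; a configuration with no ETS TCs passes both.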

Fixes: 08fb1dacdd76 ('net/mlx5e: Support DCBNL IEEE ETS')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index c6d90b6dd80e..9bcf38f4123b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -274,6 +274,7 @@ int mlx5e_dcbnl_ieee_setets_core(struct mlx5e_priv *priv, struct ieee_ets *ets)
 static int mlx5e_dbcnl_validate_ets(struct net_device *netdev,
 				    struct ieee_ets *ets)
 {
+	bool have_ets_tc = false;
 	int bw_sum = 0;
 	int i;
 
@@ -288,11 +289,14 @@ static int mlx5e_dbcnl_validate_ets(struct net_device *netdev,
 	}
 
 	/* Validate Bandwidth Sum */
-	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
-		if (ets->tc_tsa[i] == IEEE_8021QAZ_TSA_ETS)
+	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+		if (ets->tc_tsa[i] == IEEE_8021QAZ_TSA_ETS) {
+			have_ets_tc = true;
 			bw_sum += ets->tc_tx_bw[i];
+		}
+	}
 
-	if (bw_sum != 0 && bw_sum != 100) {
+	if (have_ets_tc && bw_sum != 100) {
 		netdev_err(netdev,
 			   "Failed to validate ETS: BW sum is illegal\n");
 		return -EINVAL;
-- 
2.13.0

* [net 05/14] net/mlx5e: Fix features check of IPv6 traffic
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2017-12-19 22:24 ` [net 04/14] net/mlx5e: Fix ETS BW check Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-20  7:45   ` Or Gerlitz
  2017-12-19 22:24 ` [net 06/14] net/mlx5e: Fix defaulting RX ring size when not needed Saeed Mahameed
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Gal Pressman, Saeed Mahameed

From: Gal Pressman <galp@mellanox.com>

The assumption that the next header field contains the transport
protocol is wrong for IPv6 packets with extension headers.
Instead, we should look for the inner-most next header field in the buffer.
This will fix TSO offload for tunnels over IPv6 with extension headers.
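
A simplified userspace sketch of the difference, not the kernel's
ipv6_find_hdr(): first_nexthdr() and innermost_nexthdr() are
hypothetical helpers that operate on a raw byte buffer, and only the
hop-by-hop, routing, fragment and destination-options headers are
treated as extension headers here.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN		40
#define NEXTHDR_HOP		0
#define NEXTHDR_ROUTING		43
#define NEXTHDR_FRAGMENT	44
#define NEXTHDR_DEST		60

static int is_ext_hdr(uint8_t nh)
{
	return nh == NEXTHDR_HOP || nh == NEXTHDR_ROUTING ||
	       nh == NEXTHDR_FRAGMENT || nh == NEXTHDR_DEST;
}

/* What the old check read: the fixed header's next header field. */
static int first_nexthdr(const uint8_t *pkt, size_t len)
{
	if (len < IPV6_HDR_LEN)
		return -1;
	return pkt[6];
}

/* Follow the chain and return the last next header value, or -1. */
static int innermost_nexthdr(const uint8_t *pkt, size_t len)
{
	size_t off = IPV6_HDR_LEN;
	uint8_t nh;

	if (len < IPV6_HDR_LEN)
		return -1;
	nh = pkt[6];

	while (is_ext_hdr(nh)) {
		size_t hdrlen;

		if (len - off < 8)	/* every extension header is >= 8 bytes */
			return -1;

		/* Fragment header is fixed size; others carry a length byte. */
		hdrlen = (nh == NEXTHDR_FRAGMENT) ?
			 8 : ((size_t)pkt[off + 1] + 1) * 8;
		if (len - off < hdrlen)
			return -1;

		nh = pkt[off];
		off += hdrlen;
	}

	return nh;
}
```

For a packet with one hop-by-hop header in front of GRE (protocol 47),
the first helper yields 0 while the second yields 47.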

Performance testing: 19.25x improvement, cool!
Measuring bandwidth of 16 threads TCP traffic over IPv6 GRE tap.
CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
TSO: Enabled
Before: 4,926.24  Mbps
Now   : 94,827.91 Mbps

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index cbec66bc82f1..c535a44ab8ac 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3678,6 +3678,7 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
 						     struct sk_buff *skb,
 						     netdev_features_t features)
 {
+	unsigned int offset = 0;
 	struct udphdr *udph;
 	u8 proto;
 	u16 port;
@@ -3687,7 +3688,7 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
 		proto = ip_hdr(skb)->protocol;
 		break;
 	case htons(ETH_P_IPV6):
-		proto = ipv6_hdr(skb)->nexthdr;
+		proto = ipv6_find_hdr(skb, &offset, -1, NULL, NULL);
 		break;
 	default:
 		goto out;
-- 
2.13.0

* [net 06/14] net/mlx5e: Fix defaulting RX ring size when not needed
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2017-12-19 22:24 ` [net 05/14] net/mlx5e: Fix features check of IPv6 traffic Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 07/14] net/mlx5: Fix misspelling in the error message and comment Saeed Mahameed
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eugenia Emantayev, Saeed Mahameed

From: Eugenia Emantayev <eugenia@mellanox.com>

Fix a bug where turning the CQE compression mechanism on or off
resets the RX ring size to its default value when that is not
needed.

Fixes: 2fc4bfb7250d ("net/mlx5e: Dynamic RQ type infrastructure")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h          |  8 ++++++--
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c  | 10 ++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c     | 15 +++++++--------
 drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c |  2 +-
 4 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 43f9054830e5..543060c305a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -82,6 +82,9 @@
 	max_t(u32, MLX5_MPWRQ_MIN_LOG_STRIDE_SZ(mdev), req)
 #define MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev)       MLX5_MPWRQ_LOG_STRIDE_SZ(mdev, 6)
 #define MLX5_MPWRQ_CQE_CMPRS_LOG_STRIDE_SZ(mdev) MLX5_MPWRQ_LOG_STRIDE_SZ(mdev, 8)
+#define MLX5E_MPWQE_STRIDE_SZ(mdev, cqe_cmprs) \
+	(cqe_cmprs ? MLX5_MPWRQ_CQE_CMPRS_LOG_STRIDE_SZ(mdev) : \
+	MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev))
 
 #define MLX5_MPWRQ_LOG_WQE_SZ			18
 #define MLX5_MPWRQ_WQE_PAGE_ORDER  (MLX5_MPWRQ_LOG_WQE_SZ - PAGE_SHIFT > 0 ? \
@@ -936,8 +939,9 @@ void mlx5e_set_tx_cq_mode_params(struct mlx5e_params *params,
 				 u8 cq_period_mode);
 void mlx5e_set_rx_cq_mode_params(struct mlx5e_params *params,
 				 u8 cq_period_mode);
-void mlx5e_set_rq_type_params(struct mlx5_core_dev *mdev,
-			      struct mlx5e_params *params, u8 rq_type);
+void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
+			       struct mlx5e_params *params,
+			       u8 rq_type);
 
 static inline bool mlx5e_tunnel_inner_ft_supported(struct mlx5_core_dev *mdev)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 23425f028405..8f05efa5c829 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1523,8 +1523,10 @@ int mlx5e_modify_rx_cqe_compression_locked(struct mlx5e_priv *priv, bool new_val
 	new_channels.params = priv->channels.params;
 	MLX5E_SET_PFLAG(&new_channels.params, MLX5E_PFLAG_RX_CQE_COMPRESS, new_val);
 
-	mlx5e_set_rq_type_params(priv->mdev, &new_channels.params,
-				 new_channels.params.rq_wq_type);
+	new_channels.params.mpwqe_log_stride_sz =
+		MLX5E_MPWQE_STRIDE_SZ(priv->mdev, new_val);
+	new_channels.params.mpwqe_log_num_strides =
+		MLX5_MPWRQ_LOG_WQE_SZ - new_channels.params.mpwqe_log_stride_sz;
 
 	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
 		priv->channels.params = new_channels.params;
@@ -1536,6 +1538,10 @@ int mlx5e_modify_rx_cqe_compression_locked(struct mlx5e_priv *priv, bool new_val
 		return err;
 
 	mlx5e_switch_priv_channels(priv, &new_channels, NULL);
+	mlx5e_dbg(DRV, priv, "MLX5E: RxCqeCmprss was turned %s\n",
+		  MLX5E_GET_PFLAG(&priv->channels.params,
+				  MLX5E_PFLAG_RX_CQE_COMPRESS) ? "ON" : "OFF");
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index c535a44ab8ac..d9d8227f195f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -78,8 +78,8 @@ static bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
 		MLX5_CAP_ETH(mdev, reg_umr_sq);
 }
 
-void mlx5e_set_rq_type_params(struct mlx5_core_dev *mdev,
-			      struct mlx5e_params *params, u8 rq_type)
+void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
+			       struct mlx5e_params *params, u8 rq_type)
 {
 	params->rq_wq_type = rq_type;
 	params->lro_wqe_sz = MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
@@ -88,10 +88,8 @@ void mlx5e_set_rq_type_params(struct mlx5_core_dev *mdev,
 		params->log_rq_size = is_kdump_kernel() ?
 			MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE_MPW :
 			MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
-		params->mpwqe_log_stride_sz =
-			MLX5E_GET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS) ?
-			MLX5_MPWRQ_CQE_CMPRS_LOG_STRIDE_SZ(mdev) :
-			MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
+		params->mpwqe_log_stride_sz = MLX5E_MPWQE_STRIDE_SZ(mdev,
+			MLX5E_GET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS));
 		params->mpwqe_log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ -
 			params->mpwqe_log_stride_sz;
 		break;
@@ -115,13 +113,14 @@ void mlx5e_set_rq_type_params(struct mlx5_core_dev *mdev,
 		       MLX5E_GET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS));
 }
 
-static void mlx5e_set_rq_params(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
+static void mlx5e_set_rq_params(struct mlx5_core_dev *mdev,
+				struct mlx5e_params *params)
 {
 	u8 rq_type = mlx5e_check_fragmented_striding_rq_cap(mdev) &&
 		    !params->xdp_prog && !MLX5_IPSEC_DEV(mdev) ?
 		    MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ :
 		    MLX5_WQ_TYPE_LINKED_LIST;
-	mlx5e_set_rq_type_params(mdev, params, rq_type);
+	mlx5e_init_rq_type_params(mdev, params, rq_type);
 }
 
 static void mlx5e_update_carrier(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index d2a66dc4adc6..8812d7208e8f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -57,7 +57,7 @@ static void mlx5i_build_nic_params(struct mlx5_core_dev *mdev,
 				   struct mlx5e_params *params)
 {
 	/* Override RQ params as IPoIB supports only LINKED LIST RQ for now */
-	mlx5e_set_rq_type_params(mdev, params, MLX5_WQ_TYPE_LINKED_LIST);
+	mlx5e_init_rq_type_params(mdev, params, MLX5_WQ_TYPE_LINKED_LIST);
 
 	/* RQ size in ipoib by default is 512 */
 	params->log_rq_size = is_kdump_kernel() ?
-- 
2.13.0


* [net 07/14] net/mlx5: Fix misspelling in the error message and comment
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2017-12-19 22:24 ` [net 06/14] net/mlx5e: Fix defaulting RX ring size when not needed Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 08/14] net/mlx5: Fix error flow in CREATE_QP command Saeed Mahameed
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eugenia Emantayev, Saeed Mahameed

From: Eugenia Emantayev <eugenia@mellanox.com>

Fix the misspelling of the word "syndrome" (it was written as "syndrom").

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c     | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 60771865c99c..0308a2b4823c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -466,7 +466,7 @@ static irqreturn_t mlx5_eq_int(int irq, void *eq_ptr)
 			break;
 		case MLX5_EVENT_TYPE_CQ_ERROR:
 			cqn = be32_to_cpu(eqe->data.cq_err.cqn) & 0xffffff;
-			mlx5_core_warn(dev, "CQ error on CQN 0x%x, syndrom 0x%x\n",
+			mlx5_core_warn(dev, "CQ error on CQN 0x%x, syndrome 0x%x\n",
 				       cqn, eqe->data.cq_err.syndrome);
 			mlx5_cq_event(dev, cqn, eqe->type);
 			break;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 1a0e797ad001..21d29f7936f6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -241,7 +241,7 @@ static void print_health_info(struct mlx5_core_dev *dev)
 	u32 fw;
 	int i;
 
-	/* If the syndrom is 0, the device is OK and no need to print buffer */
+	/* If the syndrome is 0, the device is OK and no need to print buffer */
 	if (!ioread8(&h->synd))
 		return;
 
-- 
2.13.0


* [net 08/14] net/mlx5: Fix error flow in CREATE_QP command
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2017-12-19 22:24 ` [net 07/14] net/mlx5: Fix misspelling in the error message and comment Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 09/14] net/mlx5e: Fix possible deadlock of VXLAN lock Saeed Mahameed
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moni Shoua, Saeed Mahameed

From: Moni Shoua <monis@mellanox.com>

In the error flow, when the DESTROY_QP command should be executed, the
wrong mailbox was filled with data instead of the one that is written
to hardware. Fix that.
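
The mix-up can be pictured with a small user-space sketch. Everything
here is a hypothetical stand-in (toy_mbox, toy_cmd_exec and the opcode
values are not the driver's types or the MLX5_SET() macro); it only
shows that filling one buffer and executing another hands the device an
all-zero command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mailbox; the real command layout is not reproduced. */
struct toy_mbox {
	uint16_t opcode;
	uint32_t qpn;
};

/* Arbitrary illustrative opcode values, not the hardware ones. */
enum { TOY_OP_CREATE_QP = 1, TOY_OP_DESTROY_QP = 2 };

/* Records the last command the fake "hardware" was handed. */
static struct toy_mbox toy_last_cmd;

static void toy_cmd_exec(const struct toy_mbox *in)
{
	assert(in != NULL);
	toy_last_cmd = *in;
}

/* Buggy error path: fills `in`, but executes the zeroed `din`. */
static void toy_destroy_buggy(struct toy_mbox *in, uint32_t qpn)
{
	struct toy_mbox din;

	memset(&din, 0, sizeof(din));
	in->opcode = TOY_OP_DESTROY_QP;	/* written to the wrong buffer */
	in->qpn = qpn;
	toy_cmd_exec(&din);		/* device sees an all-zero command */
}

/* Fixed error path: fills the same buffer that is executed. */
static void toy_destroy_fixed(uint32_t qpn)
{
	struct toy_mbox din;

	memset(&din, 0, sizeof(din));
	din.opcode = TOY_OP_DESTROY_QP;
	din.qpn = qpn;
	toy_cmd_exec(&din);
}
```

In this model the buggy path never asks for the QP to be destroyed, so
the object created just before the failure would stay allocated.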

Fixes: 09a7d9eca1a6 '{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc'
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/qp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/qp.c b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
index db9e665ab104..889130edb715 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
@@ -213,8 +213,8 @@ int mlx5_core_create_qp(struct mlx5_core_dev *dev,
 err_cmd:
 	memset(din, 0, sizeof(din));
 	memset(dout, 0, sizeof(dout));
-	MLX5_SET(destroy_qp_in, in, opcode, MLX5_CMD_OP_DESTROY_QP);
-	MLX5_SET(destroy_qp_in, in, qpn, qp->qpn);
+	MLX5_SET(destroy_qp_in, din, opcode, MLX5_CMD_OP_DESTROY_QP);
+	MLX5_SET(destroy_qp_in, din, qpn, qp->qpn);
 	mlx5_cmd_exec(dev, din, sizeof(din), dout, sizeof(dout));
 	return err;
 }
-- 
2.13.0


* [net 09/14] net/mlx5e: Fix possible deadlock of VXLAN lock
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2017-12-19 22:24 ` [net 08/14] net/mlx5: Fix error flow in CREATE_QP command Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 10/14] net/mlx5e: Add refcount to VXLAN structure Saeed Mahameed
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Gal Pressman, Saeed Mahameed

From: Gal Pressman <galp@mellanox.com>

mlx5e_vxlan_lookup_port is called both from mlx5e_add_vxlan_port (user
context) and mlx5e_features_check (softirq), but the lock acquired does
not disable bottom halves, which might result in a deadlock. Fix it by
simply replacing spin_lock() with spin_lock_bh().
While at it, replace all unnecessary spin_lock_irq() calls with
spin_lock_bh().

lockdep's WARNING: inconsistent lock state
[  654.028136] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  654.028229] swapper/5/0 [HC0[0]:SC1[9]:HE1:SE0] takes:
[  654.028321]  (&(&vxlan_db->lock)->rlock){+.?.}, at: [<ffffffffa06e7f0e>] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[  654.028528] {SOFTIRQ-ON-W} state was registered at:
[  654.028607]   _raw_spin_lock+0x3c/0x70
[  654.028689]   mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[  654.028794]   mlx5e_vxlan_add_port+0x2e/0x120 [mlx5_core]
[  654.028878]   process_one_work+0x1e9/0x640
[  654.028942]   worker_thread+0x4a/0x3f0
[  654.029002]   kthread+0x141/0x180
[  654.029056]   ret_from_fork+0x24/0x30
[  654.029114] irq event stamp: 579088
[  654.029174] hardirqs last  enabled at (579088): [<ffffffff818f475a>] ip6_finish_output2+0x49a/0x8c0
[  654.029309] hardirqs last disabled at (579087): [<ffffffff818f470e>] ip6_finish_output2+0x44e/0x8c0
[  654.029446] softirqs last  enabled at (579030): [<ffffffff810b3b3d>] irq_enter+0x6d/0x80
[  654.029567] softirqs last disabled at (579031): [<ffffffff810b3c05>] irq_exit+0xb5/0xc0
[  654.029684] other info that might help us debug this:
[  654.029781]  Possible unsafe locking scenario:

[  654.029868]        CPU0
[  654.029908]        ----
[  654.029947]   lock(&(&vxlan_db->lock)->rlock);
[  654.030045]   <Interrupt>
[  654.030090]     lock(&(&vxlan_db->lock)->rlock);
[  654.030162]
 *** DEADLOCK ***
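
The scenario lockdep reports can be illustrated with a toy single-CPU
model in plain C. This is only a sketch: toy_cpu, toy_lock and the other
names are hypothetical and do not exist in the kernel, and the
`disable_bh` flag merely mimics the difference between spin_lock() and
spin_lock_bh().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; none of these names exist in the kernel. */
struct toy_cpu {
	bool bh_disabled;	/* softirqs masked on this CPU */
	bool lock_held;		/* the one spinlock in this model */
};

/* Process context takes the lock; disable_bh mimics spin_lock_bh(). */
static void toy_lock(struct toy_cpu *cpu, bool disable_bh)
{
	assert(!cpu->lock_held);
	cpu->bh_disabled = disable_bh;
	cpu->lock_held = true;
}

static void toy_unlock(struct toy_cpu *cpu)
{
	cpu->lock_held = false;
	cpu->bh_disabled = false;
}

/*
 * A softirq handler that needs the same lock wants to run on this CPU.
 * Returns true if it would spin forever on a lock its own CPU holds.
 */
static bool toy_softirq_would_deadlock(const struct toy_cpu *cpu)
{
	if (cpu->bh_disabled)
		return false;	/* handler is deferred until unlock */
	return cpu->lock_held;	/* handler runs now and spins on the holder */
}
```

With bottom halves left enabled, the handler interrupts the lock holder
on the same CPU and can never make progress; with them disabled, it only
runs after the unlock.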

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
index 07a9ba6cfc70..f8238275759f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
@@ -71,9 +71,9 @@ struct mlx5e_vxlan *mlx5e_vxlan_lookup_port(struct mlx5e_priv *priv, u16 port)
 	struct mlx5e_vxlan_db *vxlan_db = &priv->vxlan;
 	struct mlx5e_vxlan *vxlan;
 
-	spin_lock(&vxlan_db->lock);
+	spin_lock_bh(&vxlan_db->lock);
 	vxlan = radix_tree_lookup(&vxlan_db->tree, port);
-	spin_unlock(&vxlan_db->lock);
+	spin_unlock_bh(&vxlan_db->lock);
 
 	return vxlan;
 }
@@ -100,9 +100,9 @@ static void mlx5e_vxlan_add_port(struct work_struct *work)
 
 	vxlan->udp_port = port;
 
-	spin_lock_irq(&vxlan_db->lock);
+	spin_lock_bh(&vxlan_db->lock);
 	err = radix_tree_insert(&vxlan_db->tree, vxlan->udp_port, vxlan);
-	spin_unlock_irq(&vxlan_db->lock);
+	spin_unlock_bh(&vxlan_db->lock);
 	if (err)
 		goto err_free;
 
@@ -121,9 +121,9 @@ static void __mlx5e_vxlan_core_del_port(struct mlx5e_priv *priv, u16 port)
 	struct mlx5e_vxlan_db *vxlan_db = &priv->vxlan;
 	struct mlx5e_vxlan *vxlan;
 
-	spin_lock_irq(&vxlan_db->lock);
+	spin_lock_bh(&vxlan_db->lock);
 	vxlan = radix_tree_delete(&vxlan_db->tree, port);
-	spin_unlock_irq(&vxlan_db->lock);
+	spin_unlock_bh(&vxlan_db->lock);
 
 	if (!vxlan)
 		return;
@@ -171,12 +171,12 @@ void mlx5e_vxlan_cleanup(struct mlx5e_priv *priv)
 	struct mlx5e_vxlan *vxlan;
 	unsigned int port = 0;
 
-	spin_lock_irq(&vxlan_db->lock);
+	spin_lock_bh(&vxlan_db->lock);
 	while (radix_tree_gang_lookup(&vxlan_db->tree, (void **)&vxlan, port, 1)) {
 		port = vxlan->udp_port;
-		spin_unlock_irq(&vxlan_db->lock);
+		spin_unlock_bh(&vxlan_db->lock);
 		__mlx5e_vxlan_core_del_port(priv, (u16)port);
-		spin_lock_irq(&vxlan_db->lock);
+		spin_lock_bh(&vxlan_db->lock);
 	}
-	spin_unlock_irq(&vxlan_db->lock);
+	spin_unlock_bh(&vxlan_db->lock);
 }
-- 
2.13.0


* [net 10/14] net/mlx5e: Add refcount to VXLAN structure
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2017-12-19 22:24 ` [net 09/14] net/mlx5e: Fix possible deadlock of VXLAN lock Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 11/14] net/mlx5e: Prevent possible races in VXLAN control flow Saeed Mahameed
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Gal Pressman, Saeed Mahameed

From: Gal Pressman <galp@mellanox.com>

A refcount mechanism must be implemented in order to prevent unwanted
scenarios such as:
- Open an IPv4 VXLAN interface
- Open an IPv6 VXLAN interface (different socket)
- Remove one of the interfaces

With the current implementation, the UDP port will be removed from our
VXLAN database and the offloads will be turned off for the other
interface, which is still active.
The reference count mechanism will only allow UDP port removals once all
consumers are gone.
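
A minimal user-space sketch of the idea, assuming a one-slot "database"
instead of the driver's radix tree. The names toy_port, toy_add_port and
toy_del_port are hypothetical, and C11 atomics stand in for the kernel's
atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot "database"; the driver uses a radix tree. */
struct toy_port {
	atomic_int refcount;
	unsigned short udp_port;
	bool offloaded;
};

/* A second user of an already offloaded port only takes a reference. */
static void toy_add_port(struct toy_port *p, unsigned short port)
{
	assert(p != NULL);
	if (p->offloaded && p->udp_port == port) {
		atomic_fetch_add(&p->refcount, 1);
		return;
	}
	p->udp_port = port;
	p->offloaded = true;
	atomic_store(&p->refcount, 1);
}

/* Returns true only when the last user left and the offload went away. */
static bool toy_del_port(struct toy_port *p, unsigned short port)
{
	assert(p != NULL);
	if (!p->offloaded || p->udp_port != port)
		return false;
	/* atomic_fetch_sub() returns the value before the decrement */
	if (atomic_fetch_sub(&p->refcount, 1) != 1)
		return false;
	p->offloaded = false;
	return true;
}
```

Adding the same port twice (for example one IPv4 and one IPv6 socket)
and removing it once leaves the offload in place; only the second
removal turns it off.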

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 50 +++++++++++++------------
 drivers/net/ethernet/mellanox/mlx5/core/vxlan.h |  1 +
 2 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
index f8238275759f..25f782344667 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
@@ -88,8 +88,11 @@ static void mlx5e_vxlan_add_port(struct work_struct *work)
 	struct mlx5e_vxlan *vxlan;
 	int err;
 
-	if (mlx5e_vxlan_lookup_port(priv, port))
+	vxlan = mlx5e_vxlan_lookup_port(priv, port);
+	if (vxlan) {
+		atomic_inc(&vxlan->refcount);
 		goto free_work;
+	}
 
 	if (mlx5e_vxlan_core_add_port_cmd(priv->mdev, port))
 		goto free_work;
@@ -99,6 +102,7 @@ static void mlx5e_vxlan_add_port(struct work_struct *work)
 		goto err_delete_port;
 
 	vxlan->udp_port = port;
+	atomic_set(&vxlan->refcount, 1);
 
 	spin_lock_bh(&vxlan_db->lock);
 	err = radix_tree_insert(&vxlan_db->tree, vxlan->udp_port, vxlan);
@@ -116,32 +120,33 @@ static void mlx5e_vxlan_add_port(struct work_struct *work)
 	kfree(vxlan_work);
 }
 
-static void __mlx5e_vxlan_core_del_port(struct mlx5e_priv *priv, u16 port)
+static void mlx5e_vxlan_del_port(struct work_struct *work)
 {
+	struct mlx5e_vxlan_work *vxlan_work =
+		container_of(work, struct mlx5e_vxlan_work, work);
+	struct mlx5e_priv *priv         = vxlan_work->priv;
 	struct mlx5e_vxlan_db *vxlan_db = &priv->vxlan;
+	u16 port = vxlan_work->port;
 	struct mlx5e_vxlan *vxlan;
+	bool remove = false;
 
 	spin_lock_bh(&vxlan_db->lock);
-	vxlan = radix_tree_delete(&vxlan_db->tree, port);
-	spin_unlock_bh(&vxlan_db->lock);
-
+	vxlan = radix_tree_lookup(&vxlan_db->tree, port);
 	if (!vxlan)
-		return;
-
-	mlx5e_vxlan_core_del_port_cmd(priv->mdev, vxlan->udp_port);
-
-	kfree(vxlan);
-}
+		goto out_unlock;
 
-static void mlx5e_vxlan_del_port(struct work_struct *work)
-{
-	struct mlx5e_vxlan_work *vxlan_work =
-		container_of(work, struct mlx5e_vxlan_work, work);
-	struct mlx5e_priv *priv = vxlan_work->priv;
-	u16 port = vxlan_work->port;
+	if (atomic_dec_and_test(&vxlan->refcount)) {
+		radix_tree_delete(&vxlan_db->tree, port);
+		remove = true;
+	}
 
-	__mlx5e_vxlan_core_del_port(priv, port);
+out_unlock:
+	spin_unlock_bh(&vxlan_db->lock);
 
+	if (remove) {
+		mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
+		kfree(vxlan);
+	}
 	kfree(vxlan_work);
 }
 
@@ -171,12 +176,11 @@ void mlx5e_vxlan_cleanup(struct mlx5e_priv *priv)
 	struct mlx5e_vxlan *vxlan;
 	unsigned int port = 0;
 
-	spin_lock_bh(&vxlan_db->lock);
+	/* Lockless since we are the only radix-tree consumers, wq is disabled */
 	while (radix_tree_gang_lookup(&vxlan_db->tree, (void **)&vxlan, port, 1)) {
 		port = vxlan->udp_port;
-		spin_unlock_bh(&vxlan_db->lock);
-		__mlx5e_vxlan_core_del_port(priv, (u16)port);
-		spin_lock_bh(&vxlan_db->lock);
+		radix_tree_delete(&vxlan_db->tree, port);
+		mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
+		kfree(vxlan);
 	}
-	spin_unlock_bh(&vxlan_db->lock);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.h b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.h
index 5def12c048e3..5ef6ae7d568a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.h
@@ -36,6 +36,7 @@
 #include "en.h"
 
 struct mlx5e_vxlan {
+	atomic_t refcount;
 	u16 udp_port;
 };
 
-- 
2.13.0


* [net 11/14] net/mlx5e: Prevent possible races in VXLAN control flow
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2017-12-19 22:24 ` [net 10/14] net/mlx5e: Add refcount to VXLAN structure Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 12/14] net/mlx5: Fix steering memory leak Saeed Mahameed
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Gal Pressman, Saeed Mahameed

From: Gal Pressman <galp@mellanox.com>

When calling add/remove VXLAN port, a lock must be held in order to
prevent race scenarios when more than one add/remove happens at the
same time.
Fix by holding our state_lock (mutex) as done by all other parts of the
driver.
Note that the spinlock protecting the radix-tree is still needed in
order to synchronize radix-tree access from softirq context.

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
index 25f782344667..2f74953e4561 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
@@ -88,6 +88,7 @@ static void mlx5e_vxlan_add_port(struct work_struct *work)
 	struct mlx5e_vxlan *vxlan;
 	int err;
 
+	mutex_lock(&priv->state_lock);
 	vxlan = mlx5e_vxlan_lookup_port(priv, port);
 	if (vxlan) {
 		atomic_inc(&vxlan->refcount);
@@ -117,6 +118,7 @@ static void mlx5e_vxlan_add_port(struct work_struct *work)
 err_delete_port:
 	mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
 free_work:
+	mutex_unlock(&priv->state_lock);
 	kfree(vxlan_work);
 }
 
@@ -130,6 +132,7 @@ static void mlx5e_vxlan_del_port(struct work_struct *work)
 	struct mlx5e_vxlan *vxlan;
 	bool remove = false;
 
+	mutex_lock(&priv->state_lock);
 	spin_lock_bh(&vxlan_db->lock);
 	vxlan = radix_tree_lookup(&vxlan_db->tree, port);
 	if (!vxlan)
@@ -147,6 +150,7 @@ static void mlx5e_vxlan_del_port(struct work_struct *work)
 		mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
 		kfree(vxlan);
 	}
+	mutex_unlock(&priv->state_lock);
 	kfree(vxlan_work);
 }
 
-- 
2.13.0


* [net 12/14] net/mlx5: Fix steering memory leak
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2017-12-19 22:24 ` [net 11/14] net/mlx5e: Prevent possible races in VXLAN control flow Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 13/14] net/mlx5: Cleanup IRQs in case of unload failure Saeed Mahameed
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Maor Gottlieb, Saeed Mahameed

From: Maor Gottlieb <maorg@mellanox.com>

Flow steering priority and namespace are software-only objects that
did not have proper destructors and were not freed during steering
cleanup.

Fix it by adding destructor functions for these objects.
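
The leak pattern can be sketched in user space, assuming (as a
simplification) that cleanup releases a node only through its
destructor callback. toy_node, toy_create and toy_remove are
hypothetical names, not the fs_core API.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_node {
	/* software destructor; NULL means "nobody frees this node" */
	void (*del_sw)(struct toy_node *node);
};

static int toy_freed;	/* counts nodes actually released */

static void toy_del_sw(struct toy_node *node)
{
	free(node);
	toy_freed++;
}

static struct toy_node *toy_create(void (*del_sw)(struct toy_node *))
{
	struct toy_node *node = calloc(1, sizeof(*node));

	assert(node != NULL);
	node->del_sw = del_sw;
	return node;
}

/* Cleanup only knows how to release a node through its callback. */
static void toy_remove(struct toy_node *node)
{
	if (node->del_sw)
		node->del_sw(node);
	/* with a NULL callback the allocation is simply dropped: a leak */
}
```

A node created with a NULL destructor survives toy_remove() untouched,
which corresponds to passing NULL as the last argument of
tree_init_node() before this patch.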

Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index c70fd663a633..dfaad9ecb2b8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -174,6 +174,8 @@ static void del_hw_fte(struct fs_node *node);
 static void del_sw_flow_table(struct fs_node *node);
 static void del_sw_flow_group(struct fs_node *node);
 static void del_sw_fte(struct fs_node *node);
+static void del_sw_prio(struct fs_node *node);
+static void del_sw_ns(struct fs_node *node);
 /* Delete rule (destination) is special case that 
  * requires to lock the FTE for all the deletion process.
  */
@@ -408,6 +410,16 @@ static inline struct mlx5_core_dev *get_dev(struct fs_node *node)
 	return NULL;
 }
 
+static void del_sw_ns(struct fs_node *node)
+{
+	kfree(node);
+}
+
+static void del_sw_prio(struct fs_node *node)
+{
+	kfree(node);
+}
+
 static void del_hw_flow_table(struct fs_node *node)
 {
 	struct mlx5_flow_table *ft;
@@ -2064,7 +2076,7 @@ static struct fs_prio *fs_create_prio(struct mlx5_flow_namespace *ns,
 		return ERR_PTR(-ENOMEM);
 
 	fs_prio->node.type = FS_TYPE_PRIO;
-	tree_init_node(&fs_prio->node, NULL, NULL);
+	tree_init_node(&fs_prio->node, NULL, del_sw_prio);
 	tree_add_node(&fs_prio->node, &ns->node);
 	fs_prio->num_levels = num_levels;
 	fs_prio->prio = prio;
@@ -2090,7 +2102,7 @@ static struct mlx5_flow_namespace *fs_create_namespace(struct fs_prio *prio)
 		return ERR_PTR(-ENOMEM);
 
 	fs_init_namespace(ns);
-	tree_init_node(&ns->node, NULL, NULL);
+	tree_init_node(&ns->node, NULL, del_sw_ns);
 	tree_add_node(&ns->node, &prio->node);
 	list_add_tail(&ns->node.list, &prio->node.children);
 
-- 
2.13.0


* [net 13/14] net/mlx5: Cleanup IRQs in case of unload failure
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2017-12-19 22:24 ` [net 12/14] net/mlx5: Fix steering memory leak Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-19 22:24 ` [net 14/14] net/mlx5: Stay in polling mode when command EQ destroy fails Saeed Mahameed
  2017-12-20 18:42 ` [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 David Miller
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moshe Shemesh, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

When mlx5_stop_eqs fails to destroy any of the EQs, it returns with an
error. In such a failure flow the function returns without releasing
all of the EQs' IRQs, and pci_free_irq_vectors then fails.
Fix by only warning on a destroy EQ failure and continuing to release
the other EQs and their IRQs.
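
The difference between the two teardown styles can be sketched as
follows. The toy_* names are hypothetical, and the fake destroy call is
assumed to free the IRQ even when the command itself reports an error.

```c
#include <assert.h>
#include <stddef.h>

struct toy_eq {
	int destroy_err;	/* error the fake destroy call reports */
	int irq_released;	/* set once the IRQ has been freed */
};

/* Frees the IRQ, then reports whatever error was configured. */
static int toy_destroy_eq(struct toy_eq *eq)
{
	assert(eq != NULL);
	eq->irq_released = 1;
	return eq->destroy_err;
}

/* Old behaviour: bail out on the first failure. */
static int toy_stop_eqs_early_return(struct toy_eq *eqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err = toy_destroy_eq(&eqs[i]);

		if (err)
			return err;
	}
	return 0;
}

/* New behaviour: count the failure (a stand-in for a log line), go on. */
static int toy_stop_eqs_continue(struct toy_eq *eqs, size_t n)
{
	int failures = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_destroy_eq(&eqs[i]))
			failures++;
	return failures;
}

/* Number of EQs whose IRQ was never released. */
static size_t toy_irqs_still_held(const struct toy_eq *eqs, size_t n)
{
	size_t held = 0;

	for (size_t i = 0; i < n; i++)
		if (!eqs[i].irq_released)
			held++;
	return held;
}
```

With the early return, every EQ after the failing one keeps its IRQ,
which is the state that trips the BUG in the trace below once the
vectors are freed.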

It fixes the following kernel trace:
kernel: kernel BUG at drivers/pci/msi.c:352!
...
...
kernel: Call Trace:
kernel: pci_disable_msix+0xd3/0x100
kernel: pci_free_irq_vectors+0xe/0x20
kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 20 +++++++++++++-------
 include/linux/mlx5/driver.h                  |  2 +-
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 0308a2b4823c..ab4d1465b7e4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -775,7 +775,7 @@ int mlx5_start_eqs(struct mlx5_core_dev *dev)
 	return err;
 }
 
-int mlx5_stop_eqs(struct mlx5_core_dev *dev)
+void mlx5_stop_eqs(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *table = &dev->priv.eq_table;
 	int err;
@@ -784,22 +784,28 @@ int mlx5_stop_eqs(struct mlx5_core_dev *dev)
 	if (MLX5_CAP_GEN(dev, pg)) {
 		err = mlx5_destroy_unmap_eq(dev, &table->pfault_eq);
 		if (err)
-			return err;
+			mlx5_core_err(dev, "failed to destroy page fault eq, err(%d)\n",
+				      err);
 	}
 #endif
 
 	err = mlx5_destroy_unmap_eq(dev, &table->pages_eq);
 	if (err)
-		return err;
+		mlx5_core_err(dev, "failed to destroy pages eq, err(%d)\n",
+			      err);
 
-	mlx5_destroy_unmap_eq(dev, &table->async_eq);
+	err = mlx5_destroy_unmap_eq(dev, &table->async_eq);
+	if (err)
+		mlx5_core_err(dev, "failed to destroy async eq, err(%d)\n",
+			      err);
 	mlx5_cmd_use_polling(dev);
 
 	err = mlx5_destroy_unmap_eq(dev, &table->cmd_eq);
-	if (err)
+	if (err) {
+		mlx5_core_err(dev, "failed to destroy command eq, err(%d)\n",
+			      err);
 		mlx5_cmd_use_events(dev);
-
-	return err;
+	}
 }
 
 int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 40a6f33c4cde..57b109c6e422 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1049,7 +1049,7 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
 		       enum mlx5_eq_type type);
 int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 int mlx5_start_eqs(struct mlx5_core_dev *dev);
-int mlx5_stop_eqs(struct mlx5_core_dev *dev);
+void mlx5_stop_eqs(struct mlx5_core_dev *dev);
 int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn,
 		    unsigned int *irqn);
 int mlx5_core_attach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn);
-- 
2.13.0


* [net 14/14] net/mlx5: Stay in polling mode when command EQ destroy fails
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2017-12-19 22:24 ` [net 13/14] net/mlx5: Cleanup IRQs in case of unload failure Saeed Mahameed
@ 2017-12-19 22:24 ` Saeed Mahameed
  2017-12-20 18:42 ` [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 David Miller
  14 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-19 22:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Moshe Shemesh, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

During unload, in mlx5_stop_eqs we move the command interface from
events mode to polling mode, but if the command interface EQ destroy
fails we move back to events mode.
That's wrong, since even if we fail to destroy the command interface EQ,
we do release its IRQ, so no interrupts will be received.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index ab4d1465b7e4..e7e7cef2bde4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -801,11 +801,9 @@ void mlx5_stop_eqs(struct mlx5_core_dev *dev)
 	mlx5_cmd_use_polling(dev);
 
 	err = mlx5_destroy_unmap_eq(dev, &table->cmd_eq);
-	if (err) {
+	if (err)
 		mlx5_core_err(dev, "failed to destroy command eq, err(%d)\n",
 			      err);
-		mlx5_cmd_use_events(dev);
-	}
 }
 
 int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
-- 
2.13.0


* Re: [net 02/14] Revert "mlx5: move affinity hints assignments to generic code"
  2017-12-19 22:24 ` [net 02/14] Revert "mlx5: move affinity hints assignments to generic code" Saeed Mahameed
@ 2017-12-19 22:42   ` Jes Sorensen
  2017-12-25 13:53   ` Sagi Grimberg
  1 sibling, 0 replies; 23+ messages in thread
From: Jes Sorensen @ 2017-12-19 22:42 UTC (permalink / raw)
  To: Saeed Mahameed, David S. Miller; +Cc: netdev, Sagi Grimberg, Thomas Gleixner

On 12/19/2017 05:24 PM, Saeed Mahameed wrote:
> Before the offending commit, mlx5 core did the IRQ affinity itself,
> and it seems that the new generic code have some drawbacks and one
> of them is the lack for user ability to modify irq affinity after
> the initial affinity values got assigned.
> 
> The issue is still being discussed and a solution in the new generic code
> is required, until then we need to revert this patch.
> 
> This fixes the following issue:
> echo <new affinity> > /proc/irq/<x>/smp_affinity
> fails with  -EIO
> 
> This reverts commit a435393acafbf0ecff4deb3e3cb554b34f0d0664.
> Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
> it is used in mlx5_ib driver.
> 
> Fixes: a435393acafb ("mlx5: move affinity hints assignments to generic code")
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jes Sorensen <jsorensen@fb.com>
> Reported-by: Jes Sorensen <jsorensen@fb.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

Acked-by: Jes Sorensen <jsorensen@fb.com>

Cheers,
Jes


> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en.h      |  1 +
>  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 45 +++++++-------
>  drivers/net/ethernet/mellanox/mlx5/core/main.c    | 75 +++++++++++++++++++++--
>  include/linux/mlx5/driver.h                       |  1 +
>  4 files changed, 93 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index c0872b3284cb..43f9054830e5 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -590,6 +590,7 @@ struct mlx5e_channel {
>  	struct mlx5_core_dev      *mdev;
>  	struct hwtstamp_config    *tstamp;
>  	int                        ix;
> +	int                        cpu;
>  };
>  
>  struct mlx5e_channels {
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index d2b057a3e512..cbec66bc82f1 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -71,11 +71,6 @@ struct mlx5e_channel_param {
>  	struct mlx5e_cq_param      icosq_cq;
>  };
>  
> -static int mlx5e_get_node(struct mlx5e_priv *priv, int ix)
> -{
> -	return pci_irq_get_node(priv->mdev->pdev, MLX5_EQ_VEC_COMP_BASE + ix);
> -}
> -
>  static bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
>  {
>  	return MLX5_CAP_GEN(mdev, striding_rq) &&
> @@ -444,17 +439,16 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
>  	int wq_sz = mlx5_wq_ll_get_size(&rq->wq);
>  	int mtt_sz = mlx5e_get_wqe_mtt_sz();
>  	int mtt_alloc = mtt_sz + MLX5_UMR_ALIGN - 1;
> -	int node = mlx5e_get_node(c->priv, c->ix);
>  	int i;
>  
>  	rq->mpwqe.info = kzalloc_node(wq_sz * sizeof(*rq->mpwqe.info),
> -					GFP_KERNEL, node);
> +				      GFP_KERNEL, cpu_to_node(c->cpu));
>  	if (!rq->mpwqe.info)
>  		goto err_out;
>  
>  	/* We allocate more than mtt_sz as we will align the pointer */
> -	rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz,
> -					GFP_KERNEL, node);
> +	rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz, GFP_KERNEL,
> +					cpu_to_node(c->cpu));
>  	if (unlikely(!rq->mpwqe.mtt_no_align))
>  		goto err_free_wqe_info;
>  
> @@ -562,7 +556,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>  	int err;
>  	int i;
>  
> -	rqp->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
> +	rqp->wq.db_numa_node = cpu_to_node(c->cpu);
>  
>  	err = mlx5_wq_ll_create(mdev, &rqp->wq, rqc_wq, &rq->wq,
>  				&rq->wq_ctrl);
> @@ -629,8 +623,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>  	default: /* MLX5_WQ_TYPE_LINKED_LIST */
>  		rq->wqe.frag_info =
>  			kzalloc_node(wq_sz * sizeof(*rq->wqe.frag_info),
> -				     GFP_KERNEL,
> -				     mlx5e_get_node(c->priv, c->ix));
> +				     GFP_KERNEL, cpu_to_node(c->cpu));
>  		if (!rq->wqe.frag_info) {
>  			err = -ENOMEM;
>  			goto err_rq_wq_destroy;
> @@ -1000,13 +993,13 @@ static int mlx5e_alloc_xdpsq(struct mlx5e_channel *c,
>  	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
>  	sq->min_inline_mode = params->tx_min_inline_mode;
>  
> -	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
> +	param->wq.db_numa_node = cpu_to_node(c->cpu);
>  	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
>  	if (err)
>  		return err;
>  	sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
>  
> -	err = mlx5e_alloc_xdpsq_db(sq, mlx5e_get_node(c->priv, c->ix));
> +	err = mlx5e_alloc_xdpsq_db(sq, cpu_to_node(c->cpu));
>  	if (err)
>  		goto err_sq_wq_destroy;
>  
> @@ -1053,13 +1046,13 @@ static int mlx5e_alloc_icosq(struct mlx5e_channel *c,
>  	sq->channel   = c;
>  	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
>  
> -	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
> +	param->wq.db_numa_node = cpu_to_node(c->cpu);
>  	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
>  	if (err)
>  		return err;
>  	sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
>  
> -	err = mlx5e_alloc_icosq_db(sq, mlx5e_get_node(c->priv, c->ix));
> +	err = mlx5e_alloc_icosq_db(sq, cpu_to_node(c->cpu));
>  	if (err)
>  		goto err_sq_wq_destroy;
>  
> @@ -1126,13 +1119,13 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
>  	if (MLX5_IPSEC_DEV(c->priv->mdev))
>  		set_bit(MLX5E_SQ_STATE_IPSEC, &sq->state);
>  
> -	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
> +	param->wq.db_numa_node = cpu_to_node(c->cpu);
>  	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
>  	if (err)
>  		return err;
>  	sq->wq.db    = &sq->wq.db[MLX5_SND_DBR];
>  
> -	err = mlx5e_alloc_txqsq_db(sq, mlx5e_get_node(c->priv, c->ix));
> +	err = mlx5e_alloc_txqsq_db(sq, cpu_to_node(c->cpu));
>  	if (err)
>  		goto err_sq_wq_destroy;
>  
> @@ -1504,8 +1497,8 @@ static int mlx5e_alloc_cq(struct mlx5e_channel *c,
>  	struct mlx5_core_dev *mdev = c->priv->mdev;
>  	int err;
>  
> -	param->wq.buf_numa_node = mlx5e_get_node(c->priv, c->ix);
> -	param->wq.db_numa_node  = mlx5e_get_node(c->priv, c->ix);
> +	param->wq.buf_numa_node = cpu_to_node(c->cpu);
> +	param->wq.db_numa_node  = cpu_to_node(c->cpu);
>  	param->eq_ix   = c->ix;
>  
>  	err = mlx5e_alloc_cq_common(mdev, param, cq);
> @@ -1604,6 +1597,11 @@ static void mlx5e_close_cq(struct mlx5e_cq *cq)
>  	mlx5e_free_cq(cq);
>  }
>  
> +static int mlx5e_get_cpu(struct mlx5e_priv *priv, int ix)
> +{
> +	return cpumask_first(priv->mdev->priv.irq_info[ix].mask);
> +}
> +
>  static int mlx5e_open_tx_cqs(struct mlx5e_channel *c,
>  			     struct mlx5e_params *params,
>  			     struct mlx5e_channel_param *cparam)
> @@ -1752,12 +1750,13 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
>  {
>  	struct mlx5e_cq_moder icocq_moder = {0, 0};
>  	struct net_device *netdev = priv->netdev;
> +	int cpu = mlx5e_get_cpu(priv, ix);
>  	struct mlx5e_channel *c;
>  	unsigned int irq;
>  	int err;
>  	int eqn;
>  
> -	c = kzalloc_node(sizeof(*c), GFP_KERNEL, mlx5e_get_node(priv, ix));
> +	c = kzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
>  	if (!c)
>  		return -ENOMEM;
>  
> @@ -1765,6 +1764,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
>  	c->mdev     = priv->mdev;
>  	c->tstamp   = &priv->tstamp;
>  	c->ix       = ix;
> +	c->cpu      = cpu;
>  	c->pdev     = &priv->mdev->pdev->dev;
>  	c->netdev   = priv->netdev;
>  	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.mkey.key);
> @@ -1853,8 +1853,7 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
>  	for (tc = 0; tc < c->num_tc; tc++)
>  		mlx5e_activate_txqsq(&c->sq[tc]);
>  	mlx5e_activate_rq(&c->rq);
> -	netif_set_xps_queue(c->netdev,
> -		mlx5_get_vector_affinity(c->priv->mdev, c->ix), c->ix);
> +	netif_set_xps_queue(c->netdev, get_cpu_mask(c->cpu), c->ix);
>  }
>  
>  static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 5f323442cc5a..8a89c7e8cd63 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -317,9 +317,6 @@ static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
>  {
>  	struct mlx5_priv *priv = &dev->priv;
>  	struct mlx5_eq_table *table = &priv->eq_table;
> -	struct irq_affinity irqdesc = {
> -		.pre_vectors = MLX5_EQ_VEC_COMP_BASE,
> -	};
>  	int num_eqs = 1 << MLX5_CAP_GEN(dev, log_max_eq);
>  	int nvec;
>  
> @@ -333,10 +330,9 @@ static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
>  	if (!priv->irq_info)
>  		goto err_free_msix;
>  
> -	nvec = pci_alloc_irq_vectors_affinity(dev->pdev,
> +	nvec = pci_alloc_irq_vectors(dev->pdev,
>  			MLX5_EQ_VEC_COMP_BASE + 1, nvec,
> -			PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
> -			&irqdesc);
> +			PCI_IRQ_MSIX);
>  	if (nvec < 0)
>  		return nvec;
>  
> @@ -622,6 +618,63 @@ u64 mlx5_read_internal_timer(struct mlx5_core_dev *dev)
>  	return (u64)timer_l | (u64)timer_h1 << 32;
>  }
>  
> +static int mlx5_irq_set_affinity_hint(struct mlx5_core_dev *mdev, int i)
> +{
> +	struct mlx5_priv *priv  = &mdev->priv;
> +	int irq = pci_irq_vector(mdev->pdev, MLX5_EQ_VEC_COMP_BASE + i);
> +
> +	if (!zalloc_cpumask_var(&priv->irq_info[i].mask, GFP_KERNEL)) {
> +		mlx5_core_warn(mdev, "zalloc_cpumask_var failed");
> +		return -ENOMEM;
> +	}
> +
> +	cpumask_set_cpu(cpumask_local_spread(i, priv->numa_node),
> +			priv->irq_info[i].mask);
> +
> +	if (IS_ENABLED(CONFIG_SMP) &&
> +	    irq_set_affinity_hint(irq, priv->irq_info[i].mask))
> +		mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x", irq);
> +
> +	return 0;
> +}
> +
> +static void mlx5_irq_clear_affinity_hint(struct mlx5_core_dev *mdev, int i)
> +{
> +	struct mlx5_priv *priv  = &mdev->priv;
> +	int irq = pci_irq_vector(mdev->pdev, MLX5_EQ_VEC_COMP_BASE + i);
> +
> +	irq_set_affinity_hint(irq, NULL);
> +	free_cpumask_var(priv->irq_info[i].mask);
> +}
> +
> +static int mlx5_irq_set_affinity_hints(struct mlx5_core_dev *mdev)
> +{
> +	int err;
> +	int i;
> +
> +	for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++) {
> +		err = mlx5_irq_set_affinity_hint(mdev, i);
> +		if (err)
> +			goto err_out;
> +	}
> +
> +	return 0;
> +
> +err_out:
> +	for (i--; i >= 0; i--)
> +		mlx5_irq_clear_affinity_hint(mdev, i);
> +
> +	return err;
> +}
> +
> +static void mlx5_irq_clear_affinity_hints(struct mlx5_core_dev *mdev)
> +{
> +	int i;
> +
> +	for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++)
> +		mlx5_irq_clear_affinity_hint(mdev, i);
> +}
> +
>  int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn,
>  		    unsigned int *irqn)
>  {
> @@ -1097,6 +1150,12 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
>  		goto err_stop_eqs;
>  	}
>  
> +	err = mlx5_irq_set_affinity_hints(dev);
> +	if (err) {
> +		dev_err(&pdev->dev, "Failed to alloc affinity hint cpumask\n");
> +		goto err_affinity_hints;
> +	}
> +
>  	err = mlx5_init_fs(dev);
>  	if (err) {
>  		dev_err(&pdev->dev, "Failed to init flow steering\n");
> @@ -1154,6 +1213,9 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
>  	mlx5_cleanup_fs(dev);
>  
>  err_fs:
> +	mlx5_irq_clear_affinity_hints(dev);
> +
> +err_affinity_hints:
>  	free_comp_eqs(dev);
>  
>  err_stop_eqs:
> @@ -1222,6 +1284,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
>  
>  	mlx5_sriov_detach(dev);
>  	mlx5_cleanup_fs(dev);
> +	mlx5_irq_clear_affinity_hints(dev);
>  	free_comp_eqs(dev);
>  	mlx5_stop_eqs(dev);
>  	mlx5_put_uars_page(dev, priv->uar);
> diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
> index a886b51511ab..40a6f33c4cde 100644
> --- a/include/linux/mlx5/driver.h
> +++ b/include/linux/mlx5/driver.h
> @@ -556,6 +556,7 @@ struct mlx5_core_sriov {
>  };
>  
>  struct mlx5_irq_info {
> +	cpumask_var_t mask;
>  	char name[MLX5_MAX_IRQ_NAME];
>  };
>  
> 


* Re: [net 04/14] net/mlx5e: Fix ETS BW check
  2017-12-19 22:24 ` [net 04/14] net/mlx5e: Fix ETS BW check Saeed Mahameed
@ 2017-12-20  7:42   ` Or Gerlitz
  0 siblings, 0 replies; 23+ messages in thread
From: Or Gerlitz @ 2017-12-20  7:42 UTC (permalink / raw)
  To: Huy Nguyen; +Cc: Linux Netdev List, Moshe Shemesh, David Miller, Saeed Mahameed

On Wed, Dec 20, 2017 at 12:24 AM, Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> From: Huy Nguyen <huyn@mellanox.com>
>
> Fix bug that allows ets bw sum to be 0% when ets tc type exists.
>
> Fixes: 08fb1dacdd76 ('net/mlx5e: Support DCBNL IEEE ETS')
> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> Reviewed-by: Huy Nguyen <huyn@mellanox.com>



Huy, if you are the author, your sign-off should appear here.
Please fix this up and get the tags right next time.

Or.
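
For readers following the thread, the rule described in the commit message can be sketched in standalone C. This is only an illustration under stated assumptions: the enum, the function name and the array layout are hypothetical and are not taken from the driver. The assumed rule is that once at least one traffic class uses ETS, the ETS bandwidth shares must add up to exactly 100%, so a sum of 0% is rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-TC scheduling types; not kernel definitions. */
enum tsa_type { TSA_STRICT, TSA_VENDOR, TSA_ETS };

/*
 * Return true if the configuration is acceptable under the assumed rule:
 * if any traffic class uses ETS, the ETS bandwidth shares must sum to
 * exactly 100%. With no ETS class at all, there is nothing to check.
 */
static bool ets_bw_is_valid(const enum tsa_type *tsa,
			    const unsigned int *bw_percent, size_t num_tc)
{
	bool have_ets_tc = false;
	unsigned int bw_sum = 0;
	size_t i;

	for (i = 0; i < num_tc; i++) {
		if (tsa[i] != TSA_ETS)
			continue;
		have_ets_tc = true;
		bw_sum += bw_percent[i];
	}

	return !have_ets_tc || bw_sum == 100;
}
```

Tracking whether an ETS class exists separately from the sum is what lets the check tell "no ETS classes" apart from "ETS classes with 0% in total".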


* Re: [net 05/14] net/mlx5e: Fix features check of IPv6 traffic
  2017-12-19 22:24 ` [net 05/14] net/mlx5e: Fix features check of IPv6 traffic Saeed Mahameed
@ 2017-12-20  7:45   ` Or Gerlitz
  0 siblings, 0 replies; 23+ messages in thread
From: Or Gerlitz @ 2017-12-20  7:45 UTC (permalink / raw)
  To: Gal Pressman; +Cc: David S. Miller, Linux Netdev List, Saeed Mahameed

On Wed, Dec 20, 2017 at 12:24 AM, Saeed Mahameed <saeedm@mellanox.com> wrote:
> From: Gal Pressman <galp@mellanox.com>
>
> The assumption that the next header field contains the transport
> protocol is wrong for IPv6 packets with extension headers.
> Instead, we should look at the inner-most next header field in the buffer.
> This will fix TSO offload for tunnels over IPv6 with extension headers.

nice!

I would guess that there is some limit on how many IPv6 extension
headers / bytes the HW can deal with for TSO. Are we enforcing that
somehow (also for the non-tunnel case)?

Or.
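
To illustrate the point in the quoted commit message, here is a minimal standalone C sketch of walking the IPv6 next-header chain to reach the transport protocol. It is not the driver's code and does not use any kernel helper: the function name and macros are hypothetical, and it only handles the hop-by-hop, routing, destination-options and fragment headers, which is a simplification.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN 40 /* fixed IPv6 header size in bytes */
#define NH_HOPOPTS    0 /* hop-by-hop options */
#define NH_TCP        6
#define NH_UDP       17
#define NH_ROUTING   43
#define NH_FRAGMENT  44
#define NH_DSTOPTS   60 /* destination options */

/*
 * Walk the next-header chain of an IPv6 packet that starts at pkt and
 * return the first protocol number that is not one of the extension
 * headers handled here, or -1 if the buffer is too short for the walk.
 */
static int ipv6_transport_proto(const uint8_t *pkt, size_t len)
{
	size_t off = IPV6_HDR_LEN;
	uint8_t nh;

	if (len < IPV6_HDR_LEN)
		return -1;
	nh = pkt[6]; /* Next Header field of the fixed header */

	for (;;) {
		size_t ext_len;

		if (nh != NH_HOPOPTS && nh != NH_ROUTING &&
		    nh != NH_DSTOPTS && nh != NH_FRAGMENT)
			return nh;

		/* Every extension header handled here is at least 8 bytes. */
		if (len - off < 8)
			return -1;

		/* The fragment header is fixed-size; the others carry a
		 * length in 8-byte units that excludes the first 8 bytes. */
		ext_len = (nh == NH_FRAGMENT) ?
			  8 : ((size_t)pkt[off + 1] + 1) * 8;
		if (len - off < ext_len)
			return -1;

		nh = pkt[off]; /* first byte of each header is its Next Header */
		off += ext_len;
	}
}
```

For a packet with a hop-by-hop header followed by a destination-options header and then TCP, the fixed header's Next Header field holds 0, while the walk above ends at 6. That gap is the mismatch the commit message describes.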


* Re: [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19
  2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
                   ` (13 preceding siblings ...)
  2017-12-19 22:24 ` [net 14/14] net/mlx5: Stay in polling mode when command EQ destroy fails Saeed Mahameed
@ 2017-12-20 18:42 ` David Miller
  14 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2017-12-20 18:42 UTC (permalink / raw)
  To: saeedm; +Cc: netdev

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Wed, 20 Dec 2017 00:24:42 +0200

> The following series includes some fixes for mlx5 core and ethernet
> driver.
> 
> Please pull and let me know if there is any problem.

Pulled.

> For -stable:
> 
> kernels >= v4.7.y
>     ("net/mlx5e: Fix possible deadlock of VXLAN lock")
>     ("net/mlx5e: Add refcount to VXLAN structure")
>     ("net/mlx5e: Prevent possible races in VXLAN control flow")
>     ("net/mlx5e: Fix features check of IPv6 traffic")
> 
> kernels >= v4.9.y
>     ("net/mlx5: Fix error flow in CREATE_QP command")
>     ("net/mlx5: Fix rate limit packet pacing naming and struct")
> 
> kernels >= v4.13.y
>     ("net/mlx5: FPGA, return -EINVAL if size is zero")
> 
> kernels >= v4.14.y
>     ("Revert "mlx5: move affinity hints assignments to generic code"")

Queued up.

Thanks.


* Re: [net 02/14] Revert "mlx5: move affinity hints assignments to generic code"
  2017-12-19 22:24 ` [net 02/14] Revert "mlx5: move affinity hints assignments to generic code" Saeed Mahameed
  2017-12-19 22:42   ` Jes Sorensen
@ 2017-12-25 13:53   ` Sagi Grimberg
  2017-12-26 15:58     ` Saeed Mahameed
  1 sibling, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2017-12-25 13:53 UTC (permalink / raw)
  To: Saeed Mahameed, David S. Miller; +Cc: netdev, Thomas Gleixner, Jes Sorensen


> Before the offending commit, mlx5 core did the IRQ affinity itself,
> and it seems that the new generic code has some drawbacks. One of
> them is that the user can no longer modify IRQ affinity after the
> initial affinity values are assigned.
> 
> The issue is still being discussed and a solution in the new generic code
> is required, until then we need to revert this patch.
> 
> This fixes the following issue:
> echo <new affinity> > /proc/irq/<x>/smp_affinity
> fails with  -EIO
> 
> This reverts commit a435393acafbf0ecff4deb3e3cb554b34f0d0664.
> Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
> it is used in mlx5_ib driver.

This won't work for sure because the msi_desc affinity cpumask won't
ever be populated. You need to re-implement it in mlx5 if you don't want
to break rdma ULPs that rely on it.


* Re: [net 02/14] Revert "mlx5: move affinity hints assignments to generic code"
  2017-12-25 13:53   ` Sagi Grimberg
@ 2017-12-26 15:58     ` Saeed Mahameed
  2017-12-26 17:14       ` Sagi Grimberg
  0 siblings, 1 reply; 23+ messages in thread
From: Saeed Mahameed @ 2017-12-26 15:58 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Saeed Mahameed, David S. Miller, Linux Netdev List,
	Thomas Gleixner, Jes Sorensen

On Mon, Dec 25, 2017 at 5:53 AM, Sagi Grimberg <sagi@grimberg.me> wrote:
>
>> Before the offending commit, mlx5 core did the IRQ affinity itself,
>> and it seems that the new generic code has some drawbacks. One of
>> them is that the user can no longer modify IRQ affinity after the
>> initial affinity values are assigned.
>>
>> The issue is still being discussed and a solution in the new generic code
>> is required, until then we need to revert this patch.
>>
>> This fixes the following issue:
>> echo <new affinity> > /proc/irq/<x>/smp_affinity
>> fails with  -EIO
>>
>> This reverts commit a435393acafbf0ecff4deb3e3cb554b34f0d0664.
>> Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
>> it is used in mlx5_ib driver.
>
> This won't work for sure because the msi_desc affinity cpumask won't
> ever be populated. You need to re-implement it in mlx5 if you don't want
> to break rdma ULPs that rely on it.

Are you sure it won't get populated at all, even if you manually set
IRQ affinity via sysfs?
Anyway we can implement this driver helper function to return the IRQ
affinity hint stored in the driver:
 "cpumask_first(mdev->priv.irq_info[vector].mask);"


* Re: [net 02/14] Revert "mlx5: move affinity hints assignments to generic code"
  2017-12-26 15:58     ` Saeed Mahameed
@ 2017-12-26 17:14       ` Sagi Grimberg
  2018-01-04 20:19         ` Saeed Mahameed
  0 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2017-12-26 17:14 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Saeed Mahameed, David S. Miller, Linux Netdev List,
	Thomas Gleixner, Jes Sorensen


> Are you sure it won't get populated at all, even if you manually set
> IRQ affinity via sysfs?

Yes, the msi_desc affinity is not initialized without the affinity
descriptor passed (which is what PCI_IRQ_AFFINITY is for).

> Anyway we can implement this driver helper function to return the IRQ
> affinity hint stored in the driver:
>   "cpumask_first(mdev->priv.irq_info[vector].mask);"

minus the cpumask_first, but yea. Please send a new patch so we can
test it out.
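
The "minus the cpumask_first" remark can be illustrated with a small userspace sketch. Everything here is hypothetical: a 64-bit word stands in for a cpumask, and the struct and function names are invented for the example. It only shows why a helper that hands back the first CPU loses information that a mask-returning helper keeps, which matters to callers that expect a whole mask.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a per-vector record. A 64-bit word plays the
 * role of a cpumask: bit n set means CPU n is in the mask.
 */
struct vec_info {
	uint64_t mask;
};

/* Returns only the lowest-numbered CPU in the mask, or -1 if it is empty. */
static int vec_first_cpu(const struct vec_info *v)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (v->mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/* Returns the whole mask, so the caller can see every CPU in it. */
static const uint64_t *vec_affinity(const struct vec_info *v)
{
	return &v->mask;
}
```

With CPUs 2 and 5 set, vec_first_cpu() reports 2 only, while vec_affinity() exposes both bits.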


* Re: [net 02/14] Revert "mlx5: move affinity hints assignments to generic code"
  2017-12-26 17:14       ` Sagi Grimberg
@ 2018-01-04 20:19         ` Saeed Mahameed
  0 siblings, 0 replies; 23+ messages in thread
From: Saeed Mahameed @ 2018-01-04 20:19 UTC (permalink / raw)
  To: Sagi Grimberg, Saeed Mahameed
  Cc: David S. Miller, Linux Netdev List, Thomas Gleixner, Jes Sorensen



On 12/26/2017 9:14 AM, Sagi Grimberg wrote:
> 
>> Are you sure it won't get populated at all, even if you manually set
>> IRQ affinity via sysfs?
> 
> Yes, the msi_desc affinity is not initialized without the affinity
> descriptor passed (which is what PCI_IRQ_AFFINITY is for).
> 
>> Anyway we can implement this driver helper function to return the IRQ
>> affinity hint stored in the driver:
>>   "cpumask_first(mdev->priv.irq_info[vector].mask);"
> 
> minus the cpumask_first, but yea. Please send a new patch so we can
> test it out.

Actually, using mdev->priv.irq_info[vector].mask is wrong, since it only
gives the initial hint and not the current actual affinity mask.

I found a better way to address this and return the actual dynamic 
affinity of an interrupt vector.

I will send a patch to net soon.
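
As a side note for anyone testing this from userspace: the current affinity, as opposed to the driver's initial hint, is what /proc/irq/<x>/smp_affinity reports. Below is a minimal sketch of parsing that file's contents, assuming the usual format of comma-separated hexadecimal groups (for example "00000000,0000000f"). The function name is hypothetical, and the sketch only supports masks that fit in 64 bits.

```c
#include <stdint.h>

/*
 * Parse a string in the assumed smp_affinity format (hex digits, with
 * optional commas between groups, optionally ending in a newline) into a
 * 64-bit mask. Returns 0 on success, or -1 on an invalid character, an
 * empty string, or a mask wider than 64 bits.
 */
static int parse_irq_affinity(const char *s, uint64_t *mask)
{
	uint64_t val = 0;
	int ndigits = 0;

	for (; *s && *s != '\n'; s++) {
		unsigned int d;

		if (*s == ',')
			continue; /* group separator carries no value */
		if (*s >= '0' && *s <= '9')
			d = (unsigned int)(*s - '0');
		else if (*s >= 'a' && *s <= 'f')
			d = (unsigned int)(*s - 'a') + 10;
		else if (*s >= 'A' && *s <= 'F')
			d = (unsigned int)(*s - 'A') + 10;
		else
			return -1;

		if (val >> 60)
			return -1; /* another digit would not fit in 64 bits */
		val = (val << 4) | d;
		ndigits++;
	}

	if (!ndigits)
		return -1;
	*mask = val;
	return 0;
}
```

Reading the file before and after writing a new value to it, and parsing both with a helper like this, is one way to check whether a change took effect.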


end of thread, other threads:[~2018-01-04 20:19 UTC | newest]

Thread overview: 23+ messages
2017-12-19 22:24 [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 Saeed Mahameed
2017-12-19 22:24 ` [net 01/14] net/mlx5: FPGA, return -EINVAL if size is zero Saeed Mahameed
2017-12-19 22:24 ` [net 02/14] Revert "mlx5: move affinity hints assignments to generic code" Saeed Mahameed
2017-12-19 22:42   ` Jes Sorensen
2017-12-25 13:53   ` Sagi Grimberg
2017-12-26 15:58     ` Saeed Mahameed
2017-12-26 17:14       ` Sagi Grimberg
2018-01-04 20:19         ` Saeed Mahameed
2017-12-19 22:24 ` [net 03/14] net/mlx5: Fix rate limit packet pacing naming and struct Saeed Mahameed
2017-12-19 22:24 ` [net 04/14] net/mlx5e: Fix ETS BW check Saeed Mahameed
2017-12-20  7:42   ` Or Gerlitz
2017-12-19 22:24 ` [net 05/14] net/mlx5e: Fix features check of IPv6 traffic Saeed Mahameed
2017-12-20  7:45   ` Or Gerlitz
2017-12-19 22:24 ` [net 06/14] net/mlx5e: Fix defaulting RX ring size when not needed Saeed Mahameed
2017-12-19 22:24 ` [net 07/14] net/mlx5: Fix misspelling in the error message and comment Saeed Mahameed
2017-12-19 22:24 ` [net 08/14] net/mlx5: Fix error flow in CREATE_QP command Saeed Mahameed
2017-12-19 22:24 ` [net 09/14] net/mlx5e: Fix possible deadlock of VXLAN lock Saeed Mahameed
2017-12-19 22:24 ` [net 10/14] net/mlx5e: Add refcount to VXLAN structure Saeed Mahameed
2017-12-19 22:24 ` [net 11/14] net/mlx5e: Prevent possible races in VXLAN control flow Saeed Mahameed
2017-12-19 22:24 ` [net 12/14] net/mlx5: Fix steering memory leak Saeed Mahameed
2017-12-19 22:24 ` [net 13/14] net/mlx5: Cleanup IRQs in case of unload failure Saeed Mahameed
2017-12-19 22:24 ` [net 14/14] net/mlx5: Stay in polling mode when command EQ destroy fails Saeed Mahameed
2017-12-20 18:42 ` [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19 David Miller
