All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25
@ 2016-10-25 15:36 Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 01/12] {net, ib}/mlx5: Make cache line size determination at runtime Saeed Mahameed
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Leon Romanovsky, Saeed Mahameed

Hi Dave,

This series contains some bug fixes for the mlx5 core and mlx5e driver.

>From Daniel:
    - Cache line size determination at runtime, instead of using 
      L1_CACHE_BYTES hard coded value, use cache_line_size()
    - Always Query HCA caps after setting them even on reset flow

>From Mohamad:
    - Reorder netdev cleanup to uregister netdev before detaching it
      for the kernel to not complain about open resources such as vlans
    - Change the acl enable prototype to return status, for better error
      resiliency
    - Clear health sick bit when starting health poll after reset flow
    - Fix race between PCI error handlers and health work
    - PCI error recovery health care simulation, in case when the kernel 
      PCI error handlers are not triggered for some internal firmware errors

>From Noa:
    - Avoid passing dma address 0 to firmware when mapping system pages
      to the firmware

>From Paul: Some straight forward flow steering fixes
    - Keep autogroups list ordered
    - Fix autogroups groups num not decreasing
    - Correctly initialize last use of flow counters

>From Saeed:
    - Choose the nearest LRO timeout to the wanted one
      instead of blindly choosing "dev_cap.lro_timeout[2]"

This series has no conflict with the for-next pull request posted
earlier today ("Mellanox mlx5 core driver updates 2016-10-25").

Thanks,
Saeed.

Daniel Jurgens (2):
  {net, ib}/mlx5: Make cache line size determination at runtime.
  net/mlx5: Always Query HCA caps after setting them

Mohamad Haj Yahia (5):
  net/mlx5e: Unregister netdev before detaching it
  net/mlx5: Change the acl enable prototype to return status
  net/mlx5: Clear health sick bit when starting health poll
  net/mlx5: Fix race between PCI error handlers and health work
  net/mlx5: PCI error recovery health care simulation

Noa Osherovich (1):
  net/mlx5: Avoid passing dma address 0 to firmware

Paul Blakey (3):
  net/mlx5: Keep autogroups list ordered
  net/mlx5: Fix autogroups groups num not decreasing
  net/mlx5: Correctly initialize last use of flow counters

Saeed Mahameed (1):
  net/mlx5e: Choose best nearest LRO timeout

 drivers/infiniband/hw/mlx5/main.c                  |  2 +-
 drivers/infiniband/hw/mlx5/qp.c                    |  1 -
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c    | 31 +++++++--
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  5 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 21 ++++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  | 50 +++++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |  9 ++-
 .../net/ethernet/mellanox/mlx5/core/fs_counters.c  |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/health.c   | 76 ++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c     | 39 +++++++----
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |  1 +
 .../net/ethernet/mellanox/mlx5/core/pagealloc.c    | 26 +++++---
 include/linux/mlx5/driver.h                        | 16 ++---
 14 files changed, 210 insertions(+), 69 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net 01/12] {net, ib}/mlx5: Make cache line size determination at runtime.
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 02/12] net/mlx5: Always Query HCA caps after setting them Saeed Mahameed
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Daniel Jurgens, Saeed Mahameed

From: Daniel Jurgens <danielj@mellanox.com>

ARM 64B cache line systems have L1_CACHE_BYTES set to 128.
cache_line_size() will return the correct size.

Fixes: cf50b5efa2fe('net/mlx5_core/ib: New device capabilities
handling.')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/infiniband/hw/mlx5/main.c               |  2 +-
 drivers/infiniband/hw/mlx5/qp.c                 |  1 -
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 31 +++++++++++++++++++++----
 include/linux/mlx5/driver.h                     | 11 ---------
 4 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 2217477..63036c7 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1019,7 +1019,7 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	resp.qp_tab_size = 1 << MLX5_CAP_GEN(dev->mdev, log_max_qp);
 	if (mlx5_core_is_pf(dev->mdev) && MLX5_CAP_GEN(dev->mdev, bf))
 		resp.bf_reg_size = 1 << MLX5_CAP_GEN(dev->mdev, log_bf_reg_size);
-	resp.cache_line_size = L1_CACHE_BYTES;
+	resp.cache_line_size = cache_line_size();
 	resp.max_sq_desc_sz = MLX5_CAP_GEN(dev->mdev, max_wqe_sz_sq);
 	resp.max_rq_desc_sz = MLX5_CAP_GEN(dev->mdev, max_wqe_sz_rq);
 	resp.max_send_wqebb = 1 << MLX5_CAP_GEN(dev->mdev, log_max_qp_sz);
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 41f4c2a..7ce97da 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -52,7 +52,6 @@ enum {
 
 enum {
 	MLX5_IB_SQ_STRIDE	= 6,
-	MLX5_IB_CACHE_LINE_SIZE	= 64,
 };
 
 static const u32 mlx5_ib_opcode[] = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index 6cb3830..2c6e3c7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -41,6 +41,13 @@
 
 #include "mlx5_core.h"
 
+struct mlx5_db_pgdir {
+	struct list_head	list;
+	unsigned long	       *bitmap;
+	__be32		       *db_page;
+	dma_addr_t		db_dma;
+};
+
 /* Handling for queue buffers -- we allocate a bunch of memory and
  * register it in a memory region at HCA virtual address 0.
  */
@@ -102,17 +109,28 @@ EXPORT_SYMBOL_GPL(mlx5_buf_free);
 static struct mlx5_db_pgdir *mlx5_alloc_db_pgdir(struct mlx5_core_dev *dev,
 						 int node)
 {
+	u32 db_per_page = PAGE_SIZE / cache_line_size();
 	struct mlx5_db_pgdir *pgdir;
 
 	pgdir = kzalloc(sizeof(*pgdir), GFP_KERNEL);
 	if (!pgdir)
 		return NULL;
 
-	bitmap_fill(pgdir->bitmap, MLX5_DB_PER_PAGE);
+	pgdir->bitmap = kcalloc(BITS_TO_LONGS(db_per_page),
+				sizeof(unsigned long),
+				GFP_KERNEL);
+
+	if (!pgdir->bitmap) {
+		kfree(pgdir);
+		return NULL;
+	}
+
+	bitmap_fill(pgdir->bitmap, db_per_page);
 
 	pgdir->db_page = mlx5_dma_zalloc_coherent_node(dev, PAGE_SIZE,
 						       &pgdir->db_dma, node);
 	if (!pgdir->db_page) {
+		kfree(pgdir->bitmap);
 		kfree(pgdir);
 		return NULL;
 	}
@@ -123,18 +141,19 @@ static struct mlx5_db_pgdir *mlx5_alloc_db_pgdir(struct mlx5_core_dev *dev,
 static int mlx5_alloc_db_from_pgdir(struct mlx5_db_pgdir *pgdir,
 				    struct mlx5_db *db)
 {
+	u32 db_per_page = PAGE_SIZE / cache_line_size();
 	int offset;
 	int i;
 
-	i = find_first_bit(pgdir->bitmap, MLX5_DB_PER_PAGE);
-	if (i >= MLX5_DB_PER_PAGE)
+	i = find_first_bit(pgdir->bitmap, db_per_page);
+	if (i >= db_per_page)
 		return -ENOMEM;
 
 	__clear_bit(i, pgdir->bitmap);
 
 	db->u.pgdir = pgdir;
 	db->index   = i;
-	offset = db->index * L1_CACHE_BYTES;
+	offset = db->index * cache_line_size();
 	db->db      = pgdir->db_page + offset / sizeof(*pgdir->db_page);
 	db->dma     = pgdir->db_dma  + offset;
 
@@ -181,14 +200,16 @@ EXPORT_SYMBOL_GPL(mlx5_db_alloc);
 
 void mlx5_db_free(struct mlx5_core_dev *dev, struct mlx5_db *db)
 {
+	u32 db_per_page = PAGE_SIZE / cache_line_size();
 	mutex_lock(&dev->priv.pgdir_mutex);
 
 	__set_bit(db->index, db->u.pgdir->bitmap);
 
-	if (bitmap_full(db->u.pgdir->bitmap, MLX5_DB_PER_PAGE)) {
+	if (bitmap_full(db->u.pgdir->bitmap, db_per_page)) {
 		dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
 				  db->u.pgdir->db_page, db->u.pgdir->db_dma);
 		list_del(&db->u.pgdir->list);
+		kfree(db->u.pgdir->bitmap);
 		kfree(db->u.pgdir);
 	}
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 85c4786..5dbda60 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -626,10 +626,6 @@ struct mlx5_db {
 };
 
 enum {
-	MLX5_DB_PER_PAGE = PAGE_SIZE / L1_CACHE_BYTES,
-};
-
-enum {
 	MLX5_COMP_EQ_SIZE = 1024,
 };
 
@@ -638,13 +634,6 @@ enum {
 	MLX5_PTYS_EN = 1 << 2,
 };
 
-struct mlx5_db_pgdir {
-	struct list_head	list;
-	DECLARE_BITMAP(bitmap, MLX5_DB_PER_PAGE);
-	__be32		       *db_page;
-	dma_addr_t		db_dma;
-};
-
 typedef void (*mlx5_cmd_cbk_t)(int status, void *context);
 
 struct mlx5_cmd_work_ent {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 02/12] net/mlx5: Always Query HCA caps after setting them
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 01/12] {net, ib}/mlx5: Make cache line size determination at runtime Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 03/12] net/mlx5: Keep autogroups list ordered Saeed Mahameed
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Daniel Jurgens, Saeed Mahameed

From: Daniel Jurgens <danielj@mellanox.com>

Always query the HCA caps after setting them to update the capablities
data structures. Not doing so results in incorrect capabilities being
reported including max_dc, max_qp and several others.

Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware
and software flows")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index d9c3c70..8a63910 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -844,12 +844,6 @@ static int mlx5_init_once(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
 	struct pci_dev *pdev = dev->pdev;
 	int err;
 
-	err = mlx5_query_hca_caps(dev);
-	if (err) {
-		dev_err(&pdev->dev, "query hca failed\n");
-		goto out;
-	}
-
 	err = mlx5_query_board_id(dev);
 	if (err) {
 		dev_err(&pdev->dev, "query board id failed\n");
@@ -1023,6 +1017,12 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 
 	mlx5_start_health_poll(dev);
 
+	err = mlx5_query_hca_caps(dev);
+	if (err) {
+		dev_err(&pdev->dev, "query hca failed\n");
+		goto err_stop_poll;
+	}
+
 	if (boot && mlx5_init_once(dev, priv)) {
 		dev_err(&pdev->dev, "sw objs init failed\n");
 		goto err_stop_poll;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 03/12] net/mlx5: Keep autogroups list ordered
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 01/12] {net, ib}/mlx5: Make cache line size determination at runtime Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 02/12] net/mlx5: Always Query HCA caps after setting them Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 04/12] net/mlx5: Fix autogroups groups num not decreasing Saeed Mahameed
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Paul Blakey, Saeed Mahameed

From: Paul Blakey <paulb@mellanox.com>

Finding a new autogroup range is done by going over a group list
sorted by each group start index. The search is stopped after finding
the first free range. Adding the newly created group to the list is
wrongly added to the end of the list regardless of its start index as
the parameter of where to insert it is ignored.

This commit makes sure to use that unused parameter to insert
it where requested.

Fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped flow table')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 5da2cc8..17559c8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -879,7 +879,7 @@ static struct mlx5_flow_group *create_flow_group_common(struct mlx5_flow_table *
 	tree_init_node(&fg->node, !is_auto_fg, del_flow_group);
 	tree_add_node(&fg->node, &ft->node);
 	/* Add node to group list */
-	list_add(&fg->node.list, ft->node.children.prev);
+	list_add(&fg->node.list, prev_fg);
 
 	return fg;
 }
@@ -893,7 +893,7 @@ struct mlx5_flow_group *mlx5_create_flow_group(struct mlx5_flow_table *ft,
 		return ERR_PTR(-EPERM);
 
 	lock_ref_node(&ft->node);
-	fg = create_flow_group_common(ft, fg_in, &ft->node.children, false);
+	fg = create_flow_group_common(ft, fg_in, ft->node.children.prev, false);
 	unlock_ref_node(&ft->node);
 
 	return fg;
@@ -1012,7 +1012,7 @@ static struct mlx5_flow_group *create_autogroup(struct mlx5_flow_table *ft,
 						u32 *match_criteria)
 {
 	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
-	struct list_head *prev = &ft->node.children;
+	struct list_head *prev = ft->node.children.prev;
 	unsigned int candidate_index = 0;
 	struct mlx5_flow_group *fg;
 	void *match_criteria_addr;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 04/12] net/mlx5: Fix autogroups groups num not decreasing
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 03/12] net/mlx5: Keep autogroups list ordered Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 05/12] net/mlx5: Correctly initialize last use of flow counters Saeed Mahameed
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Paul Blakey, Saeed Mahameed

From: Paul Blakey <paulb@mellanox.com>

Autogroups groups num is increased when creating a new flow group,
but is never decreased.

Now decreasing it when deleting a flow group.

Fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped flow table')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 17559c8..8969604 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -436,6 +436,9 @@ static void del_flow_group(struct fs_node *node)
 	fs_get_obj(ft, fg->node.parent);
 	dev = get_dev(&ft->node);
 
+	if (ft->autogroup.active)
+		ft->autogroup.num_groups--;
+
 	if (mlx5_cmd_destroy_flow_group(dev, ft, fg->id))
 		mlx5_core_warn(dev, "flow steering can't destroy fg %d of ft %d\n",
 			       fg->id, ft->id);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 05/12] net/mlx5: Correctly initialize last use of flow counters
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 04/12] net/mlx5: Fix autogroups groups num not decreasing Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 06/12] net/mlx5e: Choose best nearest LRO timeout Saeed Mahameed
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Paul Blakey, Saeed Mahameed

From: Paul Blakey <paulb@mellanox.com>

Currently, last use timestamp is initialized to zero.
This is not the expected value by higher layers such as
when we do TC action offloading. To fix that, set it to
the current time, e.g when the counter/rule is offloaded.
This is the same behaviour of non-offloaded TC actions.

Fixes: 43a335e055bb ('mlx5_core: Flow counters infrastructure')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index 3a9195b..3b026c1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -218,6 +218,7 @@ struct mlx5_fc *mlx5_fc_create(struct mlx5_core_dev *dev, bool aging)
 		goto err_out;
 
 	if (aging) {
+		counter->cache.lastuse = jiffies;
 		counter->aging = true;
 
 		spin_lock(&fc_stats->addlist_lock);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 06/12] net/mlx5e: Choose best nearest LRO timeout
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 05/12] net/mlx5: Correctly initialize last use of flow counters Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 07/12] net/mlx5e: Unregister netdev before detaching it Saeed Mahameed
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Saeed Mahameed, Mohamad Haj Yahia

Instead of predicting the index of the wanted LRO timeout value from
hardware capabilities, look for the nearest LRO timeout value.

Fixes: 5c50368f3831 ('net/mlx5e: Light-weight netdev open/stop')
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  5 +++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++++++++++++++++---
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 460363b..7a43502 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -85,6 +85,9 @@
 #define MLX5_MPWRQ_SMALL_PACKET_THRESHOLD	(128)
 
 #define MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ                 (64 * 1024)
+#define MLX5E_DEFAULT_LRO_TIMEOUT                       32
+#define MLX5E_LRO_TIMEOUT_ARR_SIZE                      4
+
 #define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC      0x10
 #define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC_FROM_CQE 0x3
 #define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_PKTS      0x20
@@ -221,6 +224,7 @@ struct mlx5e_params {
 	struct ieee_ets ets;
 #endif
 	bool rx_am_enabled;
+	u32 lro_timeout;
 };
 
 struct mlx5e_tstamp {
@@ -888,5 +892,6 @@ int mlx5e_attach_netdev(struct mlx5_core_dev *mdev, struct net_device *netdev);
 void mlx5e_detach_netdev(struct mlx5_core_dev *mdev, struct net_device *netdev);
 struct rtnl_link_stats64 *
 mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats);
+u32 mlx5e_choose_lro_timeout(struct mlx5_core_dev *mdev, u32 wanted_timeout);
 
 #endif /* __MLX5_EN_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 7eaf380..5c8ab3e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1971,9 +1971,7 @@ static void mlx5e_build_tir_ctx_lro(void *tirc, struct mlx5e_priv *priv)
 	MLX5_SET(tirc, tirc, lro_max_ip_payload_size,
 		 (priv->params.lro_wqe_sz -
 		  ROUGH_MAX_L2_L3_HDR_SZ) >> 8);
-	MLX5_SET(tirc, tirc, lro_timeout_period_usecs,
-		 MLX5_CAP_ETH(priv->mdev,
-			      lro_timer_supported_periods[2]));
+	MLX5_SET(tirc, tirc, lro_timeout_period_usecs, priv->params.lro_timeout);
 }
 
 void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv)
@@ -3401,6 +3399,18 @@ static void mlx5e_query_min_inline(struct mlx5_core_dev *mdev,
 	}
 }
 
+u32 mlx5e_choose_lro_timeout(struct mlx5_core_dev *mdev, u32 wanted_timeout)
+{
+	int i;
+
+	/* The supported periods are organized in ascending order */
+	for (i = 0; i < MLX5E_LRO_TIMEOUT_ARR_SIZE - 1; i++)
+		if (MLX5_CAP_ETH(mdev, lro_timer_supported_periods[i]) >= wanted_timeout)
+			break;
+
+	return MLX5_CAP_ETH(mdev, lro_timer_supported_periods[i]);
+}
+
 static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 					struct net_device *netdev,
 					const struct mlx5e_profile *profile,
@@ -3419,6 +3429,9 @@ static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 	priv->profile                      = profile;
 	priv->ppriv                        = ppriv;
 
+	priv->params.lro_timeout =
+		mlx5e_choose_lro_timeout(mdev, MLX5E_DEFAULT_LRO_TIMEOUT);
+
 	priv->params.log_sq_size = MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE;
 
 	/* set CQE compression */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 07/12] net/mlx5e: Unregister netdev before detaching it
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 06/12] net/mlx5e: Choose best nearest LRO timeout Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 08/12] net/mlx5: Change the acl enable prototype to return status Saeed Mahameed
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Mohamad Haj Yahia, Saeed Mahameed

From: Mohamad Haj Yahia <mohamad@mellanox.com>

Detaching the netdev before unregistering it cause some netdev cleanup
ndos to fail because they check presence of the netdev, so we need to
unregister the netdev first.

Fixes: 26e59d8077a3 ('net/mlx5e: Implement mlx5e interface attach/detach callbacks')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c  | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 5c8ab3e..f4c687c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4048,7 +4048,6 @@ void mlx5e_destroy_netdev(struct mlx5_core_dev *mdev, struct mlx5e_priv *priv)
 	const struct mlx5e_profile *profile = priv->profile;
 	struct net_device *netdev = priv->netdev;
 
-	unregister_netdev(netdev);
 	destroy_workqueue(priv->wq);
 	if (profile->cleanup)
 		profile->cleanup(priv);
@@ -4065,6 +4064,7 @@ static void mlx5e_remove(struct mlx5_core_dev *mdev, void *vpriv)
 	for (vport = 1; vport < total_vfs; vport++)
 		mlx5_eswitch_unregister_vport_rep(esw, vport);
 
+	unregister_netdev(priv->netdev);
 	mlx5e_detach(mdev, vpriv);
 	mlx5e_destroy_netdev(mdev, priv);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 3c97da1..7fe6559 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -457,6 +457,7 @@ void mlx5e_vport_rep_unload(struct mlx5_eswitch *esw,
 	struct mlx5e_priv *priv = rep->priv_data;
 	struct net_device *netdev = priv->netdev;
 
+	unregister_netdev(netdev);
 	mlx5e_detach_netdev(esw->dev, netdev);
 	mlx5e_destroy_netdev(esw->dev, priv);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 08/12] net/mlx5: Change the acl enable prototype to return status
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 07/12] net/mlx5e: Unregister netdev before detaching it Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 09/12] net/mlx5: Clear health sick bit when starting health poll Saeed Mahameed
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Mohamad Haj Yahia, Saeed Mahameed

From: Mohamad Haj Yahia <mohamad@mellanox.com>

The Ingress/Egress ACL enable function may fail and it should return
status to its caller to avoid NULL pointer dereference.

Fixes: f942380c1239 ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 50 +++++++++++++++--------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index abbf2c3..be1f733 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -931,8 +931,8 @@ static void esw_vport_change_handler(struct work_struct *work)
 	mutex_unlock(&esw->state_lock);
 }
 
-static void esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
-					struct mlx5_vport *vport)
+static int esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
+				       struct mlx5_vport *vport)
 {
 	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
 	struct mlx5_flow_group *vlan_grp = NULL;
@@ -949,9 +949,11 @@ static void esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
 	int table_size = 2;
 	int err = 0;
 
-	if (!MLX5_CAP_ESW_EGRESS_ACL(dev, ft_support) ||
-	    !IS_ERR_OR_NULL(vport->egress.acl))
-		return;
+	if (!MLX5_CAP_ESW_EGRESS_ACL(dev, ft_support))
+		return -EOPNOTSUPP;
+
+	if (!IS_ERR_OR_NULL(vport->egress.acl))
+		return 0;
 
 	esw_debug(dev, "Create vport[%d] egress ACL log_max_size(%d)\n",
 		  vport->vport, MLX5_CAP_ESW_EGRESS_ACL(dev, log_max_ft_size));
@@ -959,12 +961,12 @@ static void esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
 	root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_EGRESS);
 	if (!root_ns) {
 		esw_warn(dev, "Failed to get E-Switch egress flow namespace\n");
-		return;
+		return -EIO;
 	}
 
 	flow_group_in = mlx5_vzalloc(inlen);
 	if (!flow_group_in)
-		return;
+		return -ENOMEM;
 
 	acl = mlx5_create_vport_flow_table(root_ns, 0, table_size, 0, vport->vport);
 	if (IS_ERR(acl)) {
@@ -1009,6 +1011,7 @@ static void esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
 		mlx5_destroy_flow_group(vlan_grp);
 	if (err && !IS_ERR_OR_NULL(acl))
 		mlx5_destroy_flow_table(acl);
+	return err;
 }
 
 static void esw_vport_cleanup_egress_rules(struct mlx5_eswitch *esw,
@@ -1041,8 +1044,8 @@ static void esw_vport_disable_egress_acl(struct mlx5_eswitch *esw,
 	vport->egress.acl = NULL;
 }
 
-static void esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
-					 struct mlx5_vport *vport)
+static int esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
+					struct mlx5_vport *vport)
 {
 	int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
 	struct mlx5_core_dev *dev = esw->dev;
@@ -1063,9 +1066,11 @@ static void esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
 	int table_size = 4;
 	int err = 0;
 
-	if (!MLX5_CAP_ESW_INGRESS_ACL(dev, ft_support) ||
-	    !IS_ERR_OR_NULL(vport->ingress.acl))
-		return;
+	if (!MLX5_CAP_ESW_INGRESS_ACL(dev, ft_support))
+		return -EOPNOTSUPP;
+
+	if (!IS_ERR_OR_NULL(vport->ingress.acl))
+		return 0;
 
 	esw_debug(dev, "Create vport[%d] ingress ACL log_max_size(%d)\n",
 		  vport->vport, MLX5_CAP_ESW_INGRESS_ACL(dev, log_max_ft_size));
@@ -1073,12 +1078,12 @@ static void esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
 	root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_INGRESS);
 	if (!root_ns) {
 		esw_warn(dev, "Failed to get E-Switch ingress flow namespace\n");
-		return;
+		return -EIO;
 	}
 
 	flow_group_in = mlx5_vzalloc(inlen);
 	if (!flow_group_in)
-		return;
+		return -ENOMEM;
 
 	acl = mlx5_create_vport_flow_table(root_ns, 0, table_size, 0, vport->vport);
 	if (IS_ERR(acl)) {
@@ -1167,6 +1172,7 @@ static void esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
 	}
 
 	kvfree(flow_group_in);
+	return err;
 }
 
 static void esw_vport_cleanup_ingress_rules(struct mlx5_eswitch *esw,
@@ -1225,7 +1231,13 @@ static int esw_vport_ingress_config(struct mlx5_eswitch *esw,
 		return 0;
 	}
 
-	esw_vport_enable_ingress_acl(esw, vport);
+	err = esw_vport_enable_ingress_acl(esw, vport);
+	if (err) {
+		mlx5_core_warn(esw->dev,
+			       "failed to enable ingress acl (%d) on vport[%d]\n",
+			       err, vport->vport);
+		return err;
+	}
 
 	esw_debug(esw->dev,
 		  "vport[%d] configure ingress rules, vlan(%d) qos(%d)\n",
@@ -1299,7 +1311,13 @@ static int esw_vport_egress_config(struct mlx5_eswitch *esw,
 		return 0;
 	}
 
-	esw_vport_enable_egress_acl(esw, vport);
+	err = esw_vport_enable_egress_acl(esw, vport);
+	if (err) {
+		mlx5_core_warn(esw->dev,
+			       "failed to enable egress acl (%d) on vport[%d]\n",
+			       err, vport->vport);
+		return err;
+	}
 
 	esw_debug(esw->dev,
 		  "vport[%d] configure egress rules, vlan(%d) qos(%d)\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 09/12] net/mlx5: Clear health sick bit when starting health poll
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 08/12] net/mlx5: Change the acl enable prototype to return status Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 10/12] net/mlx5: Fix race between PCI error handlers and health work Saeed Mahameed
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Mohamad Haj Yahia, Saeed Mahameed

From: Mohamad Haj Yahia <mohamad@mellanox.com>

The health sick status should be cleared when we start the health poll.
This is crucial for driver reload (unload + load) in order to behave
right in case of health issue.

Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 1a05fb9..2cb4094 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -281,6 +281,7 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev)
 	struct mlx5_core_health *health = &dev->priv.health;
 
 	init_timer(&health->timer);
+	health->sick = 0;
 	health->health = &dev->iseg->health;
 	health->health_counter = &dev->iseg->health_counter;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 10/12] net/mlx5: Fix race between PCI error handlers and health work
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 09/12] net/mlx5: Clear health sick bit when starting health poll Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 11/12] net/mlx5: PCI error recovery health care simulation Saeed Mahameed
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Mohamad Haj Yahia, Saeed Mahameed

From: Mohamad Haj Yahia <mohamad@mellanox.com>

Currently there is a race between the health care work and the kernel
pci error handlers because both of them detect the error, the first one
to be called will do the error handling.
There is a chance that health care will disable the pci after resuming
pci slot.
Also create a separate WQ because now we will have two types of health
works, one for the error detection and one for the recovery.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/health.c | 30 +++++++++++++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/main.c   |  9 +++++--
 include/linux/mlx5/driver.h                      |  4 ++++
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 2cb4094..2d00022 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -64,6 +64,10 @@ enum {
 	MLX5_NIC_IFC_NO_DRAM_NIC	= 2
 };
 
+enum {
+	MLX5_DROP_NEW_HEALTH_WORK,
+};
+
 static u8 get_nic_interface(struct mlx5_core_dev *dev)
 {
 	return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 3;
@@ -272,7 +276,13 @@ static void poll_health(unsigned long data)
 	if (in_fatal(dev) && !health->sick) {
 		health->sick = true;
 		print_health_info(dev);
-		schedule_work(&health->work);
+		spin_lock(&health->wq_lock);
+		if (!test_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags))
+			queue_work(health->wq, &health->work);
+		else
+			dev_err(&dev->pdev->dev,
+				"new health works are not permitted at this stage\n");
+		spin_unlock(&health->wq_lock);
 	}
 }
 
@@ -282,6 +292,7 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev)
 
 	init_timer(&health->timer);
 	health->sick = 0;
+	clear_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
 	health->health = &dev->iseg->health;
 	health->health_counter = &dev->iseg->health_counter;
 
@@ -298,11 +309,21 @@ void mlx5_stop_health_poll(struct mlx5_core_dev *dev)
 	del_timer_sync(&health->timer);
 }
 
+void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+
+	spin_lock(&health->wq_lock);
+	set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
+	spin_unlock(&health->wq_lock);
+	cancel_work_sync(&health->work);
+}
+
 void mlx5_health_cleanup(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 
-	flush_work(&health->work);
+	destroy_workqueue(health->wq);
 }
 
 int mlx5_health_init(struct mlx5_core_dev *dev)
@@ -317,8 +338,11 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 
 	strcpy(name, "mlx5_health");
 	strcat(name, dev_name(&dev->pdev->dev));
+	health->wq = create_singlethread_workqueue(name);
 	kfree(name);
-
+	if (!health->wq)
+		return -ENOMEM;
+	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->work, health_care);
 
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 8a63910..9f90226 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1315,8 +1315,13 @@ static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
 	dev_info(&pdev->dev, "%s was called\n", __func__);
 	mlx5_enter_error_state(dev);
 	mlx5_unload_one(dev, priv, false);
-	pci_save_state(pdev);
-	mlx5_pci_disable_device(dev);
+	/* In case of kernel call save the pci state and drain health wq */
+	if (state) {
+		pci_save_state(pdev);
+		mlx5_drain_health_wq(dev);
+		mlx5_pci_disable_device(dev);
+	}
+
 	return state == pci_channel_io_perm_failure ?
 		PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 5dbda60..7d9a5d0 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -418,7 +418,10 @@ struct mlx5_core_health {
 	u32				prev;
 	int				miss_counter;
 	bool				sick;
+	/* wq spinlock to synchronize draining */
+	spinlock_t			wq_lock;
 	struct workqueue_struct	       *wq;
+	unsigned long			flags;
 	struct work_struct		work;
 };
 
@@ -778,6 +781,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev);
 int mlx5_health_init(struct mlx5_core_dev *dev);
 void mlx5_start_health_poll(struct mlx5_core_dev *dev);
 void mlx5_stop_health_poll(struct mlx5_core_dev *dev);
+void mlx5_drain_health_wq(struct mlx5_core_dev *dev);
 int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size,
 			struct mlx5_buf *buf, int node);
 int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 11/12] net/mlx5: PCI error recovery health care simulation
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 10/12] net/mlx5: Fix race between PCI error handlers and health work Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-25 15:36 ` [PATCH net 12/12] net/mlx5: Avoid passing dma address 0 to firmware Saeed Mahameed
  2016-10-29 15:56 ` [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 David Miller
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Mohamad Haj Yahia, Saeed Mahameed

From: Mohamad Haj Yahia <mohamad@mellanox.com>

In case that the kernel PCI error handlers are not called, we will
trigger our own recovery flow.

The health work will give priority to the kernel pci error handlers to
recover the PCI by waiting for a small period, if the pci error handlers
are not triggered the manual recovery flow will be executed.

We don't save pci state in case of manual recovery because it will ruin the
pci configuration space and we will lose dma sync.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/health.c   | 45 ++++++++++++++++++++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c     | 18 ++++++---
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |  1 +
 include/linux/mlx5/driver.h                        |  1 +
 4 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 2d00022..5bcf934 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -61,14 +61,15 @@ enum {
 enum {
 	MLX5_NIC_IFC_FULL		= 0,
 	MLX5_NIC_IFC_DISABLED		= 1,
-	MLX5_NIC_IFC_NO_DRAM_NIC	= 2
+	MLX5_NIC_IFC_NO_DRAM_NIC	= 2,
+	MLX5_NIC_IFC_INVALID		= 3
 };
 
 enum {
 	MLX5_DROP_NEW_HEALTH_WORK,
 };
 
-static u8 get_nic_interface(struct mlx5_core_dev *dev)
+static u8 get_nic_state(struct mlx5_core_dev *dev)
 {
 	return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 3;
 }
@@ -101,7 +102,7 @@ static int in_fatal(struct mlx5_core_dev *dev)
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct health_buffer __iomem *h = health->health;
 
-	if (get_nic_interface(dev) == MLX5_NIC_IFC_DISABLED)
+	if (get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
 		return 1;
 
 	if (ioread32be(&h->fw_ver) == 0xffffffff)
@@ -131,7 +132,7 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev)
 
 static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
 {
-	u8 nic_interface = get_nic_interface(dev);
+	u8 nic_interface = get_nic_state(dev);
 
 	switch (nic_interface) {
 	case MLX5_NIC_IFC_FULL:
@@ -153,8 +154,34 @@ static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
 	mlx5_disable_device(dev);
 }
 
+static void health_recover(struct work_struct *work)
+{
+	struct mlx5_core_health *health;
+	struct delayed_work *dwork;
+	struct mlx5_core_dev *dev;
+	struct mlx5_priv *priv;
+	u8 nic_state;
+
+	dwork = container_of(work, struct delayed_work, work);
+	health = container_of(dwork, struct mlx5_core_health, recover_work);
+	priv = container_of(health, struct mlx5_priv, health);
+	dev = container_of(priv, struct mlx5_core_dev, priv);
+
+	nic_state = get_nic_state(dev);
+	if (nic_state == MLX5_NIC_IFC_INVALID) {
+		dev_err(&dev->pdev->dev, "health recovery flow aborted since the nic state is invalid\n");
+		return;
+	}
+
+	dev_err(&dev->pdev->dev, "starting health recovery flow\n");
+	mlx5_recover_device(dev);
+}
+
+/* How much time to wait until health resetting the driver (in msecs) */
+#define MLX5_RECOVERY_DELAY_MSECS 60000
 static void health_care(struct work_struct *work)
 {
+	unsigned long recover_delay = msecs_to_jiffies(MLX5_RECOVERY_DELAY_MSECS);
 	struct mlx5_core_health *health;
 	struct mlx5_core_dev *dev;
 	struct mlx5_priv *priv;
@@ -164,6 +191,14 @@ static void health_care(struct work_struct *work)
 	dev = container_of(priv, struct mlx5_core_dev, priv);
 	mlx5_core_warn(dev, "handling bad device here\n");
 	mlx5_handle_bad_state(dev);
+
+	spin_lock(&health->wq_lock);
+	if (!test_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags))
+		schedule_delayed_work(&health->recover_work, recover_delay);
+	else
+		dev_err(&dev->pdev->dev,
+			"new health works are not permitted at this stage\n");
+	spin_unlock(&health->wq_lock);
 }
 
 static const char *hsynd_str(u8 synd)
@@ -316,6 +351,7 @@ void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 	spin_lock(&health->wq_lock);
 	set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
 	spin_unlock(&health->wq_lock);
+	cancel_delayed_work_sync(&health->recover_work);
 	cancel_work_sync(&health->work);
 }
 
@@ -344,6 +380,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 		return -ENOMEM;
 	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->work, health_care);
+	INIT_DELAYED_WORK(&health->recover_work, health_recover);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 9f90226..d5433c4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1313,6 +1313,7 @@ static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
 	struct mlx5_priv *priv = &dev->priv;
 
 	dev_info(&pdev->dev, "%s was called\n", __func__);
+
 	mlx5_enter_error_state(dev);
 	mlx5_unload_one(dev, priv, false);
 	/* In case of kernel call save the pci state and drain health wq */
@@ -1378,11 +1379,6 @@ static pci_ers_result_t mlx5_pci_slot_reset(struct pci_dev *pdev)
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
-void mlx5_disable_device(struct mlx5_core_dev *dev)
-{
-	mlx5_pci_err_detected(dev->pdev, 0);
-}
-
 static void mlx5_pci_resume(struct pci_dev *pdev)
 {
 	struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
@@ -1432,6 +1428,18 @@ static const struct pci_device_id mlx5_core_pci_table[] = {
 
 MODULE_DEVICE_TABLE(pci, mlx5_core_pci_table);
 
+void mlx5_disable_device(struct mlx5_core_dev *dev)
+{
+	mlx5_pci_err_detected(dev->pdev, 0);
+}
+
+void mlx5_recover_device(struct mlx5_core_dev *dev)
+{
+	mlx5_pci_disable_device(dev);
+	if (mlx5_pci_slot_reset(dev->pdev) == PCI_ERS_RESULT_RECOVERED)
+		mlx5_pci_resume(dev->pdev);
+}
+
 static struct pci_driver mlx5_core_driver = {
 	.name           = DRIVER_NAME,
 	.id_table       = mlx5_core_pci_table,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 3d0cfb9..187662c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -83,6 +83,7 @@ void mlx5_core_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
 		     unsigned long param);
 void mlx5_enter_error_state(struct mlx5_core_dev *dev);
 void mlx5_disable_device(struct mlx5_core_dev *dev);
+void mlx5_recover_device(struct mlx5_core_dev *dev);
 int mlx5_sriov_init(struct mlx5_core_dev *dev);
 void mlx5_sriov_cleanup(struct mlx5_core_dev *dev);
 int mlx5_sriov_attach(struct mlx5_core_dev *dev);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 7d9a5d0..ecc451d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -423,6 +423,7 @@ struct mlx5_core_health {
 	struct workqueue_struct	       *wq;
 	unsigned long			flags;
 	struct work_struct		work;
+	struct delayed_work		recover_work;
 };
 
 struct mlx5_cq_table {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net 12/12] net/mlx5: Avoid passing dma address 0 to firmware
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 11/12] net/mlx5: PCI error recovery health care simulation Saeed Mahameed
@ 2016-10-25 15:36 ` Saeed Mahameed
  2016-10-29 15:56 ` [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 David Miller
  12 siblings, 0 replies; 14+ messages in thread
From: Saeed Mahameed @ 2016-10-25 15:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Leon Romanovsky, Noa Osherovich, Saeed Mahameed

From: Noa Osherovich <noaos@mellanox.com>

Currently the firmware can't work with a page with dma address 0.
Passing such an address to the firmware will cause the give_pages
command to fail.

To avoid this, in case we get a 0 dma address of a page from the
dma engine, we avoid passing it to FW by remapping to get an address
other than 0.

Fixes: bf0bf77f6519 ('mlx5: Support communicating arbitrary host...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/pagealloc.c    | 26 +++++++++++++++-------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index cc4fd61..a57d5a8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
@@ -209,6 +209,7 @@ static void free_4k(struct mlx5_core_dev *dev, u64 addr)
 static int alloc_system_page(struct mlx5_core_dev *dev, u16 func_id)
 {
 	struct page *page;
+	u64 zero_addr = 1;
 	u64 addr;
 	int err;
 	int nid = dev_to_node(&dev->pdev->dev);
@@ -218,26 +219,35 @@ static int alloc_system_page(struct mlx5_core_dev *dev, u16 func_id)
 		mlx5_core_warn(dev, "failed to allocate page\n");
 		return -ENOMEM;
 	}
+map:
 	addr = dma_map_page(&dev->pdev->dev, page, 0,
 			    PAGE_SIZE, DMA_BIDIRECTIONAL);
 	if (dma_mapping_error(&dev->pdev->dev, addr)) {
 		mlx5_core_warn(dev, "failed dma mapping page\n");
 		err = -ENOMEM;
-		goto out_alloc;
+		goto err_mapping;
 	}
+
+	/* Firmware doesn't support page with physical address 0 */
+	if (addr == 0) {
+		zero_addr = addr;
+		goto map;
+	}
+
 	err = insert_page(dev, addr, page, func_id);
 	if (err) {
 		mlx5_core_err(dev, "failed to track allocated page\n");
-		goto out_mapping;
+		dma_unmap_page(&dev->pdev->dev, addr, PAGE_SIZE,
+			       DMA_BIDIRECTIONAL);
 	}
 
-	return 0;
-
-out_mapping:
-	dma_unmap_page(&dev->pdev->dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+err_mapping:
+	if (err)
+		__free_page(page);
 
-out_alloc:
-	__free_page(page);
+	if (zero_addr == 0)
+		dma_unmap_page(&dev->pdev->dev, zero_addr, PAGE_SIZE,
+			       DMA_BIDIRECTIONAL);
 
 	return err;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25
  2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2016-10-25 15:36 ` [PATCH net 12/12] net/mlx5: Avoid passing dma address 0 to firmware Saeed Mahameed
@ 2016-10-29 15:56 ` David Miller
  12 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2016-10-29 15:56 UTC (permalink / raw)
  To: saeedm; +Cc: netdev, ogerlitz, leonro

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Tue, 25 Oct 2016 18:36:23 +0300

> This series contains some bug fixes for the mlx5 core and mlx5e
> driver.

Series applied, thank you.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-10-29 15:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-25 15:36 [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 01/12] {net, ib}/mlx5: Make cache line size determination at runtime Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 02/12] net/mlx5: Always Query HCA caps after setting them Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 03/12] net/mlx5: Keep autogroups list ordered Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 04/12] net/mlx5: Fix autogroups groups num not decreasing Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 05/12] net/mlx5: Correctly initialize last use of flow counters Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 06/12] net/mlx5e: Choose best nearest LRO timeout Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 07/12] net/mlx5e: Unregister netdev before detaching it Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 08/12] net/mlx5: Change the acl enable prototype to return status Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 09/12] net/mlx5: Clear health sick bit when starting health poll Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 10/12] net/mlx5: Fix race between PCI error handlers and health work Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 11/12] net/mlx5: PCI error recovery health care simulation Saeed Mahameed
2016-10-25 15:36 ` [PATCH net 12/12] net/mlx5: Avoid passing dma address 0 to firmware Saeed Mahameed
2016-10-29 15:56 ` [PATCH net 00/12] Mellanox 100G mlx5 fixes 2016-10-25 David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.