netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset
@ 2019-05-05  0:32 Saeed Mahameed
  2019-05-05  0:32 ` [net-next 01/15] net/mlx5: Move all devlink related functions calls to devlink.c Saeed Mahameed
                   ` (14 more replies)
  0 siblings, 15 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jiri Pirko, Saeed Mahameed

Hi Dave,

This series provides the support for mlx5 Firmware devlink health and
sw reset.

We plan to follow up this series with a patch that provides mlx5
documentation under Documentation/networking/mlx5.rst, first thing in
5.3 kernel release, it will include all new mlx5 devlink options and
more.

For more information please see tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.

---
The following changes since commit a734d1f4c2fc962ef4daa179e216df84a8ec5f84:

  net: openvswitch: return an error instead of doing BUG_ON() (2019-05-04 01:36:36 -0400)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-05-04

for you to fetch changes up to 30d8b932dcebbcb8c5d1991cab5325c2e3faad6d:

  net/mlx5: Report devlink health on FW fatal issues (2019-05-04 17:22:45 -0700)

----------------------------------------------------------------
mlx5-updates-2019-05-04

Mlx5 devlink health fw reporters and sw reset support

This series provides mlx5 firmware reset support and firmware devlink health
reporters.

1) Add CR-Space access and FW Crdump snapshot support via devlink region_snapshot

2) Issue software reset upon FW asserts

3) Add fw and fw_fatal devlink heath reporters to follow fw errors indication by
dump and recover procedures and enable trigger these functionality by user.

3.1) fw reporter:
The fw reporter implements diagnose and dump callbacks.
It follows symptoms of fw error such as fw syndrome by triggering
fw core dump and storing it and any other fw trace into the dump buffer.
The fw reporter diagnose command can be triggered any time by the user to check
current fw status.

3.2) fw_fatal repoter:
The fw_fatal reporter implements dump and recover callbacks.
It follows fatal errors indications by CR-space dump and recover flow.
The CR-space dump uses vsc interface which is valid even if the FW command
interface is not functional, which is the case in most FW fatal errors. The
CR-space dump is stored as a memory region snapshot to ease read by address.
The recover function runs recover flow which reloads the driver and triggers fw
reset if needed.

Command examples and output:
diagnose data:
assert_var[0] 0xfc3fc043
assert_var[1] 0x0001b41c
assert_var[2] 0x00000000
assert_var[3] 0x00000000
assert_var[4] 0x00000000
assert_exit_ptr 0x008033b4
assert_callra 0x0080365c
fw_ver 16.24.1000
hw_id 0x0000020d
irisc_index 0
synd 0x8: unrecoverable hardware error
ext_synd 0x003d
raw fw_ver 0x101803e8

dump traces:
   trace: 0000:82:00.1 [0x69cd6c5283e] 0 [0xb8] dump general info GVMI=0x0001
   trace: 0000:82:00.1 [0x69cd6c53bec] 0 [0xb8] GVMI management info, gvmi_management context:
   trace: 0000:82:00.1 [0x69cd6c55eff] 0 [0xb8] [000]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c5657f] 0 [0xb8] [010]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c56608] 0 [0xb8] [020]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c566ff] 0 [0xb8] [030]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c5677f] 0 [0xb8] [040]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c5687f] 0 [0xb8] [050]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c568ff] 0 [0xb8] [060]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c569a5] 0 [0xb8] [070]:  00000000  00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c57021] 0 [0xb8] CMDIF dbase from IRON: active_dbase_slots = 0x00000000
   trace: 0000:82:00.1 [0x69cd6c58dae] 0 [0xb8] GVMI=0x0001 hw_toc context:
   trace: 0000:82:00.1 [0x69cd6c58e7f] 0 [0xb8] [000]:  00400100  00000000  00000000  fffff000
   trace: 0000:82:00.1 [0x69cd6c58f7f] 0 [0xb8] [010]:  00000000  00000000  00000000  00000000
...
...

devlink_region_name: cr-space snapshot_id: 1

00000000000f0018 e1 03 00 00 fb ae a9 3f

0000000000000000 00 20 00 01 00 00 00 00 03 00 00 00 00 00 00 00
0000000000000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
0000000000000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000060 00 00 00 00 00 00 00 00 00 00 00 00 de 0a 00 00
0000000000000070 0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa 00
0000000000000090 b6 0b 00 00 00 00 00 00 80 c7 fe ff 50 0a 00 00
...
...

----------------------------------------------------------------
Alex Vesker (3):
      net/mlx5: Add Vendor Specific Capability access gateway
      net/mlx5: Add Crdump FW snapshot support
      net/mlx5: Add support for devlink region_snapshot parameter

Eran Ben Elisha (1):
      net/mlx5: Move all devlink related functions calls to devlink.c

Feras Daoud (3):
      net/mlx5: Handle SW reset of FW in error flow
      net/mlx5: Control CR-space access by different PFs
      net/mlx5: Issue SW reset on FW assert

Moshe Shemesh (8):
      net/mlx5: Refactor print health info
      net/mlx5: Create FW devlink health reporter
      net/mlx5: Add core dump register access functions
      net/mlx5: Add support for FW reporter dump
      net/mlx5: Report devlink health on FW issues
      net/mlx5: Add fw fatal devlink health reporter
      net/mlx5: Add support for FW fatal reporter dump
      net/mlx5: Report devlink health on FW fatal issues

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c  |  72 +++
 drivers/net/ethernet/mellanox/mlx5/core/devlink.h  |  12 +
 .../net/ethernet/mellanox/mlx5/core/diag/crdump.c  | 210 ++++++++
 .../ethernet/mellanox/mlx5/core/diag/fw_tracer.c   | 143 +++++
 .../ethernet/mellanox/mlx5/core/diag/fw_tracer.h   |  14 +
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/health.c   | 575 +++++++++++++++++----
 drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h |   6 +
 .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c  | 313 +++++++++++
 .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h  |  33 ++
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  19 +-
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |   8 +-
 include/linux/mlx5/device.h                        |  10 +-
 include/linux/mlx5/driver.h                        |  20 +-
 include/linux/mlx5/mlx5_ifc.h                      |  17 +-
 16 files changed, 1357 insertions(+), 100 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [net-next 01/15] net/mlx5: Move all devlink related functions calls to devlink.c
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
@ 2019-05-05  0:32 ` Saeed Mahameed
  2019-05-05  0:32 ` [net-next 02/15] net/mlx5: Add Vendor Specific Capability access gateway Saeed Mahameed
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Eran Ben Elisha, Moshe Shemesh, Saeed Mahameed

From: Eran Ben Elisha <eranbe@mellanox.com>

Centralize all devlink related callbacks in one file.
In the downstream patch, some more functionality will be added, this
patch is preparing the driver infrastructure for it.

Currently, move devlink un/register functions calls into this file.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/Makefile  |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 14 ++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/devlink.h | 12 ++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/main.c    |  5 +++--
 4 files changed, 30 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 243368dc23db..03831a1c02fd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -15,7 +15,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		health.o mcg.o cq.o alloc.o qp.o port.o mr.o pd.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
 		fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
-		lib/devcom.o diag/fs_tracepoint.o diag/fw_tracer.o
+		lib/devcom.o diag/fs_tracepoint.o diag/fw_tracer.o devlink.o
 
 #
 # Netdev basic
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
new file mode 100644
index 000000000000..72ff27f57817
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2019 Mellanox Technologies */
+
+#include <devlink.h>
+
+int mlx5_devlink_register(struct devlink *devlink, struct device *dev)
+{
+	return devlink_register(devlink, dev);
+}
+
+void mlx5_devlink_unregister(struct devlink *devlink)
+{
+	devlink_unregister(devlink);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
new file mode 100644
index 000000000000..2242d73e8420
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2019, Mellanox Technologies */
+
+#ifndef __MLX5_DEVLINK_H__
+#define __MLX5_DEVLINK_H__
+
+#include <net/devlink.h>
+
+int mlx5_devlink_register(struct devlink *devlink, struct device *dev);
+void mlx5_devlink_unregister(struct devlink *devlink);
+
+#endif /* __MLX5_DEVLINK_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 61fa1d162d28..96917f444bef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -56,6 +56,7 @@
 #include "fs_core.h"
 #include "lib/mpfs.h"
 #include "eswitch.h"
+#include "devlink.h"
 #include "lib/mlx5.h"
 #include "fpga/core.h"
 #include "fpga/ipsec.h"
@@ -1312,7 +1313,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	request_module_nowait(MLX5_IB_MOD);
 
-	err = devlink_register(devlink, &pdev->dev);
+	err = mlx5_devlink_register(devlink, &pdev->dev);
 	if (err)
 		goto clean_load;
 
@@ -1337,7 +1338,7 @@ static void remove_one(struct pci_dev *pdev)
 	struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
 	struct devlink *devlink = priv_to_devlink(dev);
 
-	devlink_unregister(devlink);
+	mlx5_devlink_unregister(devlink);
 	mlx5_unregister_device(dev);
 
 	if (mlx5_unload_one(dev, true)) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 02/15] net/mlx5: Add Vendor Specific Capability access gateway
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
  2019-05-05  0:32 ` [net-next 01/15] net/mlx5: Move all devlink related functions calls to devlink.c Saeed Mahameed
@ 2019-05-05  0:32 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 03/15] net/mlx5: Add Crdump FW snapshot support Saeed Mahameed
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:32 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Alex Vesker, Feras Daoud, Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

The Vendor Specific Capability (VSC) is used to activate a gateway
interfacing with the device. The gateway is used to read or write
device configurations, which are organized in different domains (spaces).
A configuration access may result in multiple actions, reads, writes.

Example usages are accessing the Crspace domain to read the crspace or
locking a device semaphore using the Semaphore domain.

The configuration access use pci_cfg_access to prevent parallel access to
the VSC space by the driver and userspace calls.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   3 +-
 .../ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 283 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/lib/pci_vsc.h |  25 ++
 .../net/ethernet/mellanox/mlx5/core/main.c    |   3 +
 include/linux/mlx5/driver.h                   |   1 +
 5 files changed, 314 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 03831a1c02fd..34d9a079b608 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -15,7 +15,8 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		health.o mcg.o cq.o alloc.o qp.o port.o mr.o pd.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
 		fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
-		lib/devcom.o diag/fs_tracepoint.o diag/fw_tracer.o devlink.o
+		lib/devcom.o lib/pci_vsc.o diag/fs_tracepoint.o \
+		diag/fw_tracer.o devlink.o
 
 #
 # Netdev basic
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
new file mode 100644
index 000000000000..f42890bdd6b1
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
@@ -0,0 +1,283 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2019 Mellanox Technologies */
+
+#include <linux/pci.h>
+#include "mlx5_core.h"
+#include "pci_vsc.h"
+
+#define MLX5_EXTRACT_C(source, offset, size)	\
+	((((u32)(source)) >> (offset)) & MLX5_ONES32(size))
+#define MLX5_EXTRACT(src, start, len)		\
+	(((len) == 32) ? (src) : MLX5_EXTRACT_C(src, start, len))
+#define MLX5_ONES32(size)			\
+	((size) ? (0xffffffff >> (32 - (size))) : 0)
+#define MLX5_MASK32(offset, size)		\
+	(MLX5_ONES32(size) << (offset))
+#define MLX5_MERGE_C(rsrc1, rsrc2, start, len)  \
+	((((rsrc2) << (start)) & (MLX5_MASK32((start), (len)))) | \
+	((rsrc1) & (~MLX5_MASK32((start), (len)))))
+#define MLX5_MERGE(rsrc1, rsrc2, start, len)	\
+	(((len) == 32) ? (rsrc2) : MLX5_MERGE_C(rsrc1, rsrc2, start, len))
+#define vsc_read(dev, offset, val) \
+	pci_read_config_dword((dev)->pdev, (dev)->vsc_addr + (offset), (val))
+#define vsc_write(dev, offset, val) \
+	pci_write_config_dword((dev)->pdev, (dev)->vsc_addr + (offset), (val))
+#define VSC_MAX_RETRIES 2048
+
+enum mlx5_vsc_state {
+	MLX5_VSC_UNLOCK,
+	MLX5_VSC_LOCK,
+};
+
+enum {
+	VSC_CTRL_OFFSET = 0x4,
+	VSC_COUNTER_OFFSET = 0x8,
+	VSC_SEMAPHORE_OFFSET = 0xc,
+	VSC_ADDR_OFFSET = 0x10,
+	VSC_DATA_OFFSET = 0x14,
+
+	VSC_FLAG_BIT_OFFS = 31,
+	VSC_FLAG_BIT_LEN = 1,
+
+	VSC_SYND_BIT_OFFS = 30,
+	VSC_SYND_BIT_LEN = 1,
+
+	VSC_ADDR_BIT_OFFS = 0,
+	VSC_ADDR_BIT_LEN = 30,
+
+	VSC_SPACE_BIT_OFFS = 0,
+	VSC_SPACE_BIT_LEN = 16,
+
+	VSC_SIZE_VLD_BIT_OFFS = 28,
+	VSC_SIZE_VLD_BIT_LEN = 1,
+
+	VSC_STATUS_BIT_OFFS = 29,
+	VSC_STATUS_BIT_LEN = 3,
+};
+
+void mlx5_vsc_init(struct mlx5_core_dev *dev)
+{
+	dev->vsc_addr = pci_find_capability(dev->pdev,
+					    PCI_CAP_ID_VNDR);
+	if (!dev->vsc_addr)
+		mlx5_core_warn(dev, "Failed to get valid vendor specific ID\n");
+}
+
+int mlx5_vsc_gw_lock(struct mlx5_core_dev *dev)
+{
+	u32 counter = 0;
+	int retries = 0;
+	u32 lock_val;
+	int ret;
+
+	pci_cfg_access_lock(dev->pdev);
+	do {
+		if (retries > VSC_MAX_RETRIES) {
+			ret = -EBUSY;
+			goto pci_unlock;
+		}
+
+		/* Check if semaphore is already locked */
+		ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val);
+		if (ret)
+			goto pci_unlock;
+
+		if (lock_val) {
+			retries++;
+			usleep_range(1000, 2000);
+			continue;
+		}
+
+		/* Read and write counter value, if written value is
+		 * the same, semaphore was acquired successfully.
+		 */
+		ret = vsc_read(dev, VSC_COUNTER_OFFSET, &counter);
+		if (ret)
+			goto pci_unlock;
+
+		ret = vsc_write(dev, VSC_SEMAPHORE_OFFSET, counter);
+		if (ret)
+			goto pci_unlock;
+
+		ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val);
+		if (ret)
+			goto pci_unlock;
+
+		retries++;
+	} while (counter != lock_val);
+
+	return 0;
+
+pci_unlock:
+	pci_cfg_access_unlock(dev->pdev);
+	return ret;
+}
+
+int mlx5_vsc_gw_unlock(struct mlx5_core_dev *dev)
+{
+	int ret;
+
+	ret = vsc_write(dev, VSC_SEMAPHORE_OFFSET, MLX5_VSC_UNLOCK);
+	pci_cfg_access_unlock(dev->pdev);
+	return ret;
+}
+
+int mlx5_vsc_gw_set_space(struct mlx5_core_dev *dev, u16 space,
+			  u32 *ret_space_size)
+{
+	int ret;
+	u32 val = 0;
+
+	if (!mlx5_vsc_accessible(dev))
+		return -EINVAL;
+
+	if (ret_space_size)
+		*ret_space_size = 0;
+
+	/* Get a unique val */
+	ret = vsc_read(dev, VSC_CTRL_OFFSET, &val);
+	if (ret)
+		goto out;
+
+	/* Try to modify the lock */
+	val = MLX5_MERGE(val, space, VSC_SPACE_BIT_OFFS, VSC_SPACE_BIT_LEN);
+	ret = vsc_write(dev, VSC_CTRL_OFFSET, val);
+	if (ret)
+		goto out;
+
+	/* Verify lock was modified */
+	ret = vsc_read(dev, VSC_CTRL_OFFSET, &val);
+	if (ret)
+		goto out;
+
+	if (MLX5_EXTRACT(val, VSC_STATUS_BIT_OFFS, VSC_STATUS_BIT_LEN) == 0)
+		return -EINVAL;
+
+	/* Get space max address if indicated by size valid bit */
+	if (ret_space_size &&
+	    MLX5_EXTRACT(val, VSC_SIZE_VLD_BIT_OFFS, VSC_SIZE_VLD_BIT_LEN)) {
+		ret = vsc_read(dev, VSC_ADDR_OFFSET, &val);
+		if (ret) {
+			mlx5_core_warn(dev, "Failed to get max space size\n");
+			goto out;
+		}
+		*ret_space_size = MLX5_EXTRACT(val, VSC_ADDR_BIT_OFFS,
+					       VSC_ADDR_BIT_LEN);
+	}
+	return 0;
+
+out:
+	return ret;
+}
+
+static int mlx5_vsc_wait_on_flag(struct mlx5_core_dev *dev, u8 expected_val)
+{
+	int retries = 0;
+	u32 flag;
+	int ret;
+
+	do {
+		if (retries > VSC_MAX_RETRIES)
+			return -EBUSY;
+
+		ret = vsc_read(dev, VSC_ADDR_OFFSET, &flag);
+		if (ret)
+			return ret;
+		flag = MLX5_EXTRACT(flag, VSC_FLAG_BIT_OFFS, VSC_FLAG_BIT_LEN);
+		retries++;
+
+		if ((retries & 0xf) == 0)
+			usleep_range(1000, 2000);
+
+	} while (flag != expected_val);
+
+	return 0;
+}
+
+static int mlx5_vsc_gw_write(struct mlx5_core_dev *dev, unsigned int address,
+			     u32 data)
+{
+	int ret;
+
+	if (MLX5_EXTRACT(address, VSC_SYND_BIT_OFFS,
+			 VSC_FLAG_BIT_LEN + VSC_SYND_BIT_LEN))
+		return -EINVAL;
+
+	/* Set flag to 0x1 */
+	address = MLX5_MERGE(address, 1, VSC_FLAG_BIT_OFFS, 1);
+	ret = vsc_write(dev, VSC_DATA_OFFSET, data);
+	if (ret)
+		goto out;
+
+	ret = vsc_write(dev, VSC_ADDR_OFFSET, address);
+	if (ret)
+		goto out;
+
+	/* Wait for the flag to be cleared */
+	ret = mlx5_vsc_wait_on_flag(dev, 0);
+
+out:
+	return ret;
+}
+
+static int mlx5_vsc_gw_read(struct mlx5_core_dev *dev, unsigned int address,
+			    u32 *data)
+{
+	int ret;
+
+	if (MLX5_EXTRACT(address, VSC_SYND_BIT_OFFS,
+			 VSC_FLAG_BIT_LEN + VSC_SYND_BIT_LEN))
+		return -EINVAL;
+
+	ret = vsc_write(dev, VSC_ADDR_OFFSET, address);
+	if (ret)
+		goto out;
+
+	ret = mlx5_vsc_wait_on_flag(dev, 1);
+	if (ret)
+		goto out;
+
+	ret = vsc_read(dev, VSC_DATA_OFFSET, data);
+out:
+	return ret;
+}
+
+static int mlx5_vsc_gw_read_fast(struct mlx5_core_dev *dev,
+				 unsigned int read_addr,
+				 unsigned int *next_read_addr,
+				 u32 *data)
+{
+	int ret;
+
+	ret = mlx5_vsc_gw_read(dev, read_addr, data);
+	if (ret)
+		goto out;
+
+	ret = vsc_read(dev, VSC_ADDR_OFFSET, next_read_addr);
+	if (ret)
+		goto out;
+
+	*next_read_addr = MLX5_EXTRACT(*next_read_addr, VSC_ADDR_BIT_OFFS,
+				       VSC_ADDR_BIT_LEN);
+
+	if (*next_read_addr <= read_addr)
+		ret = -EINVAL;
+out:
+	return ret;
+}
+
+int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data,
+				int length)
+{
+	unsigned int next_read_addr = 0;
+	unsigned int read_addr = 0;
+
+	while (read_addr < length) {
+		if (mlx5_vsc_gw_read_fast(dev, read_addr, &next_read_addr,
+					  &data[(read_addr >> 2)]))
+			return read_addr;
+
+		read_addr = next_read_addr;
+	}
+	return length;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
new file mode 100644
index 000000000000..c6ebf59006c5
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2019 Mellanox Technologies */
+
+#ifndef __MLX5_PCI_VSC_H__
+#define __MLX5_PCI_VSC_H__
+
+enum {
+	MLX5_VSC_SPACE_SCAN_CRSPACE = 0x7,
+};
+
+void mlx5_vsc_init(struct mlx5_core_dev *dev);
+void mlx5_vsc_cleanup(struct mlx5_core_dev *dev);
+int mlx5_vsc_gw_lock(struct mlx5_core_dev *dev);
+int mlx5_vsc_gw_unlock(struct mlx5_core_dev *dev);
+int mlx5_vsc_gw_set_space(struct mlx5_core_dev *dev, u16 space,
+			  u32 *ret_space_size);
+int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data,
+				int length);
+
+static inline bool mlx5_vsc_accessible(struct mlx5_core_dev *dev)
+{
+	return !!dev->vsc_addr;
+}
+
+#endif /* __MLX5_PCI_VSC_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 96917f444bef..64eb2a558b30 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -65,6 +65,7 @@
 #include "lib/clock.h"
 #include "lib/vxlan.h"
 #include "lib/devcom.h"
+#include "lib/pci_vsc.h"
 #include "diag/fw_tracer.h"
 #include "ecpf.h"
 
@@ -1313,6 +1314,8 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	request_module_nowait(MLX5_IB_MOD);
 
+	mlx5_vsc_init(dev);
+
 	err = mlx5_devlink_register(devlink, &pdev->dev);
 	if (err)
 		goto clean_load;
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 5a39b323c52e..56d0a116f575 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -691,6 +691,7 @@ struct mlx5_core_dev {
 	struct mlx5_ib_clock_info  *clock_info;
 	struct page             *clock_info_page;
 	struct mlx5_fw_tracer   *tracer;
+	u32                      vsc_addr;
 };
 
 struct mlx5_db {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 03/15] net/mlx5: Add Crdump FW snapshot support
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
  2019-05-05  0:32 ` [net-next 01/15] net/mlx5: Move all devlink related functions calls to devlink.c Saeed Mahameed
  2019-05-05  0:32 ` [net-next 02/15] net/mlx5: Add Vendor Specific Capability access gateway Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05 15:36   ` Jiri Pirko
  2019-05-05  0:33 ` [net-next 04/15] net/mlx5: Add support for devlink region_snapshot parameter Saeed Mahameed
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Alex Vesker, Moshe Shemesh, Feras Daoud,
	Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

Crdump allows the driver to create a snapshot of the FW PCI crspace.
This is useful in case of catastrophic issues which may require FW
reset. The snapshot can be used for later debug.

The snapshot is exposed using devlink region_snapshot in downstream patch,
cr-space address regions are registered on init and snapshots are attached
once a new snapshot is collected by the driver.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../ethernet/mellanox/mlx5/core/diag/crdump.c | 179 ++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/health.c  |   1 +
 .../ethernet/mellanox/mlx5/core/lib/mlx5.h    |   4 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |   5 +
 include/linux/mlx5/driver.h                   |   4 +
 6 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 34d9a079b608..5feed9e1bec7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -16,7 +16,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
 		fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
 		lib/devcom.o lib/pci_vsc.o diag/fs_tracepoint.o \
-		diag/fw_tracer.o devlink.o
+		diag/fw_tracer.o diag/crdump.o devlink.o
 
 #
 # Netdev basic
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
new file mode 100644
index 000000000000..6430ceeefb53
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2019 Mellanox Technologies */
+
+#include <linux/proc_fs.h>
+#include <linux/mlx5/driver.h>
+#include <net/devlink.h>
+#include "mlx5_core.h"
+#include "lib/pci_vsc.h"
+#include "lib/mlx5.h"
+
+#define BAD_ACCESS			0xBADACCE5
+#define MLX5_PROTECTED_CR_SCAN_CRSPACE	0x7
+#define MAX_NUM_OF_DUMPS_TO_STORE	(8)
+
+static const char *region_cr_space_str = "cr-space";
+
+struct mlx5_fw_crdump {
+	u32			size;
+	struct devlink_region	*region_crspace;
+};
+
+static bool mlx5_crdump_enbaled(struct mlx5_core_dev *dev)
+{
+	struct mlx5_priv *priv = &dev->priv;
+
+	return (!!priv->health.crdump);
+}
+
+static int mlx5_crdump_fill(struct mlx5_core_dev *dev,
+			    char *crdump_region, u32 *snapshot_id)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	struct mlx5_priv *priv = &dev->priv;
+	struct mlx5_fw_crdump *crdump = priv->health.crdump;
+	int i, ret = 0;
+	u32 *cr_data;
+	u32 id;
+
+	cr_data = kvmalloc(crdump->size, GFP_KERNEL);
+	if (!cr_data)
+		return -ENOMEM;
+
+	for (i = 0; i < (crdump->size / 4); i++)
+		cr_data[i] = BAD_ACCESS;
+
+	ret = mlx5_vsc_gw_read_block_fast(dev, cr_data, crdump->size);
+	if (ret <= 0) {
+		if (ret == 0)
+			ret = -EIO;
+		goto free_data;
+	}
+
+	if (crdump->size != ret) {
+		mlx5_core_warn(dev, "failed to read full dump, read %d out of %u\n",
+			       ret, crdump->size);
+		ret = -EINVAL;
+		goto free_data;
+	}
+
+	/* Get the available snapshot ID for the dumps */
+	id = devlink_region_shapshot_id_get(devlink);
+	ret = devlink_region_snapshot_create(crdump->region_crspace,
+					     crdump->size, (u8 *)cr_data,
+					     id, &kvfree);
+	if (ret) {
+		mlx5_core_warn(dev, "crdump: devlink create %s snapshot id %d err %d\n",
+			       region_cr_space_str, id, ret);
+		goto free_data;
+	} else {
+		*snapshot_id = id;
+		strcpy(crdump_region, region_cr_space_str);
+	}
+	return 0;
+
+free_data:
+	kvfree(cr_data);
+	return ret;
+}
+
+int mlx5_crdump_collect(struct mlx5_core_dev *dev,
+			char *crdump_region, u32 *snapshot_id)
+{
+	int ret = 0;
+
+	if (!mlx5_crdump_enbaled(dev))
+		return -ENODEV;
+
+	ret = mlx5_vsc_gw_lock(dev);
+	if (ret) {
+		mlx5_core_warn(dev, "crdump: failed to lock vsc gw err %d\n",
+			       ret);
+		return ret;
+	}
+
+	ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE, NULL);
+	if (ret)
+		goto unlock;
+
+	ret = mlx5_crdump_fill(dev, crdump_region, snapshot_id);
+
+unlock:
+	mlx5_vsc_gw_unlock(dev);
+	return ret;
+}
+
+int mlx5_crdump_init(struct mlx5_core_dev *dev)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	struct mlx5_priv *priv = &dev->priv;
+	struct mlx5_fw_crdump *crdump;
+	u32 space_size;
+	int ret;
+
+	if (!mlx5_core_is_pf(dev) || !mlx5_vsc_accessible(dev) ||
+	    mlx5_crdump_enbaled(dev))
+		return 0;
+
+	ret = mlx5_vsc_gw_lock(dev);
+	if (ret)
+		return ret;
+
+	/* Check if space is supported and get space size */
+	ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE,
+				    &space_size);
+	if (ret) {
+		/* Unlock and mask error since space is not supported */
+		mlx5_vsc_gw_unlock(dev);
+		return 0;
+	}
+
+	if (!space_size) {
+		mlx5_core_warn(dev, "Invalid Crspace size, zero\n");
+		mlx5_vsc_gw_unlock(dev);
+		return -EINVAL;
+	}
+
+	ret = mlx5_vsc_gw_unlock(dev);
+	if (ret)
+		return ret;
+
+	crdump = kzalloc(sizeof(*crdump), GFP_KERNEL);
+	if (!crdump)
+		return -ENOMEM;
+
+	/* Create cr-space region */
+	crdump->size = space_size;
+	crdump->region_crspace =
+		devlink_region_create(devlink,
+				      region_cr_space_str,
+				      MAX_NUM_OF_DUMPS_TO_STORE,
+				      space_size);
+	if (IS_ERR(crdump->region_crspace)) {
+		mlx5_core_warn(dev,
+			       "crdump: create devlink region %s err %ld\n",
+			       region_cr_space_str,
+			       PTR_ERR(crdump->region_crspace));
+		ret = PTR_ERR(crdump->region_crspace);
+		goto free_crdump;
+	}
+	priv->health.crdump = crdump;
+	return 0;
+
+free_crdump:
+	kfree(crdump);
+	return ret;
+}
+
+void mlx5_crdump_cleanup(struct mlx5_core_dev *dev)
+{
+	struct mlx5_priv *priv = &dev->priv;
+	struct mlx5_fw_crdump *crdump = priv->health.crdump;
+
+	if (!crdump)
+		return;
+
+	devlink_region_destroy(crdump->region_crspace);
+	kfree(crdump);
+	priv->health.crdump = NULL;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index a2656f4008d9..90f3da6da7f9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -388,6 +388,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->work, health_care);
 	INIT_DELAYED_WORK(&health->recover_work, health_recover);
+	health->crdump = NULL;
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
index 397a2847867a..3c9a6dedccaa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
@@ -41,6 +41,10 @@ int  mlx5_core_reserve_gids(struct mlx5_core_dev *dev, unsigned int count);
 void mlx5_core_unreserve_gids(struct mlx5_core_dev *dev, unsigned int count);
 int  mlx5_core_reserved_gid_alloc(struct mlx5_core_dev *dev, int *gid_index);
 void mlx5_core_reserved_gid_free(struct mlx5_core_dev *dev, int gid_index);
+int mlx5_crdump_init(struct mlx5_core_dev *dev);
+void mlx5_crdump_cleanup(struct mlx5_core_dev *dev);
+int mlx5_crdump_collect(struct mlx5_core_dev *dev,
+			char *crdump_region, u32 *snapshot_id);
 
 /* TODO move to lib/events.h */
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 64eb2a558b30..43f5487de4c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1320,6 +1320,10 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (err)
 		goto clean_load;
 
+	err = mlx5_crdump_init(dev);
+	if (err)
+		dev_err(&pdev->dev, "mlx5_crdump_init failed with error code %d\n", err);
+
 	pci_save_state(pdev);
 	return 0;
 
@@ -1341,6 +1345,7 @@ static void remove_one(struct pci_dev *pdev)
 	struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
 	struct devlink *devlink = priv_to_devlink(dev);
 
+	mlx5_crdump_cleanup(dev);
 	mlx5_devlink_unregister(devlink);
 	mlx5_unregister_device(dev);
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 56d0a116f575..ddf6f41a75d3 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -53,6 +53,7 @@
 #include <linux/mlx5/eq.h>
 #include <linux/timecounter.h>
 #include <linux/ptp_clock_kernel.h>
+#include <net/devlink.h>
 
 enum {
 	MLX5_BOARD_ID_LEN = 64,
@@ -427,6 +428,8 @@ struct mlx5_sq_bfreg {
 	unsigned int		offset;
 };
 
+struct mlx5_fw_crdump;
+
 struct mlx5_core_health {
 	struct health_buffer __iomem   *health;
 	__be32 __iomem		       *health_counter;
@@ -440,6 +443,7 @@ struct mlx5_core_health {
 	unsigned long			flags;
 	struct work_struct		work;
 	struct delayed_work		recover_work;
+	struct mlx5_fw_crdump	       *crdump;
 };
 
 struct mlx5_qp_table {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 04/15] net/mlx5: Add support for devlink region_snapshot parameter
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (2 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 03/15] net/mlx5: Add Crdump FW snapshot support Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 05/15] net/mlx5: Handle SW reset of FW in error flow Saeed Mahameed
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Alex Vesker, Feras Daoud, Moshe Shemesh,
	Saeed Mahameed

From: Alex Vesker <valex@mellanox.com>

This parameter enables capturing region snapshot of the crspace
during critical errors. The default value of this parameter is
disabled, it can be enabled using devlink param commands.
It is possible to configure during runtime and also driver init.

Command line examples:

Delete snapshot id 1 from cr-space address region from device pci/0000:00:05.0
$ devlink region del pci/0000:00:05.0/cr-space snapshot 1

Dump the snapshot taken from cr-space address region with ID 1
$ devlink region dump pci/0000:00:05.0/cr-space snapshot 1

Read from address 0x10, 16 Bytes of snapshot ID 1 taken from cr-space address region
$ devlink region read pci/0000:00:05.0/cr-space snapshot 1 address 0x10 length 16

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 60 ++++++++++++++++++-
 .../ethernet/mellanox/mlx5/core/diag/crdump.c | 22 +++++++
 .../ethernet/mellanox/mlx5/core/lib/mlx5.h    |  2 +
 3 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index 72ff27f57817..308fe64e7bcd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -2,13 +2,71 @@
 /* Copyright (c) 2019 Mellanox Technologies */
 
 #include <devlink.h>
+#include <linux/mlx5/driver.h>
+#include "lib/mlx5.h"
+
+static int mlx5_devlink_get_crdump_snapshot(struct devlink *devlink, u32 id,
+					    struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+
+	ctx->val.vbool = mlx5_crdump_is_snapshot_enabled(dev);
+	return 0;
+}
+
+static int mlx5_devlink_set_crdump_snapshot(struct devlink *devlink, u32 id,
+					    struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+
+	return mlx5_crdump_set_snapshot_enabled(dev, ctx->val.vbool);
+}
+
+static const struct devlink_param mlx5_devlink_params[] = {
+	DEVLINK_PARAM_GENERIC(REGION_SNAPSHOT,
+			      BIT(DEVLINK_PARAM_CMODE_RUNTIME) |
+			      BIT(DEVLINK_PARAM_CMODE_DRIVERINIT),
+			      mlx5_devlink_get_crdump_snapshot,
+			      mlx5_devlink_set_crdump_snapshot, NULL),
+};
 
 int mlx5_devlink_register(struct devlink *devlink, struct device *dev)
 {
-	return devlink_register(devlink, dev);
+	union devlink_param_value init_val;
+	int err;
+
+	err = devlink_register(devlink, dev);
+	if (err) {
+		dev_warn(dev,
+			 "devlink register failed (err = %d)", err);
+		return err;
+	}
+
+	err = devlink_params_register(devlink, mlx5_devlink_params,
+				      ARRAY_SIZE(mlx5_devlink_params));
+	if (err) {
+		dev_err(dev, "devlink_params_register failed, err = %d\n", err);
+		goto unregister;
+	}
+
+	init_val.vbool = false;
+	err = devlink_param_driverinit_value_set(devlink,
+						 DEVLINK_PARAM_GENERIC_ID_REGION_SNAPSHOT,
+						 init_val);
+	if (err)
+		dev_warn(dev,
+			 "devlink param init failed (err = %d)", err);
+
+	return 0;
+
+unregister:
+	devlink_unregister(devlink);
+	return err;
 }
 
 void mlx5_devlink_unregister(struct devlink *devlink)
 {
+	devlink_params_unregister(devlink, mlx5_devlink_params,
+				  ARRAY_SIZE(mlx5_devlink_params));
 	devlink_unregister(devlink);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
index 6430ceeefb53..7337a49f2733 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
@@ -16,6 +16,7 @@ static const char *region_cr_space_str = "cr-space";
 
 struct mlx5_fw_crdump {
 	u32			size;
+	bool			snapshot_enable;
 	struct devlink_region	*region_crspace;
 };
 
@@ -103,6 +104,27 @@ int mlx5_crdump_collect(struct mlx5_core_dev *dev,
 	return ret;
 }
 
+bool mlx5_crdump_is_snapshot_enabled(struct mlx5_core_dev *dev)
+{
+	struct mlx5_priv *priv = &dev->priv;
+
+	if (mlx5_crdump_enbaled(dev))
+		return priv->health.crdump->snapshot_enable;
+
+	return false;
+}
+
+int mlx5_crdump_set_snapshot_enabled(struct mlx5_core_dev *dev, bool value)
+{
+	struct mlx5_priv *priv = &dev->priv;
+
+	if (!mlx5_crdump_enbaled(dev))
+		return -ENODEV;
+
+	priv->health.crdump->snapshot_enable = value;
+	return 0;
+}
+
 int mlx5_crdump_init(struct mlx5_core_dev *dev)
 {
 	struct devlink *devlink = priv_to_devlink(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
index 3c9a6dedccaa..c639f0af29ed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
@@ -45,6 +45,8 @@ int mlx5_crdump_init(struct mlx5_core_dev *dev);
 void mlx5_crdump_cleanup(struct mlx5_core_dev *dev);
 int mlx5_crdump_collect(struct mlx5_core_dev *dev,
 			char *crdump_region, u32 *snapshot_id);
+bool mlx5_crdump_is_snapshot_enabled(struct mlx5_core_dev *dev);
+int mlx5_crdump_set_snapshot_enabled(struct mlx5_core_dev *dev, bool value);
 
 /* TODO move to lib/events.h */
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 05/15] net/mlx5: Handle SW reset of FW in error flow
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (3 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 04/15] net/mlx5: Add support for devlink region_snapshot parameter Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 06/15] net/mlx5: Control CR-space access by different PFs Saeed Mahameed
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Feras Daoud, Moshe Shemesh, Daniel Jurgens,
	Alex Vesker, Saeed Mahameed

From: Feras Daoud <ferasda@mellanox.com>

New mlx5 adapters allow the driver to reset the FW in the event of an
error, this action called "SW Reset". When an SW reset is issued on any
PF all PFs enter reset state which is a recoverable condition. The
existing recovery flow was designed to allow the recovery of a VF after
a PF driver reload. This patch adds the sw reset to the NIC states
as a preparation for sw reset handling.

When a software reset is issued the following occurs:
1. The NIC interface mode is set to 7 while the reset is in progress.
2. Once the reset completes the NIC interface mode is set to 1.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/en_selftest.c |   2 +-
 .../net/ethernet/mellanox/mlx5/core/health.c  | 105 ++++++++----------
 .../net/ethernet/mellanox/mlx5/core/main.c    |   2 +-
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +-
 include/linux/mlx5/driver.h                   |   3 +-
 5 files changed, 48 insertions(+), 66 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 4382ef85488c..840ec945ccba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -64,7 +64,7 @@ static int mlx5e_test_health_info(struct mlx5e_priv *priv)
 {
 	struct mlx5_core_health *health = &priv->mdev->priv.health;
 
-	return health->sick ? 1 : 0;
+	return health->fatal_error ? 1 : 0;
 }
 
 static int mlx5e_test_link_state(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 90f3da6da7f9..adb40fe0f6ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -62,12 +62,18 @@ enum {
 
 enum {
 	MLX5_DROP_NEW_HEALTH_WORK,
-	MLX5_DROP_NEW_RECOVERY_WORK,
+};
+
+enum  {
+	MLX5_SENSOR_NO_ERR		= 0,
+	MLX5_SENSOR_PCI_COMM_ERR	= 1,
+	MLX5_SENSOR_NIC_DISABLED	= 2,
+	MLX5_SENSOR_NIC_SW_RESET	= 3,
 };
 
 u8 mlx5_get_nic_state(struct mlx5_core_dev *dev)
 {
-	return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 3;
+	return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 7;
 }
 
 void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state)
@@ -80,18 +86,25 @@ void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state)
 		    &dev->iseg->cmdq_addr_l_sz);
 }
 
-static int in_fatal(struct mlx5_core_dev *dev)
+static bool sensor_pci_not_working(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct health_buffer __iomem *h = health->health;
 
-	if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
-		return 1;
+	/* Offline PCI reads return 0xffffffff */
+	return (ioread32be(&h->fw_ver) == 0xffffffff);
+}
 
-	if (ioread32be(&h->fw_ver) == 0xffffffff)
-		return 1;
+static u32 check_fatal_sensors(struct mlx5_core_dev *dev)
+{
+	if (sensor_pci_not_working(dev))
+		return MLX5_SENSOR_PCI_COMM_ERR;
+	if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
+		return MLX5_SENSOR_NIC_DISABLED;
+	if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_SW_RESET)
+		return MLX5_SENSOR_NIC_SW_RESET;
 
-	return 0;
+	return MLX5_SENSOR_NO_ERR;
 }
 
 void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force)
@@ -101,7 +114,8 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force)
 		goto unlock;
 
 	mlx5_core_err(dev, "start\n");
-	if (pci_channel_offline(dev->pdev) || in_fatal(dev) || force) {
+	if (pci_channel_offline(dev->pdev) ||
+	    dev->priv.health.fatal_error != MLX5_SENSOR_NO_ERR || force) {
 		dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
 		mlx5_cmd_flush(dev);
 	}
@@ -137,38 +151,14 @@ static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
 	mlx5_disable_device(dev);
 }
 
-static void health_recover(struct work_struct *work)
-{
-	struct mlx5_core_health *health;
-	struct delayed_work *dwork;
-	struct mlx5_core_dev *dev;
-	struct mlx5_priv *priv;
-	u8 nic_state;
-
-	dwork = container_of(work, struct delayed_work, work);
-	health = container_of(dwork, struct mlx5_core_health, recover_work);
-	priv = container_of(health, struct mlx5_priv, health);
-	dev = container_of(priv, struct mlx5_core_dev, priv);
-
-	nic_state = mlx5_get_nic_state(dev);
-	if (nic_state == MLX5_NIC_IFC_INVALID) {
-		mlx5_core_err(dev, "health recovery flow aborted since the nic state is invalid\n");
-		return;
-	}
-
-	mlx5_core_err(dev, "starting health recovery flow\n");
-	mlx5_recover_device(dev);
-}
-
 /* How much time to wait until health resetting the driver (in msecs) */
-#define MLX5_RECOVERY_DELAY_MSECS 60000
+#define MLX5_RECOVERY_WAIT_MSECS 60000
 static void health_care(struct work_struct *work)
 {
-	unsigned long recover_delay = msecs_to_jiffies(MLX5_RECOVERY_DELAY_MSECS);
 	struct mlx5_core_health *health;
 	struct mlx5_core_dev *dev;
 	struct mlx5_priv *priv;
-	unsigned long flags;
+	unsigned long end;
 
 	health = container_of(work, struct mlx5_core_health, work);
 	priv = container_of(health, struct mlx5_priv, health);
@@ -176,13 +166,18 @@ static void health_care(struct work_struct *work)
 	mlx5_core_warn(dev, "handling bad device here\n");
 	mlx5_handle_bad_state(dev);
 
-	spin_lock_irqsave(&health->wq_lock, flags);
-	if (!test_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags))
-		schedule_delayed_work(&health->recover_work, recover_delay);
-	else
-		mlx5_core_err(dev,
-			      "new health works are not permitted at this stage\n");
-	spin_unlock_irqrestore(&health->wq_lock, flags);
+	end = jiffies + msecs_to_jiffies(MLX5_RECOVERY_WAIT_MSECS);
+	while (sensor_pci_not_working(dev)) {
+		if (time_after(jiffies, end)) {
+			mlx5_core_err(dev,
+				      "health recovery flow aborted, PCI reads still not working\n");
+			return;
+		}
+		msleep(100);
+	}
+
+	mlx5_core_err(dev, "starting health recovery flow\n");
+	mlx5_recover_device(dev);
 }
 
 static const char *hsynd_str(u8 synd)
@@ -274,6 +269,7 @@ static void poll_health(struct timer_list *t)
 {
 	struct mlx5_core_dev *dev = from_timer(dev, t, priv.health.timer);
 	struct mlx5_core_health *health = &dev->priv.health;
+	u32 fatal_error;
 	u32 count;
 
 	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
@@ -291,8 +287,11 @@ static void poll_health(struct timer_list *t)
 		print_health_info(dev);
 	}
 
-	if (in_fatal(dev) && !health->sick) {
-		health->sick = true;
+	fatal_error = check_fatal_sensors(dev);
+
+	if (fatal_error && !health->fatal_error) {
+		mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error);
+		dev->priv.health.fatal_error = fatal_error;
 		print_health_info(dev);
 		mlx5_trigger_health_work(dev);
 	}
@@ -306,9 +305,8 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev)
 	struct mlx5_core_health *health = &dev->priv.health;
 
 	timer_setup(&health->timer, poll_health, 0);
-	health->sick = 0;
+	health->fatal_error = MLX5_SENSOR_NO_ERR;
 	clear_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
-	clear_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags);
 	health->health = &dev->iseg->health;
 	health->health_counter = &dev->iseg->health_counter;
 
@@ -324,7 +322,6 @@ void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health)
 	if (disable_health) {
 		spin_lock_irqsave(&health->wq_lock, flags);
 		set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
-		set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags);
 		spin_unlock_irqrestore(&health->wq_lock, flags);
 	}
 
@@ -338,23 +335,10 @@ void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 
 	spin_lock_irqsave(&health->wq_lock, flags);
 	set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
-	set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags);
 	spin_unlock_irqrestore(&health->wq_lock, flags);
-	cancel_delayed_work_sync(&health->recover_work);
 	cancel_work_sync(&health->work);
 }
 
-void mlx5_drain_health_recovery(struct mlx5_core_dev *dev)
-{
-	struct mlx5_core_health *health = &dev->priv.health;
-	unsigned long flags;
-
-	spin_lock_irqsave(&health->wq_lock, flags);
-	set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags);
-	spin_unlock_irqrestore(&health->wq_lock, flags);
-	cancel_delayed_work_sync(&dev->priv.health.recover_work);
-}
-
 void mlx5_health_flush(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
@@ -387,7 +371,6 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 		return -ENOMEM;
 	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->work, health_care);
-	INIT_DELAYED_WORK(&health->recover_work, health_recover);
 	health->crdump = NULL;
 
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 43f5487de4c3..c94eaa49d1f6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1185,7 +1185,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup)
 	int err = 0;
 
 	if (cleanup)
-		mlx5_drain_health_recovery(dev);
+		mlx5_drain_health_wq(dev);
 
 	mutex_lock(&dev->intf_state_mutex);
 	if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 22e69d4813e4..d31b77ad533d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -213,7 +213,7 @@ enum {
 	MLX5_NIC_IFC_FULL		= 0,
 	MLX5_NIC_IFC_DISABLED		= 1,
 	MLX5_NIC_IFC_NO_DRAM_NIC	= 2,
-	MLX5_NIC_IFC_INVALID		= 3
+	MLX5_NIC_IFC_SW_RESET		= 7
 };
 
 u8 mlx5_get_nic_state(struct mlx5_core_dev *dev);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index ddf6f41a75d3..086faa4d22bf 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -436,7 +436,7 @@ struct mlx5_core_health {
 	struct timer_list		timer;
 	u32				prev;
 	int				miss_counter;
-	bool				sick;
+	u32				fatal_error;
 	/* wq spinlock to synchronize draining */
 	spinlock_t			wq_lock;
 	struct workqueue_struct	       *wq;
@@ -907,7 +907,6 @@ void mlx5_start_health_poll(struct mlx5_core_dev *dev);
 void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health);
 void mlx5_drain_health_wq(struct mlx5_core_dev *dev);
 void mlx5_trigger_health_work(struct mlx5_core_dev *dev);
-void mlx5_drain_health_recovery(struct mlx5_core_dev *dev);
 int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size,
 			struct mlx5_frag_buf *buf, int node);
 int mlx5_buf_alloc(struct mlx5_core_dev *dev,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 06/15] net/mlx5: Control CR-space access by different PFs
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (4 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 05/15] net/mlx5: Handle SW reset of FW in error flow Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 07/15] net/mlx5: Issue SW reset on FW assert Saeed Mahameed
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Feras Daoud, Saeed Mahameed, Alex Vesker

From: Feras Daoud <ferasda@mellanox.com>

Since the FW can be shared between different PFs/VFs it is common
that more than one health poll will detected a failure, this can
lead to multiple resets which are unneeded.

The solution is to use a FW locking mechanism using semaphore space
to provide a way to allow only one device to collect the cr-dump and
to issue a sw-reset.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 40 ++++++++++++++++---
 .../ethernet/mellanox/mlx5/core/lib/pci_vsc.h |  8 ++++
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  4 ++
 3 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
index f42890bdd6b1..b6b8fb13f621 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
@@ -24,11 +24,6 @@
 	pci_write_config_dword((dev)->pdev, (dev)->vsc_addr + (offset), (val))
 #define VSC_MAX_RETRIES 2048
 
-enum mlx5_vsc_state {
-	MLX5_VSC_UNLOCK,
-	MLX5_VSC_LOCK,
-};
-
 enum {
 	VSC_CTRL_OFFSET = 0x4,
 	VSC_COUNTER_OFFSET = 0x8,
@@ -281,3 +276,38 @@ int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data,
 	}
 	return length;
 }
+
+int mlx5_vsc_sem_set_space(struct mlx5_core_dev *dev, u16 space,
+			   enum mlx5_vsc_state state)
+{
+	u32 data, id = 0;
+	int ret;
+
+	ret = mlx5_vsc_gw_set_space(dev, MLX5_SEMAPHORE_SPACE_DOMAIN, NULL);
+	if (ret) {
+		mlx5_core_warn(dev, "Failed to set gw space %d\n", ret);
+		return ret;
+	}
+
+	if (state == MLX5_VSC_LOCK) {
+		/* Get a unique ID based on the counter */
+		ret = vsc_read(dev, VSC_COUNTER_OFFSET, &id);
+		if (ret)
+			return ret;
+	}
+
+	/* Try to modify lock */
+	ret = mlx5_vsc_gw_write(dev, space, id);
+	if (ret)
+		return ret;
+
+	/* Verify lock was modified */
+	ret = mlx5_vsc_gw_read(dev, space, &data);
+	if (ret)
+		return -EINVAL;
+
+	if (data != id)
+		return -EBUSY;
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
index c6ebf59006c5..4264b65f7437 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
@@ -4,6 +4,11 @@
 #ifndef __MLX5_PCI_VSC_H__
 #define __MLX5_PCI_VSC_H__
 
+enum mlx5_vsc_state {
+	MLX5_VSC_UNLOCK,
+	MLX5_VSC_LOCK,
+};
+
 enum {
 	MLX5_VSC_SPACE_SCAN_CRSPACE = 0x7,
 };
@@ -22,4 +27,7 @@ static inline bool mlx5_vsc_accessible(struct mlx5_core_dev *dev)
 	return !!dev->vsc_addr;
 }
 
+int mlx5_vsc_sem_set_space(struct mlx5_core_dev *dev, u16 space,
+			   enum mlx5_vsc_state state);
+
 #endif /* __MLX5_PCI_VSC_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index d31b77ad533d..439cf23945a4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -111,6 +111,10 @@ enum {
 	MLX5_DRIVER_SYND = 0xbadd00de,
 };
 
+enum mlx5_semaphore_space_address {
+	MLX5_SEMAPHORE_SPACE_DOMAIN     = 0xA,
+};
+
 int mlx5_query_hca_caps(struct mlx5_core_dev *dev);
 int mlx5_query_board_id(struct mlx5_core_dev *dev);
 int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, uint32_t *sw_owner_id);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 07/15] net/mlx5: Issue SW reset on FW assert
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (5 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 06/15] net/mlx5: Control CR-space access by different PFs Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05 15:38   ` Jiri Pirko
  2019-05-05  0:33 ` [net-next 08/15] net/mlx5: Refactor print health info Saeed Mahameed
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Feras Daoud, Alex Vesker, Moshe Shemesh,
	Daniel Jurgens, Saeed Mahameed

From: Feras Daoud <ferasda@mellanox.com>

If a FW assert is considered fatal, indicated by a new bit in the health
buffer, reset the FW. After the reset go through the normal recovery
flow. Only one PF needs to issue the reset, so an attempt is made to
prevent the 2nd function from also issuing the reset.
It's not an error if that happens, it just slows recovery.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/diag/crdump.c |  13 +-
 .../net/ethernet/mellanox/mlx5/core/health.c  | 157 +++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/main.c    |   1 +
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
 include/linux/mlx5/device.h                   |  10 +-
 include/linux/mlx5/driver.h                   |   1 +
 6 files changed, 176 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
index 7337a49f2733..8cd4dd1d11d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
@@ -92,14 +92,23 @@ int mlx5_crdump_collect(struct mlx5_core_dev *dev,
 			       ret);
 		return ret;
 	}
+	/* Verify no other PF is running cr-dump or sw reset */
+	ret = mlx5_vsc_sem_set_space(dev, MLX5_SEMAPHORE_SW_RESET,
+				     MLX5_VSC_LOCK);
+	if (ret) {
+		mlx5_core_warn(dev, "Failed to lock SW reset semaphore\n");
+		goto unlock_gw;
+	}
 
 	ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE, NULL);
 	if (ret)
-		goto unlock;
+		goto unlock_sem;
 
 	ret = mlx5_crdump_fill(dev, crdump_region, snapshot_id);
 
-unlock:
+unlock_sem:
+	mlx5_vsc_sem_set_space(dev, MLX5_SEMAPHORE_SW_RESET, MLX5_VSC_UNLOCK);
+unlock_gw:
 	mlx5_vsc_gw_unlock(dev);
 	return ret;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index adb40fe0f6ec..19d9297682d7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -40,6 +40,7 @@
 #include "mlx5_core.h"
 #include "lib/eq.h"
 #include "lib/mlx5.h"
+#include "lib/pci_vsc.h"
 
 enum {
 	MLX5_HEALTH_POLL_INTERVAL	= 2 * HZ,
@@ -67,8 +68,10 @@ enum {
 enum  {
 	MLX5_SENSOR_NO_ERR		= 0,
 	MLX5_SENSOR_PCI_COMM_ERR	= 1,
-	MLX5_SENSOR_NIC_DISABLED	= 2,
-	MLX5_SENSOR_NIC_SW_RESET	= 3,
+	MLX5_SENSOR_PCI_ERR		= 2,
+	MLX5_SENSOR_NIC_DISABLED	= 3,
+	MLX5_SENSOR_NIC_SW_RESET	= 4,
+	MLX5_SENSOR_FW_SYND_RFR		= 5,
 };
 
 u8 mlx5_get_nic_state(struct mlx5_core_dev *dev)
@@ -95,32 +98,162 @@ static bool sensor_pci_not_working(struct mlx5_core_dev *dev)
 	return (ioread32be(&h->fw_ver) == 0xffffffff);
 }
 
+static bool sensor_fw_synd_rfr(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+	struct health_buffer __iomem *h = health->health;
+	u32 rfr = ioread32be(&h->rfr) >> MLX5_RFR_OFFSET;
+	u8 synd = ioread8(&h->synd);
+
+	if (rfr && synd)
+		mlx5_core_dbg(dev, "FW requests reset, synd: %d\n", synd);
+	return rfr && synd;
+}
+
 static u32 check_fatal_sensors(struct mlx5_core_dev *dev)
 {
 	if (sensor_pci_not_working(dev))
 		return MLX5_SENSOR_PCI_COMM_ERR;
+	if (pci_channel_offline(dev->pdev))
+		return MLX5_SENSOR_PCI_ERR;
 	if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
 		return MLX5_SENSOR_NIC_DISABLED;
 	if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_SW_RESET)
 		return MLX5_SENSOR_NIC_SW_RESET;
+	if (sensor_fw_synd_rfr(dev))
+		return MLX5_SENSOR_FW_SYND_RFR;
 
 	return MLX5_SENSOR_NO_ERR;
 }
 
+static int lock_sem_sw_reset(struct mlx5_core_dev *dev, bool lock)
+{
+	enum mlx5_vsc_state state;
+	int ret;
+
+	if (!mlx5_core_is_pf(dev))
+		return -EBUSY;
+
+	/* Try to lock GW access, this stage doesn't return
+	 * EBUSY because locked GW does not mean that other PF
+	 * already started the reset.
+	 */
+	ret = mlx5_vsc_gw_lock(dev);
+	if (ret == -EBUSY)
+		return -EINVAL;
+	if (ret)
+		return ret;
+
+	state = lock ? MLX5_VSC_LOCK : MLX5_VSC_UNLOCK;
+	/* At this stage, if the return status == EBUSY, then we know
+	 * for sure that another PF started the reset, so don't allow
+	 * another reset.
+	 */
+	ret = mlx5_vsc_sem_set_space(dev, MLX5_SEMAPHORE_SW_RESET, state);
+	if (ret)
+		mlx5_core_warn(dev, "Failed to lock SW reset semaphore\n");
+
+	/* Unlock GW access */
+	mlx5_vsc_gw_unlock(dev);
+
+	return ret;
+}
+
+static bool reset_fw_if_needed(struct mlx5_core_dev *dev)
+{
+	bool supported = (ioread32be(&dev->iseg->initializing) >>
+			  MLX5_FW_RESET_SUPPORTED_OFFSET) & 1;
+	u32 fatal_error;
+
+	if (!supported)
+		return false;
+
+	/* The reset only needs to be issued by one PF. The health buffer is
+	 * shared between all functions, and will be cleared during a reset.
+	 * Check again to avoid a redundant 2nd reset. If the fatal erros was
+	 * PCI related a reset won't help.
+	 */
+	fatal_error = check_fatal_sensors(dev);
+	if (fatal_error == MLX5_SENSOR_PCI_COMM_ERR ||
+	    fatal_error == MLX5_SENSOR_NIC_DISABLED ||
+	    fatal_error == MLX5_SENSOR_NIC_SW_RESET) {
+		mlx5_core_warn(dev, "Not issuing FW reset. Either it's already done or won't help.");
+		return false;
+	}
+
+	mlx5_core_warn(dev, "Issuing FW Reset\n");
+	/* Write the NIC interface field to initiate the reset, the command
+	 * interface address also resides here, don't overwrite it.
+	 */
+	mlx5_set_nic_state(dev, MLX5_NIC_IFC_SW_RESET);
+
+	return true;
+}
+
 void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force)
 {
 	mutex_lock(&dev->intf_state_mutex);
 	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
 		goto unlock;
+	if (dev->state == MLX5_DEVICE_STATE_UNINITIALIZED) {
+		dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
+		goto unlock;
+	}
 
-	mlx5_core_err(dev, "start\n");
-	if (pci_channel_offline(dev->pdev) ||
-	    dev->priv.health.fatal_error != MLX5_SENSOR_NO_ERR || force) {
+	if (check_fatal_sensors(dev) || force) {
 		dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
 		mlx5_cmd_flush(dev);
 	}
 
 	mlx5_notifier_call_chain(dev->priv.events, MLX5_DEV_EVENT_SYS_ERROR, (void *)1);
+unlock:
+	mutex_unlock(&dev->intf_state_mutex);
+}
+
+#define MLX5_CRDUMP_WAIT_MS	60000
+#define MLX5_FW_RESET_WAIT_MS	1000
+void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
+{
+	unsigned long end, delay_ms = MLX5_FW_RESET_WAIT_MS;
+	int lock = -EBUSY;
+
+	mutex_lock(&dev->intf_state_mutex);
+	if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
+		goto unlock;
+
+	mlx5_core_err(dev, "start\n");
+
+	if (check_fatal_sensors(dev) == MLX5_SENSOR_FW_SYND_RFR) {
+		/* Get cr-dump and reset FW semaphore */
+		lock = lock_sem_sw_reset(dev, true);
+
+		if (lock == -EBUSY) {
+			delay_ms = MLX5_CRDUMP_WAIT_MS;
+			goto recover_from_sw_reset;
+		}
+		/* Execute SW reset */
+		reset_fw_if_needed(dev);
+	}
+
+recover_from_sw_reset:
+	/* Recover from SW reset */
+	end = jiffies + msecs_to_jiffies(delay_ms);
+	do {
+		if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
+			break;
+
+		cond_resched();
+	} while (!time_after(jiffies, end));
+
+	if (mlx5_get_nic_state(dev) != MLX5_NIC_IFC_DISABLED) {
+		dev_err(&dev->pdev->dev, "NIC IFC still %d after %lums.\n",
+			mlx5_get_nic_state(dev), delay_ms);
+	}
+
+	/* Release FW semaphore if you are the lock owner */
+	if (!lock)
+		lock_sem_sw_reset(dev, false);
+
 	mlx5_core_err(dev, "end\n");
 
 unlock:
@@ -143,6 +276,20 @@ static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
 	case MLX5_NIC_IFC_NO_DRAM_NIC:
 		mlx5_core_warn(dev, "Expected to see disabled NIC but it is no dram nic\n");
 		break;
+
+	case MLX5_NIC_IFC_SW_RESET:
+		/* The IFC mode field is 3 bits, so it will read 0x7 in 2 cases:
+		 * 1. PCI has been disabled (ie. PCI-AER, PF driver unloaded
+		 *    and this is a VF), this is not recoverable by SW reset.
+		 *    Logging of this is handled elsewhere.
+		 * 2. FW reset has been issued by another function, driver can
+		 *    be reloaded to recover after the mode switches to
+		 *    MLX5_NIC_IFC_DISABLED.
+		 */
+		if (dev->priv.health.fatal_error != MLX5_SENSOR_PCI_COMM_ERR)
+			mlx5_core_warn(dev, "NIC SW reset in progress\n");
+		break;
+
 	default:
 		mlx5_core_warn(dev, "Expected to see disabled NIC but it is has invalid value %d\n",
 			       nic_interface);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c94eaa49d1f6..c22ff9a58ec5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1368,6 +1368,7 @@ static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
 	mlx5_core_info(dev, "%s was called\n", __func__);
 
 	mlx5_enter_error_state(dev, false);
+	mlx5_error_sw_reset(dev);
 	mlx5_unload_one(dev, false);
 	/* In case of kernel call drain the health wq */
 	if (state) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 439cf23945a4..9726af137be3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -113,6 +113,7 @@ enum {
 
 enum mlx5_semaphore_space_address {
 	MLX5_SEMAPHORE_SPACE_DOMAIN     = 0xA,
+	MLX5_SEMAPHORE_SW_RESET         = 0x20,
 };
 
 int mlx5_query_hca_caps(struct mlx5_core_dev *dev);
@@ -122,6 +123,7 @@ int mlx5_cmd_teardown_hca(struct mlx5_core_dev *dev);
 int mlx5_cmd_force_teardown_hca(struct mlx5_core_dev *dev);
 int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev);
 void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force);
+void mlx5_error_sw_reset(struct mlx5_core_dev *dev);
 void mlx5_disable_device(struct mlx5_core_dev *dev);
 void mlx5_recover_device(struct mlx5_core_dev *dev);
 int mlx5_sriov_init(struct mlx5_core_dev *dev);
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index fc2b6e807f06..2cfa2ec8b5d3 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -510,6 +510,10 @@ struct mlx5_cmd_layout {
 	u8		status_own;
 };
 
+enum mlx5_fatal_assert_bit_offsets {
+	MLX5_RFR_OFFSET = 31,
+};
+
 struct health_buffer {
 	__be32		assert_var[5];
 	__be32		rsvd0[3];
@@ -518,12 +522,16 @@ struct health_buffer {
 	__be32		rsvd1[2];
 	__be32		fw_ver;
 	__be32		hw_id;
-	__be32		rsvd2;
+	__be32		rfr;
 	u8		irisc_index;
 	u8		synd;
 	__be16		ext_synd;
 };
 
+enum mlx5_initializing_bit_offsets {
+	MLX5_FW_RESET_SUPPORTED_OFFSET = 30,
+};
+
 enum mlx5_cmd_addr_l_sz_offset {
 	MLX5_NIC_IFC_OFFSET = 8,
 };
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 086faa4d22bf..33c977db6ceb 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -584,6 +584,7 @@ struct mlx5_priv {
 };
 
 enum mlx5_device_state {
+	MLX5_DEVICE_STATE_UNINITIALIZED,
 	MLX5_DEVICE_STATE_UP,
 	MLX5_DEVICE_STATE_INTERNAL_ERROR,
 };
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 08/15] net/mlx5: Refactor print health info
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (6 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 07/15] net/mlx5: Issue SW reset on FW assert Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05 15:42   ` Jiri Pirko
  2019-05-05  0:33 ` [net-next 09/15] net/mlx5: Create FW devlink health reporter Saeed Mahameed
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Refactor print health info code, split to two functions:
 1. mlx5_get_health_info() - writes the health info into a buffer.
 2. mlx5_print_health_info() - prints the health info to kernel log.
This refactoring is done to enable using the health info data by devlink
health reporter diagnose() in the downstream patch.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 83 +++++++++++++++----
 include/linux/mlx5/driver.h                   |  4 +
 2 files changed, 70 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 19d9297682d7..a3c7e46aafd9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -357,7 +357,28 @@ static const char *hsynd_str(u8 synd)
 	}
 }
 
-static void print_health_info(struct mlx5_core_dev *dev)
+#define HEALTH_INFO_MAX_BUFF 1024
+static void mlx5_health_info_buf_reset(struct mlx5_core_dev *dev)
+{
+	dev->priv.health.info_buf_len = 0;
+}
+
+static void
+mlx5_health_info_buf_write(struct mlx5_core_dev *dev, const char *fmt, ...)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+	va_list args;
+	int len;
+
+	va_start(args, fmt);
+	len = vsnprintf(health->info_buf + health->info_buf_len,
+			HEALTH_INFO_MAX_BUFF - health->info_buf_len, fmt, args);
+	va_end(args);
+	health->info_buf_len = min_t(int, health->info_buf_len + len,
+				     HEALTH_INFO_MAX_BUFF);
+}
+
+static void mlx5_get_health_info(struct mlx5_core_dev *dev, u8 *synd)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct health_buffer __iomem *h = health->health;
@@ -365,27 +386,46 @@ static void print_health_info(struct mlx5_core_dev *dev)
 	u32 fw;
 	int i;
 
+	*synd = ioread8(&h->synd);
 	/* If the syndrome is 0, the device is OK and no need to print buffer */
-	if (!ioread8(&h->synd))
+	if (!synd)
 		return;
 
+	mlx5_health_info_buf_reset(dev);
+	mlx5_health_info_buf_write(dev, "\n");
 	for (i = 0; i < ARRAY_SIZE(h->assert_var); i++)
-		mlx5_core_err(dev, "assert_var[%d] 0x%08x\n", i,
-			      ioread32be(h->assert_var + i));
+		mlx5_health_info_buf_write(dev, "assert_var[%d] 0x%08x\n", i,
+					   ioread32be(h->assert_var + i));
 
-	mlx5_core_err(dev, "assert_exit_ptr 0x%08x\n",
-		      ioread32be(&h->assert_exit_ptr));
-	mlx5_core_err(dev, "assert_callra 0x%08x\n",
-		      ioread32be(&h->assert_callra));
+	mlx5_health_info_buf_write(dev, "assert_exit_ptr 0x%08x\n",
+				   ioread32be(&h->assert_exit_ptr));
+	mlx5_health_info_buf_write(dev, "assert_callra 0x%08x\n",
+				   ioread32be(&h->assert_callra));
 	sprintf(fw_str, "%d.%d.%d", fw_rev_maj(dev), fw_rev_min(dev), fw_rev_sub(dev));
-	mlx5_core_err(dev, "fw_ver %s\n", fw_str);
-	mlx5_core_err(dev, "hw_id 0x%08x\n", ioread32be(&h->hw_id));
-	mlx5_core_err(dev, "irisc_index %d\n", ioread8(&h->irisc_index));
-	mlx5_core_err(dev, "synd 0x%x: %s\n", ioread8(&h->synd),
-		      hsynd_str(ioread8(&h->synd)));
-	mlx5_core_err(dev, "ext_synd 0x%04x\n", ioread16be(&h->ext_synd));
+	mlx5_health_info_buf_write(dev, "fw_ver %s\n", fw_str);
+	mlx5_health_info_buf_write(dev, "hw_id 0x%08x\n", ioread32be(&h->hw_id));
+	mlx5_health_info_buf_write(dev, "irisc_index %d\n", ioread8(&h->irisc_index));
+	mlx5_health_info_buf_write(dev, "synd 0x%x: %s\n", ioread8(&h->synd),
+				   hsynd_str(ioread8(&h->synd)));
+	mlx5_health_info_buf_write(dev, "ext_synd 0x%04x\n", ioread16be(&h->ext_synd));
 	fw = ioread32be(&h->fw_ver);
-	mlx5_core_err(dev, "raw fw_ver 0x%08x\n", fw);
+	mlx5_health_info_buf_write(dev, "raw fw_ver 0x%08x\n", fw);
+}
+
+static void mlx5_print_health_info(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+	u8 synd;
+
+	mutex_lock(&health->info_buf_lock);
+	mlx5_get_health_info(dev, &synd);
+
+	if (!synd)
+		goto unlock;
+
+	mlx5_core_err(dev, "%s", health->info_buf);
+unlock:
+	mutex_unlock(&health->info_buf_lock);
 }
 
 static unsigned long get_next_poll_jiffies(void)
@@ -431,7 +471,7 @@ static void poll_health(struct timer_list *t)
 	health->prev = count;
 	if (health->miss_counter == MAX_MISSES) {
 		mlx5_core_err(dev, "device's health compromised - reached miss count\n");
-		print_health_info(dev);
+		mlx5_print_health_info(dev);
 	}
 
 	fatal_error = check_fatal_sensors(dev);
@@ -439,7 +479,7 @@ static void poll_health(struct timer_list *t)
 	if (fatal_error && !health->fatal_error) {
 		mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error);
 		dev->priv.health.fatal_error = fatal_error;
-		print_health_info(dev);
+		mlx5_print_health_info(dev);
 		mlx5_trigger_health_work(dev);
 	}
 
@@ -497,6 +537,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 
+	kfree(health->info_buf);
 	destroy_workqueue(health->wq);
 }
 
@@ -519,6 +560,14 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->work, health_care);
 	health->crdump = NULL;
+	health->info_buf = kmalloc(HEALTH_INFO_MAX_BUFF, GFP_KERNEL);
+	if (!health->info_buf)
+		goto err_alloc_buff;
+	mutex_init(&health->info_buf_lock);
 
 	return 0;
+
+err_alloc_buff:
+	destroy_workqueue(health->wq);
+	return -ENOMEM;
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 33c977db6ceb..df8f4c4e21c6 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -444,6 +444,10 @@ struct mlx5_core_health {
 	struct work_struct		work;
 	struct delayed_work		recover_work;
 	struct mlx5_fw_crdump	       *crdump;
+	char			       *info_buf;
+	int				info_buf_len;
+	/* protect info buf access */
+	struct mutex			info_buf_lock;
 };
 
 struct mlx5_qp_table {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (7 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 08/15] net/mlx5: Refactor print health info Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05 15:42   ` Jiri Pirko
  2019-05-07  0:11   ` Jakub Kicinski
  2019-05-05  0:33 ` [net-next 10/15] net/mlx5: Add core dump register access functions Saeed Mahameed
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
implements devlink_health_reporter diagnose callback.

The fw reporter diagnose command can be triggered any time by the user
to check current fw status.
In healthy status, it will return clear syndrome. Otherwise it will dump
the health info buffer.

Command example and output on healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 0

Command example and output on non healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
diagnose data:
assert_var[0] 0xfc3fc043
assert_var[1] 0x0001b41c
assert_var[2] 0x00000000
assert_var[3] 0x00000000
assert_var[4] 0x00000000
assert_exit_ptr 0x008033b4
assert_callra 0x0080365c
fw_ver 16.24.1000
hw_id 0x0000020d
irisc_index 0
synd 0x8: unrecoverable hardware error
ext_synd 0x003d
raw fw_ver 0x101803e8

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 55 +++++++++++++++++++
 include/linux/mlx5/driver.h                   |  1 +
 2 files changed, 56 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index a3c7e46aafd9..9ffa9c7f81a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -428,6 +428,58 @@ static void mlx5_print_health_info(struct mlx5_core_dev *dev)
 	mutex_unlock(&health->info_buf_lock);
 }
 
+static int
+mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
+			  struct devlink_fmsg *fmsg)
+{
+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
+	struct mlx5_core_health *health = &dev->priv.health;
+	u8 synd;
+	int err;
+
+	mutex_lock(&health->info_buf_lock);
+	mlx5_get_health_info(dev, &synd);
+
+	if (!synd) {
+		mutex_unlock(&health->info_buf_lock);
+		return devlink_fmsg_u8_pair_put(fmsg, "Syndrome", synd);
+	}
+
+	err = devlink_fmsg_string_pair_put(fmsg, "diagnose data",
+					   health->info_buf);
+
+	mutex_unlock(&health->info_buf_lock);
+	return err;
+}
+
+static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
+		.name = "fw",
+		.diagnose = mlx5_fw_reporter_diagnose,
+};
+
+static void mlx5_fw_reporter_create(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+	struct devlink *devlink = priv_to_devlink(dev);
+
+	health->fw_reporter =
+		devlink_health_reporter_create(devlink, &mlx5_fw_reporter_ops,
+					       0, false, dev);
+	if (IS_ERR(health->fw_reporter))
+		mlx5_core_warn(dev, "Failed to create fw reporter, err = %ld\n",
+			       PTR_ERR(health->fw_reporter));
+}
+
+static void mlx5_fw_reporter_destroy(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+
+	if (IS_ERR_OR_NULL(health->fw_reporter))
+		return;
+
+	devlink_health_reporter_destroy(health->fw_reporter);
+}
+
 static unsigned long get_next_poll_jiffies(void)
 {
 	unsigned long next;
@@ -539,6 +591,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev)
 
 	kfree(health->info_buf);
 	destroy_workqueue(health->wq);
+	mlx5_fw_reporter_destroy(dev);
 }
 
 int mlx5_health_init(struct mlx5_core_dev *dev)
@@ -565,6 +618,8 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 		goto err_alloc_buff;
 	mutex_init(&health->info_buf_lock);
 
+	mlx5_fw_reporter_create(dev);
+
 	return 0;
 
 err_alloc_buff:
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index df8f4c4e21c6..a362aa6c799c 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -448,6 +448,7 @@ struct mlx5_core_health {
 	int				info_buf_len;
 	/* protect info buf access */
 	struct mutex			info_buf_lock;
+	struct devlink_health_reporter *fw_reporter;
 };
 
 struct mlx5_qp_table {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 10/15] net/mlx5: Add core dump register access functions
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (8 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 09/15] net/mlx5: Create FW devlink health reporter Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 11/15] net/mlx5: Add support for FW reporter dump Saeed Mahameed
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Add access functions to core dump register to enable trigger FW core
dump.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../mellanox/mlx5/core/diag/fw_tracer.c       | 34 +++++++++++++++++++
 include/linux/mlx5/driver.h                   |  1 +
 include/linux/mlx5/mlx5_ifc.h                 | 17 +++++++++-
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
index 6999f4486e9e..56025797cd1e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
@@ -786,6 +786,40 @@ static void mlx5_fw_tracer_ownership_change(struct work_struct *work)
 	mlx5_fw_tracer_start(tracer);
 }
 
+static int mlx5_fw_tracer_set_core_dump_reg(struct mlx5_core_dev *dev,
+					    u32 *in, int size_in)
+{
+	u32 out[MLX5_ST_SZ_DW(core_dump_reg)] = {};
+
+	if (!MLX5_CAP_DEBUG(dev, core_dump_general) &&
+	    !MLX5_CAP_DEBUG(dev, core_dump_qp))
+		return -EOPNOTSUPP;
+
+	return mlx5_core_access_reg(dev, in, size_in, out, sizeof(out),
+				    MLX5_REG_CORE_DUMP, 0, 1);
+}
+
+int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev)
+{
+	struct mlx5_fw_tracer *tracer = dev->tracer;
+	u32 in[MLX5_ST_SZ_DW(core_dump_reg)] = {};
+	int err;
+
+	if (!MLX5_CAP_DEBUG(dev, core_dump_general) || !tracer)
+		return -EOPNOTSUPP;
+	if (!tracer->owner)
+		return -EPERM;
+
+	MLX5_SET(core_dump_reg, in, core_dump_type, 0x0);
+
+	err =  mlx5_fw_tracer_set_core_dump_reg(dev, in, sizeof(in));
+	if (err)
+		return err;
+	queue_work(tracer->work_queue, &tracer->handle_traces_work);
+	flush_workqueue(tracer->work_queue);
+	return 0;
+}
+
 /* Create software resources (Buffers, etc ..) */
 struct mlx5_fw_tracer *mlx5_fw_tracer_create(struct mlx5_core_dev *dev)
 {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index a362aa6c799c..ebda70984601 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -108,6 +108,7 @@ enum {
 	MLX5_REG_FPGA_CAP	 = 0x4022,
 	MLX5_REG_FPGA_CTRL	 = 0x4023,
 	MLX5_REG_FPGA_ACCESS_REG = 0x4024,
+	MLX5_REG_CORE_DUMP	 = 0x402e,
 	MLX5_REG_PCAP		 = 0x5001,
 	MLX5_REG_PMTU		 = 0x5003,
 	MLX5_REG_PTYS		 = 0x5004,
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 82612741b29e..9baee29b7124 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -715,7 +715,9 @@ struct mlx5_ifc_qos_cap_bits {
 };
 
 struct mlx5_ifc_debug_cap_bits {
-	u8         reserved_at_0[0x20];
+	u8         core_dump_general[0x1];
+	u8         core_dump_qp[0x1];
+	u8         reserved_at_2[0x1e];
 
 	u8         reserved_at_20[0x2];
 	u8         stall_detect[0x1];
@@ -2531,6 +2533,7 @@ union mlx5_ifc_hca_cap_union_bits {
 	struct mlx5_ifc_e_switch_cap_bits e_switch_cap;
 	struct mlx5_ifc_vector_calc_cap_bits vector_calc_cap;
 	struct mlx5_ifc_qos_cap_bits qos_cap;
+	struct mlx5_ifc_debug_cap_bits debug_cap;
 	struct mlx5_ifc_fpga_cap_bits fpga_cap;
 	u8         reserved_at_0[0x8000];
 };
@@ -8546,6 +8549,18 @@ struct mlx5_ifc_qcam_reg_bits {
 	u8         reserved_at_1c0[0x80];
 };
 
+struct mlx5_ifc_core_dump_reg_bits {
+	u8         reserved_at_0[0x18];
+	u8         core_dump_type[0x8];
+
+	u8         reserved_at_20[0x30];
+	u8         vhca_id[0x10];
+
+	u8         reserved_at_60[0x8];
+	u8         qpn[0x18];
+	u8         reserved_at_80[0x180];
+};
+
 struct mlx5_ifc_pcap_reg_bits {
 	u8         reserved_at_0[0x8];
 	u8         local_port[0x8];
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 11/15] net/mlx5: Add support for FW reporter dump
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (9 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 10/15] net/mlx5: Add core dump register access functions Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05 15:49   ` Jiri Pirko
  2019-05-05  0:33 ` [net-next 12/15] net/mlx5: Report devlink health on FW issues Saeed Mahameed
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Add support of dump callback for mlx5 FW reporter.
Once we trigger FW dump, the FW will write the core dump to its raw data
buffer. The tracer translates the raw data to traces and save it to a
buffer. Once dump is done, the saved traces data is filled as objects
into the dump buffer.

FW dump example:
$ devlink health dump show pci/0000:82:00.1 reporter fw
dump traces:
   trace: 0000:82:00.1 [0x69cd6c5283e] 0 [0xb8] dump general info
GVMI=0x0001
   trace: 0000:82:00.1 [0x69cd6c53bec] 0 [0xb8] GVMI management info,
gvmi_management context:
   trace: 0000:82:00.1 [0x69cd6c55eff] 0 [0xb8] [000]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c5657f] 0 [0xb8] [010]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c56608] 0 [0xb8] [020]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c566ff] 0 [0xb8] [030]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c5677f] 0 [0xb8] [040]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c5687f] 0 [0xb8] [050]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c568ff] 0 [0xb8] [060]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c569a5] 0 [0xb8] [070]:  00000000
00000000  00000000  00000000
   trace: 0000:82:00.1 [0x69cd6c57021] 0 [0xb8] CMDIF dbase from IRON:
active_dbase_slots = 0x00000000
   trace: 0000:82:00.1 [0x69cd6c58dae] 0 [0xb8] GVMI=0x0001 hw_toc
context:
   trace: 0000:82:00.1 [0x69cd6c58e7f] 0 [0xb8] [000]:  00400100
00000000  00000000  fffff000
   trace: 0000:82:00.1 [0x69cd6c58f7f] 0 [0xb8] [010]:  00000000
00000000  00000000  00000000
...
...

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../mellanox/mlx5/core/diag/fw_tracer.c       | 109 ++++++++++++++++++
 .../mellanox/mlx5/core/diag/fw_tracer.h       |  14 +++
 .../net/ethernet/mellanox/mlx5/core/health.c  |  46 ++++++++
 3 files changed, 169 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
index 56025797cd1e..8c3e6727a984 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
@@ -243,6 +243,27 @@ static int mlx5_fw_tracer_allocate_strings_db(struct mlx5_fw_tracer *tracer)
 	return -ENOMEM;
 }
 
+static int
+mlx5_fw_tracer_allocate_saved_traces_buff(struct mlx5_fw_tracer *tracer)
+{
+	int traces_buff_size = SAVED_TRACES_BUFFER_SIZE_BYTE;
+
+	tracer->sbuff.traces_buff = kzalloc(traces_buff_size, GFP_KERNEL);
+	if (!tracer->sbuff.traces_buff)
+		return -ENOMEM;
+	tracer->sbuff.saved_traces_index = 0;
+	mutex_init(&tracer->sbuff.lock);
+
+	return 0;
+}
+
+static void
+mlx5_fw_tracer_free_saved_traces_buff(struct mlx5_fw_tracer *tracer)
+{
+	kfree(tracer->sbuff.traces_buff);
+	tracer->sbuff.traces_buff = NULL;
+}
+
 static void mlx5_tracer_read_strings_db(struct work_struct *work)
 {
 	struct mlx5_fw_tracer *tracer = container_of(work, struct mlx5_fw_tracer,
@@ -522,6 +543,24 @@ static void mlx5_fw_tracer_clean_ready_list(struct mlx5_fw_tracer *tracer)
 		list_del(&str_frmt->list);
 }
 
+static void mlx5_fw_tracer_save_trace(struct mlx5_fw_tracer *tracer,
+				      u64 timestamp, bool lost,
+				      u8 event_id, char *msg)
+{
+	char *saved_traces = tracer->sbuff.traces_buff;
+	u32 offset;
+
+	mutex_lock(&tracer->sbuff.lock);
+	offset = tracer->sbuff.saved_traces_index * TRACE_STR_LINE;
+	snprintf(saved_traces + offset, TRACE_STR_LINE,
+		 "%s [0x%llx] %d [0x%x] %s", dev_name(&tracer->dev->pdev->dev),
+		 timestamp, lost, event_id, msg);
+
+	tracer->sbuff.saved_traces_index =
+		(tracer->sbuff.saved_traces_index + 1) & (SAVED_TRACES_NUM - 1);
+	mutex_unlock(&tracer->sbuff.lock);
+}
+
 static void mlx5_tracer_print_trace(struct tracer_string_format *str_frmt,
 				    struct mlx5_core_dev *dev,
 				    u64 trace_timestamp)
@@ -540,6 +579,9 @@ static void mlx5_tracer_print_trace(struct tracer_string_format *str_frmt,
 	trace_mlx5_fw(dev->tracer, trace_timestamp, str_frmt->lost,
 		      str_frmt->event_id, tmp);
 
+	mlx5_fw_tracer_save_trace(dev->tracer, trace_timestamp,
+				  str_frmt->lost, str_frmt->event_id, tmp);
+
 	/* remove it from hash */
 	mlx5_tracer_clean_message(str_frmt);
 }
@@ -820,6 +862,64 @@ int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev)
 	return 0;
 }
 
+static int
+mlx5_devlink_fmsg_fill_trace(struct devlink_fmsg *fmsg,
+			     char *trace)
+{
+	int err;
+
+	err = devlink_fmsg_obj_nest_start(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_string_pair_put(fmsg, "trace", trace);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_obj_nest_end(fmsg);
+	if (err)
+		return err;
+	return 0;
+}
+
+int mlx5_fw_tracer_get_saved_traces_objects(struct mlx5_fw_tracer *tracer,
+					    struct devlink_fmsg *fmsg)
+{
+	char *saved_traces = tracer->sbuff.traces_buff;
+	u32 index, start_index, end_index;
+	u32 saved_traces_index;
+	int err;
+
+	if (!saved_traces[0])
+		return -ENOMSG;
+
+	mutex_lock(&tracer->sbuff.lock);
+	saved_traces_index = tracer->sbuff.saved_traces_index;
+	if (saved_traces[saved_traces_index * TRACE_STR_LINE])
+		start_index = saved_traces_index;
+	else
+		start_index = 0;
+	end_index = (saved_traces_index - 1) & (SAVED_TRACES_NUM - 1);
+
+	err = devlink_fmsg_arr_pair_nest_start(fmsg, "dump traces");
+	if (err)
+		goto unlock;
+	index = start_index;
+	while (index != end_index) {
+		err = mlx5_devlink_fmsg_fill_trace(fmsg,
+						   saved_traces + index * TRACE_STR_LINE);
+		if (err)
+			goto unlock;
+
+		index = (index + 1) & (SAVED_TRACES_NUM - 1);
+	}
+
+	err = devlink_fmsg_arr_pair_nest_end(fmsg);
+unlock:
+	mutex_unlock(&tracer->sbuff.lock);
+	return err;
+}
+
 /* Create software resources (Buffers, etc ..) */
 struct mlx5_fw_tracer *mlx5_fw_tracer_create(struct mlx5_core_dev *dev)
 {
@@ -867,10 +967,18 @@ struct mlx5_fw_tracer *mlx5_fw_tracer_create(struct mlx5_core_dev *dev)
 		goto free_log_buf;
 	}
 
+	err = mlx5_fw_tracer_allocate_saved_traces_buff(tracer);
+	if (err) {
+		mlx5_core_warn(dev, "FWTracer: Create saved traces buffer failed %d\n", err);
+		goto free_strings_db;
+	}
+
 	mlx5_core_dbg(dev, "FWTracer: Tracer created\n");
 
 	return tracer;
 
+free_strings_db:
+	mlx5_fw_tracer_free_strings_db(tracer);
 free_log_buf:
 	mlx5_fw_tracer_destroy_log_buf(tracer);
 destroy_workqueue:
@@ -951,6 +1059,7 @@ void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer)
 	cancel_work_sync(&tracer->read_fw_strings_work);
 	mlx5_fw_tracer_clean_ready_list(tracer);
 	mlx5_fw_tracer_clean_print_hash(tracer);
+	mlx5_fw_tracer_free_saved_traces_buff(tracer);
 	mlx5_fw_tracer_free_strings_db(tracer);
 	mlx5_fw_tracer_destroy_log_buf(tracer);
 	flush_workqueue(tracer->work_queue);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
index a8b8747f2b61..9dcf40a43399 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
@@ -46,6 +46,10 @@
 #define TRACER_BLOCK_SIZE_BYTE 256
 #define TRACES_PER_BLOCK 32
 
+#define TRACE_STR_LINE 256
+#define SAVED_TRACES_NUM 1024
+#define SAVED_TRACES_BUFFER_SIZE_BYTE (SAVED_TRACES_NUM * TRACE_STR_LINE)
+
 #define TRACER_MAX_PARAMS 7
 #define MESSAGE_HASH_BITS 6
 #define MESSAGE_HASH_SIZE BIT(MESSAGE_HASH_BITS)
@@ -83,6 +87,13 @@ struct mlx5_fw_tracer {
 		u32 consumer_index;
 	} buff;
 
+	/* Saved Traces Buffer */
+	struct {
+		void *traces_buff;
+		u32 saved_traces_index;
+		struct mutex lock; /* Protect sbuff access */
+	} sbuff;
+
 	u64 last_timestamp;
 	struct work_struct handle_traces_work;
 	struct hlist_head hash[MESSAGE_HASH_SIZE];
@@ -171,5 +182,8 @@ struct mlx5_fw_tracer *mlx5_fw_tracer_create(struct mlx5_core_dev *dev);
 int mlx5_fw_tracer_init(struct mlx5_fw_tracer *tracer);
 void mlx5_fw_tracer_cleanup(struct mlx5_fw_tracer *tracer);
 void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer);
+int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev);
+int mlx5_fw_tracer_get_saved_traces_objects(struct mlx5_fw_tracer *tracer,
+					    struct devlink_fmsg *fmsg);
 
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 9ffa9c7f81a0..34b8252afad5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -41,6 +41,7 @@
 #include "lib/eq.h"
 #include "lib/mlx5.h"
 #include "lib/pci_vsc.h"
+#include "diag/fw_tracer.h"
 
 enum {
 	MLX5_HEALTH_POLL_INTERVAL	= 2 * HZ,
@@ -452,9 +453,54 @@ mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
 	return err;
 }
 
+struct mlx5_fw_reporter_ctx {
+	u8 err_synd;
+	int miss_counter;
+};
+
+static int
+mlx5_fw_reporter_ctx_pairs_put(struct devlink_fmsg *fmsg,
+			       struct mlx5_fw_reporter_ctx *fw_reporter_ctx)
+{
+	int err;
+
+	err = devlink_fmsg_u8_pair_put(fmsg, "Syndrome",
+				       fw_reporter_ctx->err_synd);
+	if (err)
+		return err;
+	err = devlink_fmsg_u32_pair_put(fmsg, "fw_miss_counter",
+					fw_reporter_ctx->miss_counter);
+	if (err)
+		return err;
+	return 0;
+}
+
+static int
+mlx5_fw_reporter_dump(struct devlink_health_reporter *reporter,
+		      struct devlink_fmsg *fmsg, void *priv_ctx)
+{
+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
+	int err;
+
+	err = mlx5_fw_tracer_trigger_core_dump_general(dev);
+	if (err)
+		return err;
+
+	if (priv_ctx) {
+		struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
+
+		err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
+		if (err)
+			return err;
+	}
+
+	return mlx5_fw_tracer_get_saved_traces_objects(dev->tracer, fmsg);
+}
+
 static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
 		.name = "fw",
 		.diagnose = mlx5_fw_reporter_diagnose,
+		.dump = mlx5_fw_reporter_dump,
 };
 
 static void mlx5_fw_reporter_create(struct mlx5_core_dev *dev)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 12/15] net/mlx5: Report devlink health on FW issues
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (10 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 11/15] net/mlx5: Add support for FW reporter dump Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 13/15] net/mlx5: Add fw fatal devlink health reporter Saeed Mahameed
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Use devlink_health_report() to report any symptom of FW issue as FW
counter miss or new health syndrom.
The FW issues detected in mlx5 during poll_health which is called in
timer atomic context and so health work queue is used to schedule the
reports.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 33 +++++++++++++++++++
 include/linux/mlx5/driver.h                   |  2 ++
 2 files changed, 35 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 34b8252afad5..03b9fc9ebd6e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -497,6 +497,29 @@ mlx5_fw_reporter_dump(struct devlink_health_reporter *reporter,
 	return mlx5_fw_tracer_get_saved_traces_objects(dev->tracer, fmsg);
 }
 
+static void mlx5_fw_reporter_err_work(struct work_struct *work)
+{
+	struct mlx5_fw_reporter_ctx fw_reporter_ctx;
+	struct mlx5_core_health *health;
+
+	health = container_of(work, struct mlx5_core_health, report_work);
+
+	if (IS_ERR_OR_NULL(health->fw_reporter))
+		return;
+
+	fw_reporter_ctx.err_synd = health->synd;
+	fw_reporter_ctx.miss_counter = health->miss_counter;
+	if (fw_reporter_ctx.err_synd) {
+		devlink_health_report(health->fw_reporter,
+				      "FW syndrome reported", &fw_reporter_ctx);
+		return;
+	}
+	if (fw_reporter_ctx.miss_counter)
+		devlink_health_report(health->fw_reporter,
+				      "FW miss counter reported",
+				      &fw_reporter_ctx);
+}
+
 static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
 		.name = "fw",
 		.diagnose = mlx5_fw_reporter_diagnose,
@@ -554,8 +577,10 @@ static void poll_health(struct timer_list *t)
 {
 	struct mlx5_core_dev *dev = from_timer(dev, t, priv.health.timer);
 	struct mlx5_core_health *health = &dev->priv.health;
+	struct health_buffer __iomem *h = health->health;
 	u32 fatal_error;
 	u32 count;
+	u8 prev_synd;
 
 	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
 		goto out;
@@ -570,8 +595,14 @@ static void poll_health(struct timer_list *t)
 	if (health->miss_counter == MAX_MISSES) {
 		mlx5_core_err(dev, "device's health compromised - reached miss count\n");
 		mlx5_print_health_info(dev);
+		queue_work(health->wq, &health->report_work);
 	}
 
+	prev_synd = health->synd;
+	health->synd = ioread8(&h->synd);
+	if (health->synd && health->synd != prev_synd)
+		queue_work(health->wq, &health->report_work);
+
 	fatal_error = check_fatal_sensors(dev);
 
 	if (fatal_error && !health->fatal_error) {
@@ -621,6 +652,7 @@ void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 	spin_lock_irqsave(&health->wq_lock, flags);
 	set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
 	spin_unlock_irqrestore(&health->wq_lock, flags);
+	cancel_work_sync(&health->report_work);
 	cancel_work_sync(&health->work);
 }
 
@@ -658,6 +690,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 		return -ENOMEM;
 	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->work, health_care);
+	INIT_WORK(&health->report_work, mlx5_fw_reporter_err_work);
 	health->crdump = NULL;
 	health->info_buf = kmalloc(HEALTH_INFO_MAX_BUFF, GFP_KERNEL);
 	if (!health->info_buf)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index ebda70984601..604079b4706c 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -437,12 +437,14 @@ struct mlx5_core_health {
 	struct timer_list		timer;
 	u32				prev;
 	int				miss_counter;
+	u8				synd;
 	u32				fatal_error;
 	/* wq spinlock to synchronize draining */
 	spinlock_t			wq_lock;
 	struct workqueue_struct	       *wq;
 	unsigned long			flags;
 	struct work_struct		work;
+	struct work_struct		report_work;
 	struct delayed_work		recover_work;
 	struct mlx5_fw_crdump	       *crdump;
 	char			       *info_buf;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 13/15] net/mlx5: Add fw fatal devlink health reporter
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (11 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 12/15] net/mlx5: Report devlink health on FW issues Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05  0:33 ` [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump Saeed Mahameed
  2019-05-05  0:33 ` [net-next 15/15] net/mlx5: Report devlink health on FW fatal issues Saeed Mahameed
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Create mlx5_devlink_health_reporter for fw fatal reporter.
The fw fatal reporter is added in addition to the fw reporter and
implements the recover callback.
The point of having two reporters for FW issues, is that we
don't want to run FW recover on any issue, but only fatal ones.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 70 ++++++++++++++-----
 include/linux/mlx5/driver.h                   |  1 +
 2 files changed, 54 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 03b9fc9ebd6e..e64f0e32cd67 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -301,31 +301,43 @@ static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
 
 /* How much time to wait until health resetting the driver (in msecs) */
 #define MLX5_RECOVERY_WAIT_MSECS 60000
-static void health_care(struct work_struct *work)
+static int mlx5_health_care(struct mlx5_core_dev *dev)
 {
-	struct mlx5_core_health *health;
-	struct mlx5_core_dev *dev;
-	struct mlx5_priv *priv;
 	unsigned long end;
 
-	health = container_of(work, struct mlx5_core_health, work);
-	priv = container_of(health, struct mlx5_priv, health);
-	dev = container_of(priv, struct mlx5_core_dev, priv);
 	mlx5_core_warn(dev, "handling bad device here\n");
 	mlx5_handle_bad_state(dev);
-
 	end = jiffies + msecs_to_jiffies(MLX5_RECOVERY_WAIT_MSECS);
 	while (sensor_pci_not_working(dev)) {
 		if (time_after(jiffies, end)) {
 			mlx5_core_err(dev,
 				      "health recovery flow aborted, PCI reads still not working\n");
-			return;
+			return -EIO;
 		}
 		msleep(100);
 	}
 
 	mlx5_core_err(dev, "starting health recovery flow\n");
 	mlx5_recover_device(dev);
+	if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state) ||
+	    check_fatal_sensors(dev)) {
+		mlx5_core_err(dev, "health recovery failed\n");
+		return -EIO;
+	}
+	return 0;
+}
+
+static void health_care_work(struct work_struct *work)
+{
+	struct mlx5_core_health *health;
+	struct mlx5_core_dev *dev;
+	struct mlx5_priv *priv;
+
+	health = container_of(work, struct mlx5_core_health, work);
+	priv = container_of(health, struct mlx5_priv, health);
+	dev = container_of(priv, struct mlx5_core_dev, priv);
+
+	mlx5_health_care(dev);
 }
 
 static const char *hsynd_str(u8 synd)
@@ -526,7 +538,22 @@ static const struct devlink_health_reporter_ops mlx5_fw_reporter_ops = {
 		.dump = mlx5_fw_reporter_dump,
 };
 
-static void mlx5_fw_reporter_create(struct mlx5_core_dev *dev)
+static int
+mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
+			       void *priv_ctx)
+{
+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
+
+	return mlx5_health_care(dev);
+}
+
+static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = {
+		.name = "fw_fatal",
+		.recover = mlx5_fw_fatal_reporter_recover,
+};
+
+#define MLX5_REPORTER_FW_GRACEFUL_PERIOD 1200000
+static void mlx5_fw_reporters_create(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct devlink *devlink = priv_to_devlink(dev);
@@ -537,16 +564,25 @@ static void mlx5_fw_reporter_create(struct mlx5_core_dev *dev)
 	if (IS_ERR(health->fw_reporter))
 		mlx5_core_warn(dev, "Failed to create fw reporter, err = %ld\n",
 			       PTR_ERR(health->fw_reporter));
+
+	health->fw_fatal_reporter =
+		devlink_health_reporter_create(devlink, &mlx5_fw_fatal_reporter_ops,
+					       MLX5_REPORTER_FW_GRACEFUL_PERIOD,
+					       true, dev);
+	if (IS_ERR(health->fw_fatal_reporter))
+		mlx5_core_warn(dev, "Failed to create fw fatal reporter, err = %ld\n",
+			       PTR_ERR(health->fw_fatal_reporter));
 }
 
-static void mlx5_fw_reporter_destroy(struct mlx5_core_dev *dev)
+static void mlx5_fw_reporters_destroy(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 
-	if (IS_ERR_OR_NULL(health->fw_reporter))
-		return;
+	if (!IS_ERR_OR_NULL(health->fw_reporter))
+		devlink_health_reporter_destroy(health->fw_reporter);
 
-	devlink_health_reporter_destroy(health->fw_reporter);
+	if (!IS_ERR_OR_NULL(health->fw_fatal_reporter))
+		devlink_health_reporter_destroy(health->fw_fatal_reporter);
 }
 
 static unsigned long get_next_poll_jiffies(void)
@@ -669,7 +705,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev)
 
 	kfree(health->info_buf);
 	destroy_workqueue(health->wq);
-	mlx5_fw_reporter_destroy(dev);
+	mlx5_fw_reporters_destroy(dev);
 }
 
 int mlx5_health_init(struct mlx5_core_dev *dev)
@@ -689,7 +725,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	if (!health->wq)
 		return -ENOMEM;
 	spin_lock_init(&health->wq_lock);
-	INIT_WORK(&health->work, health_care);
+	INIT_WORK(&health->work, health_care_work);
 	INIT_WORK(&health->report_work, mlx5_fw_reporter_err_work);
 	health->crdump = NULL;
 	health->info_buf = kmalloc(HEALTH_INFO_MAX_BUFF, GFP_KERNEL);
@@ -697,7 +733,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 		goto err_alloc_buff;
 	mutex_init(&health->info_buf_lock);
 
-	mlx5_fw_reporter_create(dev);
+	mlx5_fw_reporters_create(dev);
 
 	return 0;
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 604079b4706c..6f65787bf91b 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -452,6 +452,7 @@ struct mlx5_core_health {
 	/* protect info buf access */
 	struct mutex			info_buf_lock;
 	struct devlink_health_reporter *fw_reporter;
+	struct devlink_health_reporter *fw_fatal_reporter;
 };
 
 struct mlx5_qp_table {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (12 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 13/15] net/mlx5: Add fw fatal devlink health reporter Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  2019-05-05 15:52   ` Jiri Pirko
  2019-05-05  0:33 ` [net-next 15/15] net/mlx5: Report devlink health on FW fatal issues Saeed Mahameed
  14 siblings, 1 reply; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jiri Pirko, Moshe Shemesh, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Add support of dump callback for mlx5 FW fatal reporter.
The FW fatal dump use cr-dump functionality to gather cr-space data for
debug. The cr-dump uses vsc interface which is valid even if the FW
command interface is not functional, which is the case in most FW fatal
errors.
The cr-dump is stored as a memory region snapshot to ease read by
address.

Command example and output:
$ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
devlink_region_name: cr-space snapshot_id: 1

$ devlink region read pci/0000:82:00.0/cr-space snapshot 1 address 983064 length 8
00000000000f0018 e1 03 00 00 fb ae a9 3f

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 39 +++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index e64f0e32cd67..5271c88ef64c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -547,9 +547,48 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
 	return mlx5_health_care(dev);
 }
 
+static int
+mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
+			    struct devlink_fmsg *fmsg, void *priv_ctx)
+{
+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
+	char crdump_region[20];
+	u32 snapshot_id;
+	int err;
+
+	if (!mlx5_core_is_pf(dev)) {
+		mlx5_core_err(dev, "Only PF is permitted run FW fatal dump\n");
+		return -EPERM;
+	}
+
+	err = mlx5_crdump_collect(dev, crdump_region, &snapshot_id);
+	if (err)
+		return err;
+
+	if (priv_ctx) {
+		struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
+
+		err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
+		if (err)
+			return err;
+	}
+
+	err = devlink_fmsg_string_pair_put(fmsg, "devlink_region_name",
+					   crdump_region);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u32_pair_put(fmsg, "snapshot_id", snapshot_id);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = {
 		.name = "fw_fatal",
 		.recover = mlx5_fw_fatal_reporter_recover,
+		.dump = mlx5_fw_fatal_reporter_dump,
 };
 
 #define MLX5_REPORTER_FW_GRACEFUL_PERIOD 1200000
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [net-next 15/15] net/mlx5: Report devlink health on FW fatal issues
  2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
                   ` (13 preceding siblings ...)
  2019-05-05  0:33 ` [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump Saeed Mahameed
@ 2019-05-05  0:33 ` Saeed Mahameed
  14 siblings, 0 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-05  0:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jiri Pirko, Moshe Shemesh, Saeed Mahameed

From: Moshe Shemesh <moshe@mellanox.com>

Report devlink health on FW fatal issues via fw_fatal_reporter. The
driver recover flow for FW fatal error is now being handled by the
devlink health.

Having the recovery controlled by devlink health, the user has the
ability to cancel the auto-recovery for debug session and run it
manually.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 42 ++++++++++++-------
 .../net/ethernet/mellanox/mlx5/core/main.c    |  3 +-
 include/linux/mlx5/driver.h                   |  2 +-
 3 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 5271c88ef64c..e3c4e927945d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -327,19 +327,6 @@ static int mlx5_health_care(struct mlx5_core_dev *dev)
 	return 0;
 }
 
-static void health_care_work(struct work_struct *work)
-{
-	struct mlx5_core_health *health;
-	struct mlx5_core_dev *dev;
-	struct mlx5_priv *priv;
-
-	health = container_of(work, struct mlx5_core_health, work);
-	priv = container_of(health, struct mlx5_priv, health);
-	dev = container_of(priv, struct mlx5_core_dev, priv);
-
-	mlx5_health_care(dev);
-}
-
 static const char *hsynd_str(u8 synd)
 {
 	switch (synd) {
@@ -585,6 +572,29 @@ mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
 	return 0;
 }
 
+static void mlx5_fw_fatal_reporter_err_work(struct work_struct *work)
+{
+	struct mlx5_fw_reporter_ctx fw_reporter_ctx;
+	struct mlx5_core_health *health;
+	struct mlx5_core_dev *dev;
+	struct mlx5_priv *priv;
+
+	health = container_of(work, struct mlx5_core_health, fatal_report_work);
+	priv = container_of(health, struct mlx5_priv, health);
+	dev = container_of(priv, struct mlx5_core_dev, priv);
+
+	mlx5_enter_error_state(dev, false);
+	if (IS_ERR_OR_NULL(health->fw_fatal_reporter)) {
+		if (mlx5_health_care(dev))
+			mlx5_core_err(dev, "health recovery failed\n");
+		return;
+	}
+	fw_reporter_ctx.err_synd = health->synd;
+	fw_reporter_ctx.miss_counter = health->miss_counter;
+	devlink_health_report(health->fw_fatal_reporter,
+			      "FW fatal error reported", &fw_reporter_ctx);
+}
+
 static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = {
 		.name = "fw_fatal",
 		.recover = mlx5_fw_fatal_reporter_recover,
@@ -642,7 +652,7 @@ void mlx5_trigger_health_work(struct mlx5_core_dev *dev)
 
 	spin_lock_irqsave(&health->wq_lock, flags);
 	if (!test_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags))
-		queue_work(health->wq, &health->work);
+		queue_work(health->wq, &health->fatal_report_work);
 	else
 		mlx5_core_err(dev, "new health works are not permitted at this stage\n");
 	spin_unlock_irqrestore(&health->wq_lock, flags);
@@ -728,7 +738,7 @@ void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 	set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
 	spin_unlock_irqrestore(&health->wq_lock, flags);
 	cancel_work_sync(&health->report_work);
-	cancel_work_sync(&health->work);
+	cancel_work_sync(&health->fatal_report_work);
 }
 
 void mlx5_health_flush(struct mlx5_core_dev *dev)
@@ -764,7 +774,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	if (!health->wq)
 		return -ENOMEM;
 	spin_lock_init(&health->wq_lock);
-	INIT_WORK(&health->work, health_care_work);
+	INIT_WORK(&health->fatal_report_work, mlx5_fw_fatal_reporter_err_work);
 	INIT_WORK(&health->report_work, mlx5_fw_reporter_err_work);
 	health->crdump = NULL;
 	health->info_buf = kmalloc(HEALTH_INFO_MAX_BUFF, GFP_KERNEL);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c22ff9a58ec5..b1ad7369e014 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1367,7 +1367,8 @@ static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
 
 	mlx5_core_info(dev, "%s was called\n", __func__);
 
-	mlx5_enter_error_state(dev, false);
+	if (state)
+		mlx5_enter_error_state(dev, false);
 	mlx5_error_sw_reset(dev);
 	mlx5_unload_one(dev, false);
 	/* In case of kernel call drain the health wq */
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 6f65787bf91b..b5b5baca5aee 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -443,7 +443,7 @@ struct mlx5_core_health {
 	spinlock_t			wq_lock;
 	struct workqueue_struct	       *wq;
 	unsigned long			flags;
-	struct work_struct		work;
+	struct work_struct		fatal_report_work;
 	struct work_struct		report_work;
 	struct delayed_work		recover_work;
 	struct mlx5_fw_crdump	       *crdump;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [net-next 03/15] net/mlx5: Add Crdump FW snapshot support
  2019-05-05  0:33 ` [net-next 03/15] net/mlx5: Add Crdump FW snapshot support Saeed Mahameed
@ 2019-05-05 15:36   ` Jiri Pirko
  0 siblings, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2019-05-05 15:36 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Alex Vesker, Moshe Shemesh,
	Feras Daoud

Sun, May 05, 2019 at 02:33:00AM CEST, saeedm@mellanox.com wrote:
>From: Alex Vesker <valex@mellanox.com>
>
>Crdump allows the driver to create a snapshot of the FW PCI crspace.
>This is useful in case of catastrophic issues which may require FW
>reset. The snapshot can be used for later debug.
>
>The snapshot is exposed using devlink region_snapshot in downstream patch,
>cr-space address regions are registered on init and snapshots are attached
>once a new snapshot is collected by the driver.
>
>Signed-off-by: Alex Vesker <valex@mellanox.com>
>Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>Reviewed-by: Feras Daoud <ferasda@mellanox.com>
>Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>---
> .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
> .../ethernet/mellanox/mlx5/core/diag/crdump.c | 179 ++++++++++++++++++
> .../net/ethernet/mellanox/mlx5/core/health.c  |   1 +
> .../ethernet/mellanox/mlx5/core/lib/mlx5.h    |   4 +
> .../net/ethernet/mellanox/mlx5/core/main.c    |   5 +
> include/linux/mlx5/driver.h                   |   4 +
> 6 files changed, 194 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>index 34d9a079b608..5feed9e1bec7 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>@@ -16,7 +16,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
> 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
> 		fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
> 		lib/devcom.o lib/pci_vsc.o diag/fs_tracepoint.o \
>-		diag/fw_tracer.o devlink.o
>+		diag/fw_tracer.o diag/crdump.o devlink.o
> 
> #
> # Netdev basic
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
>new file mode 100644
>index 000000000000..6430ceeefb53
>--- /dev/null
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
>@@ -0,0 +1,179 @@
>+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>+/* Copyright (c) 2019 Mellanox Technologies */
>+
>+#include <linux/proc_fs.h>
>+#include <linux/mlx5/driver.h>
>+#include <net/devlink.h>
>+#include "mlx5_core.h"
>+#include "lib/pci_vsc.h"
>+#include "lib/mlx5.h"
>+
>+#define BAD_ACCESS			0xBADACCE5
>+#define MLX5_PROTECTED_CR_SCAN_CRSPACE	0x7
>+#define MAX_NUM_OF_DUMPS_TO_STORE	(8)

Please make your local defines prefixed with "mlx5_". For example
"BAD_ACCESS" sounds way too generic.


>+
>+static const char *region_cr_space_str = "cr-space";
>+
>+struct mlx5_fw_crdump {
>+	u32			size;
>+	struct devlink_region	*region_crspace;
>+};
>+
>+static bool mlx5_crdump_enbaled(struct mlx5_core_dev *dev)

Typo: s/enbaled/enabled/


>+{
>+	struct mlx5_priv *priv = &dev->priv;
>+
>+	return (!!priv->health.crdump);

Poitlesss brackets. Please remove.


>+}
>+
>+static int mlx5_crdump_fill(struct mlx5_core_dev *dev,
>+			    char *crdump_region, u32 *snapshot_id)
>+{
>+	struct devlink *devlink = priv_to_devlink(dev);
>+	struct mlx5_priv *priv = &dev->priv;
>+	struct mlx5_fw_crdump *crdump = priv->health.crdump;
>+	int i, ret = 0;
>+	u32 *cr_data;
>+	u32 id;
>+
>+	cr_data = kvmalloc(crdump->size, GFP_KERNEL);
>+	if (!cr_data)
>+		return -ENOMEM;
>+
>+	for (i = 0; i < (crdump->size / 4); i++)
>+		cr_data[i] = BAD_ACCESS;
>+
>+	ret = mlx5_vsc_gw_read_block_fast(dev, cr_data, crdump->size);
>+	if (ret <= 0) {
>+		if (ret == 0)
>+			ret = -EIO;
>+		goto free_data;
>+	}
>+
>+	if (crdump->size != ret) {
>+		mlx5_core_warn(dev, "failed to read full dump, read %d out of %u\n",
>+			       ret, crdump->size);
>+		ret = -EINVAL;
>+		goto free_data;
>+	}
>+
>+	/* Get the available snapshot ID for the dumps */
>+	id = devlink_region_shapshot_id_get(devlink);
>+	ret = devlink_region_snapshot_create(crdump->region_crspace,
>+					     crdump->size, (u8 *)cr_data,
>+					     id, &kvfree);
>+	if (ret) {
>+		mlx5_core_warn(dev, "crdump: devlink create %s snapshot id %d err %d\n",
>+			       region_cr_space_str, id, ret);
>+		goto free_data;
>+	} else {
>+		*snapshot_id = id;
>+		strcpy(crdump_region, region_cr_space_str);
>+	}
>+	return 0;
>+
>+free_data:
>+	kvfree(cr_data);
>+	return ret;
>+}
>+
>+int mlx5_crdump_collect(struct mlx5_core_dev *dev,
>+			char *crdump_region, u32 *snapshot_id)
>+{
>+	int ret = 0;
>+
>+	if (!mlx5_crdump_enbaled(dev))
>+		return -ENODEV;
>+
>+	ret = mlx5_vsc_gw_lock(dev);
>+	if (ret) {
>+		mlx5_core_warn(dev, "crdump: failed to lock vsc gw err %d\n",
>+			       ret);
>+		return ret;
>+	}
>+
>+	ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE, NULL);
>+	if (ret)
>+		goto unlock;
>+
>+	ret = mlx5_crdump_fill(dev, crdump_region, snapshot_id);
>+
>+unlock:
>+	mlx5_vsc_gw_unlock(dev);
>+	return ret;
>+}
>+
>+int mlx5_crdump_init(struct mlx5_core_dev *dev)
>+{
>+	struct devlink *devlink = priv_to_devlink(dev);
>+	struct mlx5_priv *priv = &dev->priv;
>+	struct mlx5_fw_crdump *crdump;
>+	u32 space_size;
>+	int ret;
>+
>+	if (!mlx5_core_is_pf(dev) || !mlx5_vsc_accessible(dev) ||
>+	    mlx5_crdump_enbaled(dev))
>+		return 0;
>+
>+	ret = mlx5_vsc_gw_lock(dev);
>+	if (ret)
>+		return ret;
>+
>+	/* Check if space is supported and get space size */
>+	ret = mlx5_vsc_gw_set_space(dev, MLX5_VSC_SPACE_SCAN_CRSPACE,
>+				    &space_size);
>+	if (ret) {
>+		/* Unlock and mask error since space is not supported */
>+		mlx5_vsc_gw_unlock(dev);
>+		return 0;
>+	}
>+
>+	if (!space_size) {
>+		mlx5_core_warn(dev, "Invalid Crspace size, zero\n");
>+		mlx5_vsc_gw_unlock(dev);
>+		return -EINVAL;
>+	}
>+
>+	ret = mlx5_vsc_gw_unlock(dev);
>+	if (ret)
>+		return ret;
>+
>+	crdump = kzalloc(sizeof(*crdump), GFP_KERNEL);
>+	if (!crdump)
>+		return -ENOMEM;
>+
>+	/* Create cr-space region */
>+	crdump->size = space_size;
>+	crdump->region_crspace =
>+		devlink_region_create(devlink,
>+				      region_cr_space_str,
>+				      MAX_NUM_OF_DUMPS_TO_STORE,
>+				      space_size);

Unnecessary wraps.


>+	if (IS_ERR(crdump->region_crspace)) {
>+		mlx5_core_warn(dev,

Unnecessary wrap.


>+			       "crdump: create devlink region %s err %ld\n",
>+			       region_cr_space_str,
>+			       PTR_ERR(crdump->region_crspace));
>+		ret = PTR_ERR(crdump->region_crspace);
>+		goto free_crdump;
>+	}
>+	priv->health.crdump = crdump;
>+	return 0;
>+
>+free_crdump:
>+	kfree(crdump);
>+	return ret;
>+}
>+
>+void mlx5_crdump_cleanup(struct mlx5_core_dev *dev)
>+{
>+	struct mlx5_priv *priv = &dev->priv;
>+	struct mlx5_fw_crdump *crdump = priv->health.crdump;
>+
>+	if (!crdump)
>+		return;
>+
>+	devlink_region_destroy(crdump->region_crspace);
>+	kfree(crdump);
>+	priv->health.crdump = NULL;

Why do you need to have this NULL. Where do you check for NULL?


>+}
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>index a2656f4008d9..90f3da6da7f9 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>@@ -388,6 +388,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
> 	spin_lock_init(&health->wq_lock);
> 	INIT_WORK(&health->work, health_care);
> 	INIT_DELAYED_WORK(&health->recover_work, health_recover);
>+	health->crdump = NULL;

Same here.


> 
> 	return 0;
> }
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
>index 397a2847867a..3c9a6dedccaa 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h
>@@ -41,6 +41,10 @@ int  mlx5_core_reserve_gids(struct mlx5_core_dev *dev, unsigned int count);
> void mlx5_core_unreserve_gids(struct mlx5_core_dev *dev, unsigned int count);
> int  mlx5_core_reserved_gid_alloc(struct mlx5_core_dev *dev, int *gid_index);
> void mlx5_core_reserved_gid_free(struct mlx5_core_dev *dev, int gid_index);
>+int mlx5_crdump_init(struct mlx5_core_dev *dev);
>+void mlx5_crdump_cleanup(struct mlx5_core_dev *dev);
>+int mlx5_crdump_collect(struct mlx5_core_dev *dev,
>+			char *crdump_region, u32 *snapshot_id);
> 
> /* TODO move to lib/events.h */
> 
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>index 64eb2a558b30..43f5487de4c3 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>@@ -1320,6 +1320,10 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id)
> 	if (err)
> 		goto clean_load;
> 
>+	err = mlx5_crdump_init(dev);
>+	if (err)
>+		dev_err(&pdev->dev, "mlx5_crdump_init failed with error code %d\n", err);
>+
> 	pci_save_state(pdev);
> 	return 0;
> 
>@@ -1341,6 +1345,7 @@ static void remove_one(struct pci_dev *pdev)
> 	struct mlx5_core_dev *dev  = pci_get_drvdata(pdev);
> 	struct devlink *devlink = priv_to_devlink(dev);
> 
>+	mlx5_crdump_cleanup(dev);
> 	mlx5_devlink_unregister(devlink);
> 	mlx5_unregister_device(dev);
> 
>diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
>index 56d0a116f575..ddf6f41a75d3 100644
>--- a/include/linux/mlx5/driver.h
>+++ b/include/linux/mlx5/driver.h
>@@ -53,6 +53,7 @@
> #include <linux/mlx5/eq.h>
> #include <linux/timecounter.h>
> #include <linux/ptp_clock_kernel.h>
>+#include <net/devlink.h>
> 
> enum {
> 	MLX5_BOARD_ID_LEN = 64,
>@@ -427,6 +428,8 @@ struct mlx5_sq_bfreg {
> 	unsigned int		offset;
> };
> 
>+struct mlx5_fw_crdump;
>+
> struct mlx5_core_health {
> 	struct health_buffer __iomem   *health;
> 	__be32 __iomem		       *health_counter;
>@@ -440,6 +443,7 @@ struct mlx5_core_health {
> 	unsigned long			flags;
> 	struct work_struct		work;
> 	struct delayed_work		recover_work;
>+	struct mlx5_fw_crdump	       *crdump;
> };
> 
> struct mlx5_qp_table {
>-- 
>2.20.1
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 07/15] net/mlx5: Issue SW reset on FW assert
  2019-05-05  0:33 ` [net-next 07/15] net/mlx5: Issue SW reset on FW assert Saeed Mahameed
@ 2019-05-05 15:38   ` Jiri Pirko
  2019-05-06 10:44     ` Moshe Shemesh
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2019-05-05 15:38 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Feras Daoud, Alex Vesker,
	Moshe Shemesh, Daniel Jurgens

Sun, May 05, 2019 at 02:33:18AM CEST, saeedm@mellanox.com wrote:
>From: Feras Daoud <ferasda@mellanox.com>
>
>If a FW assert is considered fatal, indicated by a new bit in the health
>buffer, reset the FW. After the reset go through the normal recovery
>flow. Only one PF needs to issue the reset, so an attempt is made to
>prevent the 2nd function from also issuing the reset.
>It's not an error if that happens, it just slows recovery.
>
>Signed-off-by: Feras Daoud <ferasda@mellanox.com>
>Signed-off-by: Alex Vesker <valex@mellanox.com>
>Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
>Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>---
> .../ethernet/mellanox/mlx5/core/diag/crdump.c |  13 +-
> .../net/ethernet/mellanox/mlx5/core/health.c  | 157 +++++++++++++++++-
> .../net/ethernet/mellanox/mlx5/core/main.c    |   1 +
> .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
> include/linux/mlx5/device.h                   |  10 +-
> include/linux/mlx5/driver.h                   |   1 +
> 6 files changed, 176 insertions(+), 8 deletions(-)
>

[...]


>+void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
>+{
>+	unsigned long end, delay_ms = MLX5_FW_RESET_WAIT_MS;
>+	int lock = -EBUSY;
>+
>+	mutex_lock(&dev->intf_state_mutex);
>+	if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
>+		goto unlock;
>+
>+	mlx5_core_err(dev, "start\n");

Leftover?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-05  0:33 ` [net-next 09/15] net/mlx5: Create FW devlink health reporter Saeed Mahameed
@ 2019-05-05 15:42   ` Jiri Pirko
  2019-05-06 10:45     ` Moshe Shemesh
  2019-05-07  0:11   ` Jakub Kicinski
  1 sibling, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2019-05-05 15:42 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha

Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
>From: Moshe Shemesh <moshe@mellanox.com>
>
>Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
>implements devlink_health_reporter diagnose callback.
>
>The fw reporter diagnose command can be triggered any time by the user
>to check current fw status.
>In healthy status, it will return clear syndrome. Otherwise it will dump
>the health info buffer.
>
>Command example and output on healthy status:
>$ devlink health diagnose pci/0000:82:00.0 reporter fw
>Syndrome: 0
>
>Command example and output on non healthy status:
>$ devlink health diagnose pci/0000:82:00.0 reporter fw
>diagnose data:
>assert_var[0] 0xfc3fc043
>assert_var[1] 0x0001b41c
>assert_var[2] 0x00000000
>assert_var[3] 0x00000000
>assert_var[4] 0x00000000
>assert_exit_ptr 0x008033b4
>assert_callra 0x0080365c
>fw_ver 16.24.1000
>hw_id 0x0000020d
>irisc_index 0
>synd 0x8: unrecoverable hardware error
>ext_synd 0x003d
>raw fw_ver 0x101803e8
>
>Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

	
[...]	
	
	
>+static int
>+mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
>+			  struct devlink_fmsg *fmsg)
>+{
>+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
>+	struct mlx5_core_health *health = &dev->priv.health;
>+	u8 synd;
>+	int err;
>+
>+	mutex_lock(&health->info_buf_lock);
>+	mlx5_get_health_info(dev, &synd);
>+
>+	if (!synd) {
>+		mutex_unlock(&health->info_buf_lock);
>+		return devlink_fmsg_u8_pair_put(fmsg, "Syndrome", synd);
>+	}
>+
>+	err = devlink_fmsg_string_pair_put(fmsg, "diagnose data",
>+					   health->info_buf);

No! This is wrong! You are sneaking in text blob. Please put the info in
structured form using proper fmsg helpers.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 08/15] net/mlx5: Refactor print health info
  2019-05-05  0:33 ` [net-next 08/15] net/mlx5: Refactor print health info Saeed Mahameed
@ 2019-05-05 15:42   ` Jiri Pirko
  0 siblings, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2019-05-05 15:42 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha

Sun, May 05, 2019 at 02:33:21AM CEST, saeedm@mellanox.com wrote:
>From: Moshe Shemesh <moshe@mellanox.com>
>
>Refactor print health info code, split to two functions:
> 1. mlx5_get_health_info() - writes the health info into a buffer.
> 2. mlx5_print_health_info() - prints the health info to kernel log.
>This refactoring is done to enable using the health info data by devlink
>health reporter diagnose() in the downstream patch.

Please avoid this. Leave the print out as it is and format fmsg
properly.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 11/15] net/mlx5: Add support for FW reporter dump
  2019-05-05  0:33 ` [net-next 11/15] net/mlx5: Add support for FW reporter dump Saeed Mahameed
@ 2019-05-05 15:49   ` Jiri Pirko
  2019-05-06 10:51     ` Moshe Shemesh
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2019-05-05 15:49 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha

Sun, May 05, 2019 at 02:33:27AM CEST, saeedm@mellanox.com wrote:
>From: Moshe Shemesh <moshe@mellanox.com>
>
>Add support of dump callback for mlx5 FW reporter.
>Once we trigger FW dump, the FW will write the core dump to its raw data
>buffer. The tracer translates the raw data to traces and save it to a
>buffer. Once dump is done, the saved traces data is filled as objects
>into the dump buffer.
>

[...]

>+static void mlx5_fw_tracer_save_trace(struct mlx5_fw_tracer *tracer,
>+				      u64 timestamp, bool lost,
>+				      u8 event_id, char *msg)
>+{
>+	char *saved_traces = tracer->sbuff.traces_buff;
>+	u32 offset;
>+
>+	mutex_lock(&tracer->sbuff.lock);
>+	offset = tracer->sbuff.saved_traces_index * TRACE_STR_LINE;
>+	snprintf(saved_traces + offset, TRACE_STR_LINE,
>+		 "%s [0x%llx] %d [0x%x] %s", dev_name(&tracer->dev->pdev->dev),
>+		 timestamp, lost, event_id, msg);

Please format this using fmsg helpers instead.


>+
>+	tracer->sbuff.saved_traces_index =
>+		(tracer->sbuff.saved_traces_index + 1) & (SAVED_TRACES_NUM - 1);
>+	mutex_unlock(&tracer->sbuff.lock);
>+}

[...]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump
  2019-05-05  0:33 ` [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump Saeed Mahameed
@ 2019-05-05 15:52   ` Jiri Pirko
  2019-05-06 10:54     ` Moshe Shemesh
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2019-05-05 15:52 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: David S. Miller, netdev, Jiri Pirko, Moshe Shemesh

Sun, May 05, 2019 at 02:33:33AM CEST, saeedm@mellanox.com wrote:
>From: Moshe Shemesh <moshe@mellanox.com>
>
>Add support of dump callback for mlx5 FW fatal reporter.
>The FW fatal dump use cr-dump functionality to gather cr-space data for
>debug. The cr-dump uses vsc interface which is valid even if the FW
>command interface is not functional, which is the case in most FW fatal
>errors.
>The cr-dump is stored as a memory region snapshot to ease read by
>address.
>
>Command example and output:
>$ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
>devlink_region_name: cr-space snapshot_id: 1
>
>$ devlink region read pci/0000:82:00.0/cr-space snapshot 1 address 983064 length 8
>00000000000f0018 e1 03 00 00 fb ae a9 3f
>
>Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>---
> .../net/ethernet/mellanox/mlx5/core/health.c  | 39 +++++++++++++++++++
> 1 file changed, 39 insertions(+)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>index e64f0e32cd67..5271c88ef64c 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>@@ -547,9 +547,48 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
> 	return mlx5_health_care(dev);
> }
> 
>+static int
>+mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
>+			    struct devlink_fmsg *fmsg, void *priv_ctx)
>+{
>+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
>+	char crdump_region[20];
>+	u32 snapshot_id;
>+	int err;
>+
>+	if (!mlx5_core_is_pf(dev)) {
>+		mlx5_core_err(dev, "Only PF is permitted run FW fatal dump\n");
>+		return -EPERM;
>+	}
>+
>+	err = mlx5_crdump_collect(dev, crdump_region, &snapshot_id);
>+	if (err)
>+		return err;
>+
>+	if (priv_ctx) {
>+		struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
>+
>+		err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
>+		if (err)
>+			return err;
>+	}
>+
>+	err = devlink_fmsg_string_pair_put(fmsg, "devlink_region_name",
>+					   crdump_region);

Oh come on. You cannot be serious :/ Please do proper linkage to region
and snapshot in devlink core.



>+	if (err)
>+		return err;
>+
>+	err = devlink_fmsg_u32_pair_put(fmsg, "snapshot_id", snapshot_id);
>+	if (err)
>+		return err;
>+
>+	return 0;
>+}
>+
> static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = {
> 		.name = "fw_fatal",
> 		.recover = mlx5_fw_fatal_reporter_recover,
>+		.dump = mlx5_fw_fatal_reporter_dump,
> };
> 
> #define MLX5_REPORTER_FW_GRACEFUL_PERIOD 1200000
>-- 
>2.20.1
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 07/15] net/mlx5: Issue SW reset on FW assert
  2019-05-05 15:38   ` Jiri Pirko
@ 2019-05-06 10:44     ` Moshe Shemesh
  0 siblings, 0 replies; 34+ messages in thread
From: Moshe Shemesh @ 2019-05-06 10:44 UTC (permalink / raw)
  To: Jiri Pirko, Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Feras Daoud, Alex Vesker,
	Daniel Jurgens



On 5/5/2019 6:38 PM, Jiri Pirko wrote:
> Sun, May 05, 2019 at 02:33:18AM CEST, saeedm@mellanox.com wrote:
>> From: Feras Daoud <ferasda@mellanox.com>
>>
>> If a FW assert is considered fatal, indicated by a new bit in the health
>> buffer, reset the FW. After the reset go through the normal recovery
>> flow. Only one PF needs to issue the reset, so an attempt is made to
>> prevent the 2nd function from also issuing the reset.
>> It's not an error if that happens, it just slows recovery.
>>
>> Signed-off-by: Feras Daoud <ferasda@mellanox.com>
>> Signed-off-by: Alex Vesker <valex@mellanox.com>
>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>> Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> ---
>> .../ethernet/mellanox/mlx5/core/diag/crdump.c |  13 +-
>> .../net/ethernet/mellanox/mlx5/core/health.c  | 157 +++++++++++++++++-
>> .../net/ethernet/mellanox/mlx5/core/main.c    |   1 +
>> .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
>> include/linux/mlx5/device.h                   |  10 +-
>> include/linux/mlx5/driver.h                   |   1 +
>> 6 files changed, 176 insertions(+), 8 deletions(-)
>>
> 
> [...]
> 
> 
>> +void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
>> +{
>> +	unsigned long end, delay_ms = MLX5_FW_RESET_WAIT_MS;
>> +	int lock = -EBUSY;
>> +
>> +	mutex_lock(&dev->intf_state_mutex);
>> +	if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
>> +		goto unlock;
>> +
>> +	mlx5_core_err(dev, "start\n");
> 
> Leftover?
> 
Not leftover, it was just moved from one point to another.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-05 15:42   ` Jiri Pirko
@ 2019-05-06 10:45     ` Moshe Shemesh
  2019-05-06 11:38       ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Moshe Shemesh @ 2019-05-06 10:45 UTC (permalink / raw)
  To: Jiri Pirko, Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Eran Ben Elisha



On 5/5/2019 6:42 PM, Jiri Pirko wrote:
> Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
>> From: Moshe Shemesh <moshe@mellanox.com>
>>
>> Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
>> implements devlink_health_reporter diagnose callback.
>>
>> The fw reporter diagnose command can be triggered any time by the user
>> to check current fw status.
>> In healthy status, it will return clear syndrome. Otherwise it will dump
>> the health info buffer.
>>
>> Command example and output on healthy status:
>> $ devlink health diagnose pci/0000:82:00.0 reporter fw
>> Syndrome: 0
>>
>> Command example and output on non healthy status:
>> $ devlink health diagnose pci/0000:82:00.0 reporter fw
>> diagnose data:
>> assert_var[0] 0xfc3fc043
>> assert_var[1] 0x0001b41c
>> assert_var[2] 0x00000000
>> assert_var[3] 0x00000000
>> assert_var[4] 0x00000000
>> assert_exit_ptr 0x008033b4
>> assert_callra 0x0080365c
>> fw_ver 16.24.1000
>> hw_id 0x0000020d
>> irisc_index 0
>> synd 0x8: unrecoverable hardware error
>> ext_synd 0x003d
>> raw fw_ver 0x101803e8
>>
>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> 
> 	
> [...]	
> 	
> 	
>> +static int
>> +mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
>> +			  struct devlink_fmsg *fmsg)
>> +{
>> +	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
>> +	struct mlx5_core_health *health = &dev->priv.health;
>> +	u8 synd;
>> +	int err;
>> +
>> +	mutex_lock(&health->info_buf_lock);
>> +	mlx5_get_health_info(dev, &synd);
>> +
>> +	if (!synd) {
>> +		mutex_unlock(&health->info_buf_lock);
>> +		return devlink_fmsg_u8_pair_put(fmsg, "Syndrome", synd);
>> +	}
>> +
>> +	err = devlink_fmsg_string_pair_put(fmsg, "diagnose data",
>> +					   health->info_buf);
> 
> No! This is wrong! You are sneaking in text blob. Please put the info in
> structured form using proper fmsg helpers.
> 
This is the fw output format, it is already in use, I don't want to 
change it, just have it here in the diagnose output too.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 11/15] net/mlx5: Add support for FW reporter dump
  2019-05-05 15:49   ` Jiri Pirko
@ 2019-05-06 10:51     ` Moshe Shemesh
  2019-05-06 11:37       ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Moshe Shemesh @ 2019-05-06 10:51 UTC (permalink / raw)
  To: Jiri Pirko, Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Eran Ben Elisha



On 5/5/2019 6:49 PM, Jiri Pirko wrote:
> Sun, May 05, 2019 at 02:33:27AM CEST, saeedm@mellanox.com wrote:
>> From: Moshe Shemesh <moshe@mellanox.com>
>>
>> Add support of dump callback for mlx5 FW reporter.
>> Once we trigger FW dump, the FW will write the core dump to its raw data
>> buffer. The tracer translates the raw data to traces and save it to a
>> buffer. Once dump is done, the saved traces data is filled as objects
>> into the dump buffer.
>>
> 
> [...]
> 
>> +static void mlx5_fw_tracer_save_trace(struct mlx5_fw_tracer *tracer,
>> +				      u64 timestamp, bool lost,
>> +				      u8 event_id, char *msg)
>> +{
>> +	char *saved_traces = tracer->sbuff.traces_buff;
>> +	u32 offset;
>> +
>> +	mutex_lock(&tracer->sbuff.lock);
>> +	offset = tracer->sbuff.saved_traces_index * TRACE_STR_LINE;
>> +	snprintf(saved_traces + offset, TRACE_STR_LINE,
>> +		 "%s [0x%llx] %d [0x%x] %s", dev_name(&tracer->dev->pdev->dev),
>> +		 timestamp, lost, event_id, msg);
> 
> Please format this using fmsg helpers instead.
> 

Same here, I want to keep the format as is, not change it.
> 
>> +
>> +	tracer->sbuff.saved_traces_index =
>> +		(tracer->sbuff.saved_traces_index + 1) & (SAVED_TRACES_NUM - 1);
>> +	mutex_unlock(&tracer->sbuff.lock);
>> +}
> 
> [...]
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump
  2019-05-05 15:52   ` Jiri Pirko
@ 2019-05-06 10:54     ` Moshe Shemesh
  2019-05-06 11:42       ` Jiri Pirko
  0 siblings, 1 reply; 34+ messages in thread
From: Moshe Shemesh @ 2019-05-06 10:54 UTC (permalink / raw)
  To: Jiri Pirko, Saeed Mahameed; +Cc: David S. Miller, netdev, Jiri Pirko



On 5/5/2019 6:52 PM, Jiri Pirko wrote:
> Sun, May 05, 2019 at 02:33:33AM CEST, saeedm@mellanox.com wrote:
>> From: Moshe Shemesh <moshe@mellanox.com>
>>
>> Add support of dump callback for mlx5 FW fatal reporter.
>> The FW fatal dump use cr-dump functionality to gather cr-space data for
>> debug. The cr-dump uses vsc interface which is valid even if the FW
>> command interface is not functional, which is the case in most FW fatal
>> errors.
>> The cr-dump is stored as a memory region snapshot to ease read by
>> address.
>>
>> Command example and output:
>> $ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
>> devlink_region_name: cr-space snapshot_id: 1
>>
>> $ devlink region read pci/0000:82:00.0/cr-space snapshot 1 address 983064 length 8
>> 00000000000f0018 e1 03 00 00 fb ae a9 3f
>>
>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> ---
>> .../net/ethernet/mellanox/mlx5/core/health.c  | 39 +++++++++++++++++++
>> 1 file changed, 39 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>> index e64f0e32cd67..5271c88ef64c 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>> @@ -547,9 +547,48 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
>> 	return mlx5_health_care(dev);
>> }
>>
>> +static int
>> +mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
>> +			    struct devlink_fmsg *fmsg, void *priv_ctx)
>> +{
>> +	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
>> +	char crdump_region[20];
>> +	u32 snapshot_id;
>> +	int err;
>> +
>> +	if (!mlx5_core_is_pf(dev)) {
>> +		mlx5_core_err(dev, "Only PF is permitted run FW fatal dump\n");
>> +		return -EPERM;
>> +	}
>> +
>> +	err = mlx5_crdump_collect(dev, crdump_region, &snapshot_id);
>> +	if (err)
>> +		return err;
>> +
>> +	if (priv_ctx) {
>> +		struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
>> +
>> +		err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
>> +		if (err)
>> +			return err;
>> +	}
>> +
>> +	err = devlink_fmsg_string_pair_put(fmsg, "devlink_region_name",
>> +					   crdump_region);
> 
> Oh come on. You cannot be serious :/ Please do proper linkage to region
> and snapshot in devlink core.
> 

Not sure I understand what you mean, as I wrote in the commit message, 
the region snapshot added value here, is that user can read data by offset.

> 
> 
>> +	if (err)
>> +		return err;
>> +
>> +	err = devlink_fmsg_u32_pair_put(fmsg, "snapshot_id", snapshot_id);
>> +	if (err)
>> +		return err;
>> +
>> +	return 0;
>> +}
>> +
>> static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = {
>> 		.name = "fw_fatal",
>> 		.recover = mlx5_fw_fatal_reporter_recover,
>> +		.dump = mlx5_fw_fatal_reporter_dump,
>> };
>>
>> #define MLX5_REPORTER_FW_GRACEFUL_PERIOD 1200000
>> -- 
>> 2.20.1
>>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 11/15] net/mlx5: Add support for FW reporter dump
  2019-05-06 10:51     ` Moshe Shemesh
@ 2019-05-06 11:37       ` Jiri Pirko
  0 siblings, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2019-05-06 11:37 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Saeed Mahameed, David S. Miller, netdev, Jiri Pirko, Eran Ben Elisha

Mon, May 06, 2019 at 12:51:24PM CEST, moshe@mellanox.com wrote:
>
>
>On 5/5/2019 6:49 PM, Jiri Pirko wrote:
>> Sun, May 05, 2019 at 02:33:27AM CEST, saeedm@mellanox.com wrote:
>>> From: Moshe Shemesh <moshe@mellanox.com>
>>>
>>> Add support of dump callback for mlx5 FW reporter.
>>> Once we trigger FW dump, the FW will write the core dump to its raw data
>>> buffer. The tracer translates the raw data to traces and save it to a
>>> buffer. Once dump is done, the saved traces data is filled as objects
>>> into the dump buffer.
>>>
>> 
>> [...]
>> 
>>> +static void mlx5_fw_tracer_save_trace(struct mlx5_fw_tracer *tracer,
>>> +				      u64 timestamp, bool lost,
>>> +				      u8 event_id, char *msg)
>>> +{
>>> +	char *saved_traces = tracer->sbuff.traces_buff;
>>> +	u32 offset;
>>> +
>>> +	mutex_lock(&tracer->sbuff.lock);
>>> +	offset = tracer->sbuff.saved_traces_index * TRACE_STR_LINE;
>>> +	snprintf(saved_traces + offset, TRACE_STR_LINE,
>>> +		 "%s [0x%llx] %d [0x%x] %s", dev_name(&tracer->dev->pdev->dev),
>>> +		 timestamp, lost, event_id, msg);
>> 
>> Please format this using fmsg helpers instead.
>> 
>
>Same here, I want to keep the format as is, not change it.

"as is" - where exactly?


>> 
>>> +
>>> +	tracer->sbuff.saved_traces_index =
>>> +		(tracer->sbuff.saved_traces_index + 1) & (SAVED_TRACES_NUM - 1);
>>> +	mutex_unlock(&tracer->sbuff.lock);
>>> +}
>> 
>> [...]
>> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-06 10:45     ` Moshe Shemesh
@ 2019-05-06 11:38       ` Jiri Pirko
  2019-05-06 19:52         ` Saeed Mahameed
  0 siblings, 1 reply; 34+ messages in thread
From: Jiri Pirko @ 2019-05-06 11:38 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Saeed Mahameed, David S. Miller, netdev, Jiri Pirko, Eran Ben Elisha

Mon, May 06, 2019 at 12:45:44PM CEST, moshe@mellanox.com wrote:
>
>
>On 5/5/2019 6:42 PM, Jiri Pirko wrote:
>> Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
>>> From: Moshe Shemesh <moshe@mellanox.com>
>>>
>>> Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
>>> implements devlink_health_reporter diagnose callback.
>>>
>>> The fw reporter diagnose command can be triggered any time by the user
>>> to check current fw status.
>>> In healthy status, it will return clear syndrome. Otherwise it will dump
>>> the health info buffer.
>>>
>>> Command example and output on healthy status:
>>> $ devlink health diagnose pci/0000:82:00.0 reporter fw
>>> Syndrome: 0
>>>
>>> Command example and output on non healthy status:
>>> $ devlink health diagnose pci/0000:82:00.0 reporter fw
>>> diagnose data:
>>> assert_var[0] 0xfc3fc043
>>> assert_var[1] 0x0001b41c
>>> assert_var[2] 0x00000000
>>> assert_var[3] 0x00000000
>>> assert_var[4] 0x00000000
>>> assert_exit_ptr 0x008033b4
>>> assert_callra 0x0080365c
>>> fw_ver 16.24.1000
>>> hw_id 0x0000020d
>>> irisc_index 0
>>> synd 0x8: unrecoverable hardware error
>>> ext_synd 0x003d
>>> raw fw_ver 0x101803e8
>>>
>>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>>> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> 
>> 	
>> [...]	
>> 	
>> 	
>>> +static int
>>> +mlx5_fw_reporter_diagnose(struct devlink_health_reporter *reporter,
>>> +			  struct devlink_fmsg *fmsg)
>>> +{
>>> +	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
>>> +	struct mlx5_core_health *health = &dev->priv.health;
>>> +	u8 synd;
>>> +	int err;
>>> +
>>> +	mutex_lock(&health->info_buf_lock);
>>> +	mlx5_get_health_info(dev, &synd);
>>> +
>>> +	if (!synd) {
>>> +		mutex_unlock(&health->info_buf_lock);
>>> +		return devlink_fmsg_u8_pair_put(fmsg, "Syndrome", synd);
>>> +	}
>>> +
>>> +	err = devlink_fmsg_string_pair_put(fmsg, "diagnose data",
>>> +					   health->info_buf);
>> 
>> No! This is wrong! You are sneaking in text blob. Please put the info in
>> structured form using proper fmsg helpers.
>> 
>This is the fw output format, it is already in use, I don't want to 
>change it, just have it here in the diagnose output too.

Already in use where? in dmesg? Sorry, but that is not an argument.
Please format the message properly.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump
  2019-05-06 10:54     ` Moshe Shemesh
@ 2019-05-06 11:42       ` Jiri Pirko
  0 siblings, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2019-05-06 11:42 UTC (permalink / raw)
  To: Moshe Shemesh; +Cc: Saeed Mahameed, David S. Miller, netdev, Jiri Pirko

Mon, May 06, 2019 at 12:54:00PM CEST, moshe@mellanox.com wrote:
>
>
>On 5/5/2019 6:52 PM, Jiri Pirko wrote:
>> Sun, May 05, 2019 at 02:33:33AM CEST, saeedm@mellanox.com wrote:
>>> From: Moshe Shemesh <moshe@mellanox.com>
>>>
>>> Add support of dump callback for mlx5 FW fatal reporter.
>>> The FW fatal dump use cr-dump functionality to gather cr-space data for
>>> debug. The cr-dump uses vsc interface which is valid even if the FW
>>> command interface is not functional, which is the case in most FW fatal
>>> errors.
>>> The cr-dump is stored as a memory region snapshot to ease read by
>>> address.
>>>
>>> Command example and output:
>>> $ devlink health dump show pci/0000:82:00.0 reporter fw_fatal
>>> devlink_region_name: cr-space snapshot_id: 1
>>>
>>> $ devlink region read pci/0000:82:00.0/cr-space snapshot 1 address 983064 length 8
>>> 00000000000f0018 e1 03 00 00 fb ae a9 3f
>>>
>>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>>> ---
>>> .../net/ethernet/mellanox/mlx5/core/health.c  | 39 +++++++++++++++++++
>>> 1 file changed, 39 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>>> index e64f0e32cd67..5271c88ef64c 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
>>> @@ -547,9 +547,48 @@ mlx5_fw_fatal_reporter_recover(struct devlink_health_reporter *reporter,
>>> 	return mlx5_health_care(dev);
>>> }
>>>
>>> +static int
>>> +mlx5_fw_fatal_reporter_dump(struct devlink_health_reporter *reporter,
>>> +			    struct devlink_fmsg *fmsg, void *priv_ctx)
>>> +{
>>> +	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
>>> +	char crdump_region[20];
>>> +	u32 snapshot_id;
>>> +	int err;
>>> +
>>> +	if (!mlx5_core_is_pf(dev)) {
>>> +		mlx5_core_err(dev, "Only PF is permitted run FW fatal dump\n");
>>> +		return -EPERM;
>>> +	}
>>> +
>>> +	err = mlx5_crdump_collect(dev, crdump_region, &snapshot_id);
>>> +	if (err)
>>> +		return err;
>>> +
>>> +	if (priv_ctx) {
>>> +		struct mlx5_fw_reporter_ctx *fw_reporter_ctx = priv_ctx;
>>> +
>>> +		err = mlx5_fw_reporter_ctx_pairs_put(fmsg, fw_reporter_ctx);
>>> +		if (err)
>>> +			return err;
>>> +	}
>>> +
>>> +	err = devlink_fmsg_string_pair_put(fmsg, "devlink_region_name",
>>> +					   crdump_region);
>> 
>> Oh come on. You cannot be serious :/ Please do proper linkage to region
>> and snapshot in devlink core.
>> 
>
>Not sure I understand what you mean, as I wrote in the commit message, 
>the region snapshot added value here, is that user can read data by offset.

If there is a region/snapshot affiliated with a reporter, it should be
linked together in devlink core. Not in the driver by exposing
arbitrary strings...


>
>> 
>> 
>>> +	if (err)
>>> +		return err;
>>> +
>>> +	err = devlink_fmsg_u32_pair_put(fmsg, "snapshot_id", snapshot_id);
>>> +	if (err)
>>> +		return err;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> static const struct devlink_health_reporter_ops mlx5_fw_fatal_reporter_ops = {
>>> 		.name = "fw_fatal",
>>> 		.recover = mlx5_fw_fatal_reporter_recover,
>>> +		.dump = mlx5_fw_fatal_reporter_dump,
>>> };
>>>
>>> #define MLX5_REPORTER_FW_GRACEFUL_PERIOD 1200000
>>> -- 
>>> 2.20.1
>>>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-06 11:38       ` Jiri Pirko
@ 2019-05-06 19:52         ` Saeed Mahameed
  2019-05-06 21:46           ` Alexei Starovoitov
  2019-05-07  6:01           ` Jiri Pirko
  0 siblings, 2 replies; 34+ messages in thread
From: Saeed Mahameed @ 2019-05-06 19:52 UTC (permalink / raw)
  To: Moshe Shemesh, jiri; +Cc: Eran Ben Elisha, davem, netdev, Jiri Pirko

On Mon, 2019-05-06 at 13:38 +0200, Jiri Pirko wrote:
> Mon, May 06, 2019 at 12:45:44PM CEST, moshe@mellanox.com wrote:
> > 
> > On 5/5/2019 6:42 PM, Jiri Pirko wrote:
> > > Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
> > > > From: Moshe Shemesh <moshe@mellanox.com>
> > > > 
> > > > Create mlx5_devlink_health_reporter for FW reporter. The FW
> > > > reporter
> > > > implements devlink_health_reporter diagnose callback.
> > > > 
> > > > The fw reporter diagnose command can be triggered any time by
> > > > the user
> > > > to check current fw status.
> > > > In healthy status, it will return clear syndrome. Otherwise it
> > > > will dump
> > > > the health info buffer.
> > > > 
> > > > Command example and output on healthy status:
> > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
> > > > Syndrome: 0
> > > > 
> > > > Command example and output on non healthy status:
> > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
> > > > diagnose data:
> > > > assert_var[0] 0xfc3fc043
> > > > assert_var[1] 0x0001b41c
> > > > assert_var[2] 0x00000000
> > > > assert_var[3] 0x00000000
> > > > assert_var[4] 0x00000000
> > > > assert_exit_ptr 0x008033b4
> > > > assert_callra 0x0080365c
> > > > fw_ver 16.24.1000
> > > > hw_id 0x0000020d
> > > > irisc_index 0
> > > > synd 0x8: unrecoverable hardware error
> > > > ext_synd 0x003d
> > > > raw fw_ver 0x101803e8
> > > > 
> > > > Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> > > > Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
> > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > > 
> > > 	
> > > [...]	
> > > 	
> > > 	
> > > > +static int
> > > > +mlx5_fw_reporter_diagnose(struct devlink_health_reporter
> > > > *reporter,
> > > > +			  struct devlink_fmsg *fmsg)
> > > > +{
> > > > +	struct mlx5_core_dev *dev =
> > > > devlink_health_reporter_priv(reporter);
> > > > +	struct mlx5_core_health *health = &dev->priv.health;
> > > > +	u8 synd;
> > > > +	int err;
> > > > +
> > > > +	mutex_lock(&health->info_buf_lock);
> > > > +	mlx5_get_health_info(dev, &synd);
> > > > +
> > > > +	if (!synd) {
> > > > +		mutex_unlock(&health->info_buf_lock);
> > > > +		return devlink_fmsg_u8_pair_put(fmsg,
> > > > "Syndrome", synd);
> > > > +	}
> > > > +
> > > > +	err = devlink_fmsg_string_pair_put(fmsg, "diagnose
> > > > data",
> > > > +					   health->info_buf);
> > > 
> > > No! This is wrong! You are sneaking in text blob. Please put the
> > > info in
> > > structured form using proper fmsg helpers.
> > > 
> > This is the fw output format, it is already in use, I don't want
> > to 
> > change it, just have it here in the diagnose output too.
> 
> Already in use where? in dmesg? Sorry, but that is not an argument.
> Please format the message properly.
> 

What is wrong here ? 

Unlike binary dump data, I thought diagnose data is allowed to be
developer friendly free text format, if not then let's enforce it in
devlink API. Jiri,  you can't audit each and every use of
devlink_fmsg_string_pair_put, it must be done by design.

Thanks,
Saeed

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-06 19:52         ` Saeed Mahameed
@ 2019-05-06 21:46           ` Alexei Starovoitov
  2019-05-07  5:59             ` Jiri Pirko
  2019-05-07  6:01           ` Jiri Pirko
  1 sibling, 1 reply; 34+ messages in thread
From: Alexei Starovoitov @ 2019-05-06 21:46 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Moshe Shemesh, jiri, Eran Ben Elisha, davem, netdev, Jiri Pirko

On Mon, May 06, 2019 at 07:52:18PM +0000, Saeed Mahameed wrote:
> On Mon, 2019-05-06 at 13:38 +0200, Jiri Pirko wrote:
> > Mon, May 06, 2019 at 12:45:44PM CEST, moshe@mellanox.com wrote:
> > > 
> > > On 5/5/2019 6:42 PM, Jiri Pirko wrote:
> > > > Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
> > > > > From: Moshe Shemesh <moshe@mellanox.com>
> > > > > 
> > > > > Create mlx5_devlink_health_reporter for FW reporter. The FW
> > > > > reporter
> > > > > implements devlink_health_reporter diagnose callback.
> > > > > 
> > > > > The fw reporter diagnose command can be triggered any time by
> > > > > the user
> > > > > to check current fw status.
> > > > > In healthy status, it will return clear syndrome. Otherwise it
> > > > > will dump
> > > > > the health info buffer.
> > > > > 
> > > > > Command example and output on healthy status:
> > > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
> > > > > Syndrome: 0
> > > > > 
> > > > > Command example and output on non healthy status:
> > > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
> > > > > diagnose data:
> > > > > assert_var[0] 0xfc3fc043
> > > > > assert_var[1] 0x0001b41c
> > > > > assert_var[2] 0x00000000
> > > > > assert_var[3] 0x00000000
> > > > > assert_var[4] 0x00000000
> > > > > assert_exit_ptr 0x008033b4
> > > > > assert_callra 0x0080365c
> > > > > fw_ver 16.24.1000
> > > > > hw_id 0x0000020d
> > > > > irisc_index 0
> > > > > synd 0x8: unrecoverable hardware error
> > > > > ext_synd 0x003d
> > > > > raw fw_ver 0x101803e8
> > > > > 
> > > > > Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> > > > > Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
> > > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> > > > 
> > > > 	
> > > > [...]	
> > > > 	
> > > > 	
> > > > > +static int
> > > > > +mlx5_fw_reporter_diagnose(struct devlink_health_reporter
> > > > > *reporter,
> > > > > +			  struct devlink_fmsg *fmsg)
> > > > > +{
> > > > > +	struct mlx5_core_dev *dev =
> > > > > devlink_health_reporter_priv(reporter);
> > > > > +	struct mlx5_core_health *health = &dev->priv.health;
> > > > > +	u8 synd;
> > > > > +	int err;
> > > > > +
> > > > > +	mutex_lock(&health->info_buf_lock);
> > > > > +	mlx5_get_health_info(dev, &synd);
> > > > > +
> > > > > +	if (!synd) {
> > > > > +		mutex_unlock(&health->info_buf_lock);
> > > > > +		return devlink_fmsg_u8_pair_put(fmsg,
> > > > > "Syndrome", synd);
> > > > > +	}
> > > > > +
> > > > > +	err = devlink_fmsg_string_pair_put(fmsg, "diagnose
> > > > > data",
> > > > > +					   health->info_buf);
> > > > 
> > > > No! This is wrong! You are sneaking in text blob. Please put the
> > > > info in
> > > > structured form using proper fmsg helpers.
> > > > 
> > > This is the fw output format, it is already in use, I don't want
> > > to 
> > > change it, just have it here in the diagnose output too.
> > 
> > Already in use where? in dmesg? Sorry, but that is not an argument.
> > Please format the message properly.
> > 
> 
> What is wrong here ? 
> 
> Unlike binary dump data, I thought diagnose data is allowed to be
> developer friendly free text format, if not then let's enforce it in
> devlink API. Jiri,  you can't audit each and every use of
> devlink_fmsg_string_pair_put, it must be done by design.

I agree with the purpose that Jiri is trying to get out of this infra,
but I disagree that strings should be disallowed.

The example messages from commit log are not useful at all:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
diagnose data:
assert_var[0] 0xfc3fc043
assert_var[1] 0x0001b41c
assert_var[2] 0x00000000
assert_var[3] 0x00000000
assert_var[4] 0x00000000
assert_exit_ptr 0x008033b4
assert_callra 0x0080365c
fw_ver 16.24.1000
hw_id 0x0000020d
irisc_index 0
synd 0x8: unrecoverable hardware error
ext_synd 0x003d
raw fw_ver 0x101803e8

Firmware dumps some magic numbers. How this is any better than
what we have today?
Every bit has to have meaningful message.
The point of devlink diag is not to replicate dmesg, but to give
users a clean way to understand and debug the hw.

It seems mlx is adding a new interface and immediately misusing it.
This not an example to give to other vendors.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-05  0:33 ` [net-next 09/15] net/mlx5: Create FW devlink health reporter Saeed Mahameed
  2019-05-05 15:42   ` Jiri Pirko
@ 2019-05-07  0:11   ` Jakub Kicinski
  1 sibling, 0 replies; 34+ messages in thread
From: Jakub Kicinski @ 2019-05-07  0:11 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Jiri Pirko, Moshe Shemesh, Eran Ben Elisha

On Sun, 5 May 2019 00:33:23 +0000, Saeed Mahameed wrote:
> From: Moshe Shemesh <moshe@mellanox.com>
> 
> Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
> implements devlink_health_reporter diagnose callback.
> 
> The fw reporter diagnose command can be triggered any time by the user
> to check current fw status.
> In healthy status, it will return clear syndrome. Otherwise it will dump
> the health info buffer.
> 
> Command example and output on healthy status:
> $ devlink health diagnose pci/0000:82:00.0 reporter fw
> Syndrome: 0
> 
> Command example and output on non healthy status:
> $ devlink health diagnose pci/0000:82:00.0 reporter fw
> diagnose data:
> assert_var[0] 0xfc3fc043
> assert_var[1] 0x0001b41c
> assert_var[2] 0x00000000
> assert_var[3] 0x00000000
> assert_var[4] 0x00000000
> assert_exit_ptr 0x008033b4
> assert_callra 0x0080365c
> fw_ver 16.24.1000

Does FW version really have to be duplicated in this API?

> hw_id 0x0000020d
> irisc_index 0
> synd 0x8: unrecoverable hardware error
> ext_synd 0x003d
> raw fw_ver 0x101803e8

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-06 21:46           ` Alexei Starovoitov
@ 2019-05-07  5:59             ` Jiri Pirko
  0 siblings, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2019-05-07  5:59 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Saeed Mahameed, Moshe Shemesh, Eran Ben Elisha, davem, netdev,
	Jiri Pirko

Mon, May 06, 2019 at 11:46:43PM CEST, alexei.starovoitov@gmail.com wrote:
>On Mon, May 06, 2019 at 07:52:18PM +0000, Saeed Mahameed wrote:
>> On Mon, 2019-05-06 at 13:38 +0200, Jiri Pirko wrote:
>> > Mon, May 06, 2019 at 12:45:44PM CEST, moshe@mellanox.com wrote:
>> > > 
>> > > On 5/5/2019 6:42 PM, Jiri Pirko wrote:
>> > > > Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
>> > > > > From: Moshe Shemesh <moshe@mellanox.com>
>> > > > > 
>> > > > > Create mlx5_devlink_health_reporter for FW reporter. The FW
>> > > > > reporter
>> > > > > implements devlink_health_reporter diagnose callback.
>> > > > > 
>> > > > > The fw reporter diagnose command can be triggered any time by
>> > > > > the user
>> > > > > to check current fw status.
>> > > > > In healthy status, it will return clear syndrome. Otherwise it
>> > > > > will dump
>> > > > > the health info buffer.
>> > > > > 
>> > > > > Command example and output on healthy status:
>> > > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
>> > > > > Syndrome: 0
>> > > > > 
>> > > > > Command example and output on non healthy status:
>> > > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
>> > > > > diagnose data:
>> > > > > assert_var[0] 0xfc3fc043
>> > > > > assert_var[1] 0x0001b41c
>> > > > > assert_var[2] 0x00000000
>> > > > > assert_var[3] 0x00000000
>> > > > > assert_var[4] 0x00000000
>> > > > > assert_exit_ptr 0x008033b4
>> > > > > assert_callra 0x0080365c
>> > > > > fw_ver 16.24.1000
>> > > > > hw_id 0x0000020d
>> > > > > irisc_index 0
>> > > > > synd 0x8: unrecoverable hardware error
>> > > > > ext_synd 0x003d
>> > > > > raw fw_ver 0x101803e8
>> > > > > 
>> > > > > Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>> > > > > Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>> > > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> > > > 
>> > > > 	
>> > > > [...]	
>> > > > 	
>> > > > 	
>> > > > > +static int
>> > > > > +mlx5_fw_reporter_diagnose(struct devlink_health_reporter
>> > > > > *reporter,
>> > > > > +			  struct devlink_fmsg *fmsg)
>> > > > > +{
>> > > > > +	struct mlx5_core_dev *dev =
>> > > > > devlink_health_reporter_priv(reporter);
>> > > > > +	struct mlx5_core_health *health = &dev->priv.health;
>> > > > > +	u8 synd;
>> > > > > +	int err;
>> > > > > +
>> > > > > +	mutex_lock(&health->info_buf_lock);
>> > > > > +	mlx5_get_health_info(dev, &synd);
>> > > > > +
>> > > > > +	if (!synd) {
>> > > > > +		mutex_unlock(&health->info_buf_lock);
>> > > > > +		return devlink_fmsg_u8_pair_put(fmsg,
>> > > > > "Syndrome", synd);
>> > > > > +	}
>> > > > > +
>> > > > > +	err = devlink_fmsg_string_pair_put(fmsg, "diagnose
>> > > > > data",
>> > > > > +					   health->info_buf);
>> > > > 
>> > > > No! This is wrong! You are sneaking in text blob. Please put the
>> > > > info in
>> > > > structured form using proper fmsg helpers.
>> > > > 
>> > > This is the fw output format, it is already in use, I don't want
>> > > to 
>> > > change it, just have it here in the diagnose output too.
>> > 
>> > Already in use where? in dmesg? Sorry, but that is not an argument.
>> > Please format the message properly.
>> > 
>> 
>> What is wrong here ? 
>> 
>> Unlike binary dump data, I thought diagnose data is allowed to be
>> developer friendly free text format, if not then let's enforce it in
>> devlink API. Jiri,  you can't audit each and every use of
>> devlink_fmsg_string_pair_put, it must be done by design.
>
>I agree with the purpose that Jiri is trying to get out of this infra,
>but I disagree that strings should be disallowed.

Not strings in general, I have nothing against them. If they are plain.
But when you push structured data into strings, for example using
sprintf for ints etc, that is a problem I'm pointing at.
Instead of that, the structured data should be formatted using
proper fmsg helpers. We already have them in kernel, it is just about
using them and not taking shortcuts.


>
>The example messages from commit log are not useful at all:
>$ devlink health diagnose pci/0000:82:00.0 reporter fw
>diagnose data:
>assert_var[0] 0xfc3fc043
>assert_var[1] 0x0001b41c
>assert_var[2] 0x00000000
>assert_var[3] 0x00000000
>assert_var[4] 0x00000000
>assert_exit_ptr 0x008033b4
>assert_callra 0x0080365c
>fw_ver 16.24.1000
>hw_id 0x0000020d
>irisc_index 0
>synd 0x8: unrecoverable hardware error
>ext_synd 0x003d
>raw fw_ver 0x101803e8
>
>Firmware dumps some magic numbers. How this is any better than
>what we have today?
>Every bit has to have meaningful message.
>The point of devlink diag is not to replicate dmesg, but to give
>users a clean way to understand and debug the hw.
>
>It seems mlx is adding a new interface and immediately misusing it.
>This not an example to give to other vendors.

I agree.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next 09/15] net/mlx5: Create FW devlink health reporter
  2019-05-06 19:52         ` Saeed Mahameed
  2019-05-06 21:46           ` Alexei Starovoitov
@ 2019-05-07  6:01           ` Jiri Pirko
  1 sibling, 0 replies; 34+ messages in thread
From: Jiri Pirko @ 2019-05-07  6:01 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: Moshe Shemesh, Eran Ben Elisha, davem, netdev, Jiri Pirko

Mon, May 06, 2019 at 09:52:18PM CEST, saeedm@mellanox.com wrote:
>On Mon, 2019-05-06 at 13:38 +0200, Jiri Pirko wrote:
>> Mon, May 06, 2019 at 12:45:44PM CEST, moshe@mellanox.com wrote:
>> > 
>> > On 5/5/2019 6:42 PM, Jiri Pirko wrote:
>> > > Sun, May 05, 2019 at 02:33:23AM CEST, saeedm@mellanox.com wrote:
>> > > > From: Moshe Shemesh <moshe@mellanox.com>
>> > > > 
>> > > > Create mlx5_devlink_health_reporter for FW reporter. The FW
>> > > > reporter
>> > > > implements devlink_health_reporter diagnose callback.
>> > > > 
>> > > > The fw reporter diagnose command can be triggered any time by
>> > > > the user
>> > > > to check current fw status.
>> > > > In healthy status, it will return clear syndrome. Otherwise it
>> > > > will dump
>> > > > the health info buffer.
>> > > > 
>> > > > Command example and output on healthy status:
>> > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
>> > > > Syndrome: 0
>> > > > 
>> > > > Command example and output on non healthy status:
>> > > > $ devlink health diagnose pci/0000:82:00.0 reporter fw
>> > > > diagnose data:
>> > > > assert_var[0] 0xfc3fc043
>> > > > assert_var[1] 0x0001b41c
>> > > > assert_var[2] 0x00000000
>> > > > assert_var[3] 0x00000000
>> > > > assert_var[4] 0x00000000
>> > > > assert_exit_ptr 0x008033b4
>> > > > assert_callra 0x0080365c
>> > > > fw_ver 16.24.1000
>> > > > hw_id 0x0000020d
>> > > > irisc_index 0
>> > > > synd 0x8: unrecoverable hardware error
>> > > > ext_synd 0x003d
>> > > > raw fw_ver 0x101803e8
>> > > > 
>> > > > Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>> > > > Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
>> > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> > > 
>> > > 	
>> > > [...]	
>> > > 	
>> > > 	
>> > > > +static int
>> > > > +mlx5_fw_reporter_diagnose(struct devlink_health_reporter
>> > > > *reporter,
>> > > > +			  struct devlink_fmsg *fmsg)
>> > > > +{
>> > > > +	struct mlx5_core_dev *dev =
>> > > > devlink_health_reporter_priv(reporter);
>> > > > +	struct mlx5_core_health *health = &dev->priv.health;
>> > > > +	u8 synd;
>> > > > +	int err;
>> > > > +
>> > > > +	mutex_lock(&health->info_buf_lock);
>> > > > +	mlx5_get_health_info(dev, &synd);
>> > > > +
>> > > > +	if (!synd) {
>> > > > +		mutex_unlock(&health->info_buf_lock);
>> > > > +		return devlink_fmsg_u8_pair_put(fmsg,
>> > > > "Syndrome", synd);
>> > > > +	}
>> > > > +
>> > > > +	err = devlink_fmsg_string_pair_put(fmsg, "diagnose
>> > > > data",
>> > > > +					   health->info_buf);
>> > > 
>> > > No! This is wrong! You are sneaking in text blob. Please put the
>> > > info in
>> > > structured form using proper fmsg helpers.
>> > > 
>> > This is the fw output format, it is already in use, I don't want
>> > to 
>> > change it, just have it here in the diagnose output too.
>> 
>> Already in use where? in dmesg? Sorry, but that is not an argument.
>> Please format the message properly.
>> 
>
>What is wrong here ? 
>
>Unlike binary dump data, I thought diagnose data is allowed to be
>developer friendly free text format, if not then let's enforce it in
>devlink API. Jiri,  you can't audit each and every use of
>devlink_fmsg_string_pair_put, it must be done by design.

No way to enforce it. Strings need to be there in general. But drivers
have to use them wisely, only for plain values. Unlike this patch, which
is pushing structured text blob in it.


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2019-05-07  6:01 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-05  0:32 [pull request][net-next 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
2019-05-05  0:32 ` [net-next 01/15] net/mlx5: Move all devlink related functions calls to devlink.c Saeed Mahameed
2019-05-05  0:32 ` [net-next 02/15] net/mlx5: Add Vendor Specific Capability access gateway Saeed Mahameed
2019-05-05  0:33 ` [net-next 03/15] net/mlx5: Add Crdump FW snapshot support Saeed Mahameed
2019-05-05 15:36   ` Jiri Pirko
2019-05-05  0:33 ` [net-next 04/15] net/mlx5: Add support for devlink region_snapshot parameter Saeed Mahameed
2019-05-05  0:33 ` [net-next 05/15] net/mlx5: Handle SW reset of FW in error flow Saeed Mahameed
2019-05-05  0:33 ` [net-next 06/15] net/mlx5: Control CR-space access by different PFs Saeed Mahameed
2019-05-05  0:33 ` [net-next 07/15] net/mlx5: Issue SW reset on FW assert Saeed Mahameed
2019-05-05 15:38   ` Jiri Pirko
2019-05-06 10:44     ` Moshe Shemesh
2019-05-05  0:33 ` [net-next 08/15] net/mlx5: Refactor print health info Saeed Mahameed
2019-05-05 15:42   ` Jiri Pirko
2019-05-05  0:33 ` [net-next 09/15] net/mlx5: Create FW devlink health reporter Saeed Mahameed
2019-05-05 15:42   ` Jiri Pirko
2019-05-06 10:45     ` Moshe Shemesh
2019-05-06 11:38       ` Jiri Pirko
2019-05-06 19:52         ` Saeed Mahameed
2019-05-06 21:46           ` Alexei Starovoitov
2019-05-07  5:59             ` Jiri Pirko
2019-05-07  6:01           ` Jiri Pirko
2019-05-07  0:11   ` Jakub Kicinski
2019-05-05  0:33 ` [net-next 10/15] net/mlx5: Add core dump register access functions Saeed Mahameed
2019-05-05  0:33 ` [net-next 11/15] net/mlx5: Add support for FW reporter dump Saeed Mahameed
2019-05-05 15:49   ` Jiri Pirko
2019-05-06 10:51     ` Moshe Shemesh
2019-05-06 11:37       ` Jiri Pirko
2019-05-05  0:33 ` [net-next 12/15] net/mlx5: Report devlink health on FW issues Saeed Mahameed
2019-05-05  0:33 ` [net-next 13/15] net/mlx5: Add fw fatal devlink health reporter Saeed Mahameed
2019-05-05  0:33 ` [net-next 14/15] net/mlx5: Add support for FW fatal reporter dump Saeed Mahameed
2019-05-05 15:52   ` Jiri Pirko
2019-05-06 10:54     ` Moshe Shemesh
2019-05-06 11:42       ` Jiri Pirko
2019-05-05  0:33 ` [net-next 15/15] net/mlx5: Report devlink health on FW fatal issues Saeed Mahameed

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).