netdev.vger.kernel.org archive mirror
* [pull request][net-next v2 00/15] Mellanox, mlx5 Firmware devlink health and sw reset
@ 2019-06-13 20:39 Saeed Mahameed
From: Saeed Mahameed @ 2019-06-13 20:39 UTC
  To: David S. Miller; +Cc: netdev, Jiri Pirko, Saeed Mahameed


Hi Dave,

This series adds support for mlx5 firmware devlink health reporters and
sw reset.

For more information please see tag log below.

Please pull and let me know if there is any problem.

This is a re-spin of a series previously sent for the 5.2 kernel
release.

v2:
 - Improved mlx5 kernel documentation
 - Addressed Jiri's comments:
      Proper linkage to region and snapshot in devlink core.
      Format trace dumps using fmsg helpers.

Thanks,
Saeed.

---
The following changes since commit a842fe1425cb20f457abd3f8ef98b468f83ca98b:

  tcp: add optional per socket transmit delay (2019-06-12 13:05:43 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2019-06-13

for you to fetch changes up to 06efeb555524a8c65ef429f2603885c31a5212b1:

  Documentation: net: mlx5: Devlink health documentation (2019-06-13 13:23:19 -0700)

----------------------------------------------------------------
mlx5-updates-2019-06-13

Mlx5 devlink health fw reporters and sw reset support

This series provides mlx5 firmware reset support and firmware devlink health
reporters.

1) Add initial mlx5 kernel documentation and include devlink health reporters

2) Add CR-Space access and FW Crdump snapshot support via devlink region_snapshot

3) Issue software reset upon FW asserts

4) Add fw and fw_fatal devlink health reporters, which follow FW error
indications with dump and recover procedures and let the user trigger these
procedures on demand.

4.1) fw reporter:
The fw reporter implements diagnose and dump callbacks.
It follows symptoms of a fw error, such as a fw syndrome, by triggering a
fw core dump and storing it, along with any other fw trace, in the dump buffer.
The user can trigger the fw reporter diagnose command at any time to check the
current fw status.
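
As a rough illustration of the callback split described in 4.1, here is a
small user-space model in Python. It is not the devlink or mlx5 kernel API:
the Reporter and FwState classes, their fields, and the fw_diagnose/fw_dump
helpers are hypothetical names chosen for this sketch. It only shows the idea
that diagnose reads the current status on demand, while an error report
triggers dump and stores the result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class FwState:
    """Hypothetical stand-in for firmware health state (illustrative fields)."""
    syndrome: int = 0                      # 0 means no error symptom observed
    trace: List[str] = field(default_factory=list)


@dataclass
class Reporter:
    """Toy model of a health reporter: a name plus optional callbacks."""
    name: str
    diagnose: Optional[Callable[[FwState], Dict[str, int]]] = None
    dump: Optional[Callable[[FwState], Dict[str, object]]] = None
    last_dump: Optional[Dict[str, object]] = None

    def report(self, state: FwState) -> None:
        """Called when an error symptom is seen: capture and store a dump."""
        if self.dump is not None:
            self.last_dump = self.dump(state)


def fw_diagnose(state: FwState) -> Dict[str, int]:
    # Can be invoked at any time to read the current status.
    return {"syndrome": state.syndrome}


def fw_dump(state: FwState) -> Dict[str, object]:
    # Bundle a mock core dump together with any trace lines.
    return {"core_dump": f"syndrome=0x{state.syndrome:x}",
            "trace": list(state.trace)}


fw_reporter = Reporter(name="fw", diagnose=fw_diagnose, dump=fw_dump)

state = FwState(syndrome=0x1, trace=["assert at 0xdead"])
fw_reporter.report(state)                  # error path: dump is stored
print(fw_reporter.diagnose(state))         # on-demand status check
print(fw_reporter.last_dump)
```

In this model the stored dump is a copy of the trace taken at report time, so
later changes to the state do not alter what was captured.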

4.2) fw_fatal reporter:
The fw_fatal reporter implements dump and recover callbacks.
It follows fatal error indications with a CR-space dump and a recover flow.
The CR-space dump uses the vsc interface, which is valid even if the FW command
interface is not functional, as is the case in most FW fatal errors. The
CR-space dump is stored as a memory region snapshot so that it can be read by
address. The recover function runs the recover flow, which reloads the driver
and triggers a fw reset if needed.
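
The fw_fatal flow in 4.2 can be pictured with a similar user-space model in
Python. Again, this is only a sketch under assumptions: Device, Snapshot,
fw_fatal_dump and fw_fatal_recover are hypothetical names, not mlx5 or devlink
symbols, and the order of the reset and reload steps is a guess, since the
text above does not specify it. The point is that the dump is a flat image
that can be read by address, and that recover resets the firmware only when
needed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snapshot:
    """Mock CR-space snapshot: a flat byte image addressed by offset."""
    data: bytes

    def read(self, addr: int, length: int) -> bytes:
        # Reject reads that fall outside the captured image.
        if addr < 0 or length < 0 or addr + length > len(self.data):
            raise ValueError("read outside snapshot")
        return self.data[addr:addr + length]


@dataclass
class Device:
    """Hypothetical device model; none of these fields are mlx5 names."""
    cr_space: bytes
    fw_needs_reset: bool = False
    log: List[str] = field(default_factory=list)


def fw_fatal_dump(dev: Device) -> Snapshot:
    # Copy the register space so later reads see the state at dump time.
    return Snapshot(bytes(dev.cr_space))


def fw_fatal_recover(dev: Device) -> None:
    # Assumed ordering: reset the firmware if needed, then reload the driver.
    if dev.fw_needs_reset:
        dev.log.append("fw_reset")
        dev.fw_needs_reset = False
    dev.log.append("driver_reload")


dev = Device(cr_space=bytes(range(16)), fw_needs_reset=True)
snap = fw_fatal_dump(dev)
fw_fatal_recover(dev)
print(snap.read(4, 4).hex())   # four bytes starting at offset 4
print(dev.log)
```

A second call to fw_fatal_recover() on the same device would only log the
driver reload, because the model clears the reset flag once the reset is done.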

----------------------------------------------------------------
Alex Vesker (2):
      net/mlx5: Add Vendor Specific Capability access gateway
      net/mlx5: Add Crdump support

Aya Levin (1):
      devlink: Hang reporter's dump method on a dumpit cb

Eran Ben Elisha (1):
      net/mlx5: Move all devlink related functions calls to devlink.c

Feras Daoud (3):
      net/mlx5: Handle SW reset of FW in error flow
      net/mlx5: Control CR-space access by different PFs
      net/mlx5: Issue SW reset on FW assert

Moshe Shemesh (7):
      net/mlx5: Create FW devlink_health_reporter
      net/mlx5: Add support for FW reporter dump
      net/mlx5: Report devlink health on FW issues
      net/mlx5: Add fw fatal devlink_health_reporter
      net/mlx5: Add support for FW fatal reporter dump
      net/mlx5: Report devlink health on FW fatal issues
      Documentation: net: mlx5: Devlink health documentation

Saeed Mahameed (1):
      Documentation: net: mlx5: Add mlx5 initial documentation

 Documentation/networking/device_drivers/index.rst  |   1 +
 .../networking/device_drivers/mellanox/mlx5.rst    | 173 +++++++
 MAINTAINERS                                        |   1 +
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c  |  58 +++
 drivers/net/ethernet/mellanox/mlx5/core/devlink.h  |  14 +
 .../net/ethernet/mellanox/mlx5/core/diag/crdump.c  | 115 +++++
 .../ethernet/mellanox/mlx5/core/diag/fw_tracer.c   | 139 +++++
 .../ethernet/mellanox/mlx5/core/diag/fw_tracer.h   |  20 +
 .../net/ethernet/mellanox/mlx5/core/en_selftest.c  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/health.c   | 569 ++++++++++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h |   3 +
 .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c  | 316 ++++++++++++
 .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h  |  32 ++
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  65 +--
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |   8 +-
 include/linux/mlx5/device.h                        |  10 +-
 include/linux/mlx5/driver.h                        |  13 +-
 net/core/devlink.c                                 | 118 ++++-
 19 files changed, 1516 insertions(+), 144 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/mellanox/mlx5.rst
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h

Thread overview: 17+ messages
2019-06-13 20:39 [pull request][net-next v2 00/15] Mellanox, mlx5 Firmware devlink health and sw reset Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 01/15] devlink: Hang reporter's dump method on a dumpit cb Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 02/15] Documentation: net: mlx5: Add mlx5 initial documentation Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 03/15] net/mlx5: Move all devlink related functions calls to devlink.c Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 04/15] net/mlx5: Add Vendor Specific Capability access gateway Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 05/15] net/mlx5: Add Crdump support Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 06/15] net/mlx5: Handle SW reset of FW in error flow Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 07/15] net/mlx5: Control CR-space access by different PFs Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 08/15] net/mlx5: Issue SW reset on FW assert Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 09/15] net/mlx5: Create FW devlink_health_reporter Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 10/15] net/mlx5: Add support for FW reporter dump Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 11/15] net/mlx5: Report devlink health on FW issues Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 12/15] net/mlx5: Add fw fatal devlink_health_reporter Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 13/15] net/mlx5: Add support for FW fatal reporter dump Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 14/15] net/mlx5: Report devlink health on FW fatal issues Saeed Mahameed
2019-06-13 20:39 ` [net-next v2 15/15] Documentation: net: mlx5: Devlink health documentation Saeed Mahameed
2019-06-15  2:46 ` [pull request][net-next v2 00/15] Mellanox, mlx5 Firmware devlink health and sw reset David Miller
