linux-kernel.vger.kernel.org archive mirror
* [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to
@ 2020-12-11  6:25 George Cherian
  2020-12-11  6:25 ` [PATCHv6 net-next 1/3] octeontx2-af: Add devlink support to af driver George Cherian
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: George Cherian @ 2020-12-11  6:25 UTC
  To: netdev, linux-kernel
  Cc: kuba, davem, sgoutham, lcherian, gakula, george.cherian,
	willemdebruijn.kernel, saeed, jiri

Add basic devlink support and devlink health reporters.
Health reporters are added for the NPA block only.

This addresses Jakub's comment asking for devlink support for error
reporting:
https://www.spinics.net/lists/netdev/msg670712.html

For now, I have dropped the NIX block health reporters; this series
adds health reporters only for the NPA block.
As per Jakub's suggestion, a separate reporter is used per event and
the counters have been removed.

Change-log:
v6
 - Address Jakub's comments
 - Add reporters per event for each block.
 - Remove the SW counter.
 - Remove the mbox version from devlink info.

v5
 - Address Jiri's comment
 - Use devlink_fmsg_arr_pair_nest_start() for NIX blocks

v4
 - Rebase to net-next (no logic changes).

v3
 - Address Saeed's comments on v2.
 - Rename the reporters to hw_*.
 - Call devlink_health_report() when an event is raised.
 - Add a recover op as well.

v2
 - Address Willem's comments on v1.
 - Fixed the sparse issues, reported by Jakub.


George Cherian (3):
  octeontx2-af: Add devlink support to af driver
  octeontx2-af: Add devlink health reporters for NPA
  docs: octeontx2: Add Documentation for NPA health reporters

 .../ethernet/marvell/octeontx2.rst            |  50 ++
 .../net/ethernet/marvell/octeontx2/Kconfig    |   1 +
 .../ethernet/marvell/octeontx2/af/Makefile    |   2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |   9 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |   4 +
 .../marvell/octeontx2/af/rvu_devlink.c        | 770 ++++++++++++++++++
 .../marvell/octeontx2/af/rvu_devlink.h        |  55 ++
 .../marvell/octeontx2/af/rvu_struct.h         |  23 +
 8 files changed, 912 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h

-- 
2.25.1



* [PATCHv6 net-next 1/3] octeontx2-af: Add devlink support to af driver
  2020-12-11  6:25 [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to George Cherian
@ 2020-12-11  6:25 ` George Cherian
  2020-12-11  6:25 ` [PATCH 2/3] octeontx2-af: Add devlink health reporters for NPA George Cherian
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: George Cherian @ 2020-12-11  6:25 UTC
  To: netdev, linux-kernel
  Cc: kuba, davem, sgoutham, lcherian, gakula, george.cherian,
	willemdebruijn.kernel, saeed, jiri

Add basic devlink support to the AF driver.
Currently info_get is the only supported devlink op.

The devlink output looks like this:
 # devlink dev
 pci/0002:01:00.0
 # devlink dev info
 pci/0002:01:00.0:
  driver octeontx2-af
 #

Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
---
 .../net/ethernet/marvell/octeontx2/Kconfig    |  1 +
 .../ethernet/marvell/octeontx2/af/Makefile    |  2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |  9 ++-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |  4 ++
 .../marvell/octeontx2/af/rvu_devlink.c        | 64 +++++++++++++++++++
 .../marvell/octeontx2/af/rvu_devlink.h        | 20 ++++++
 6 files changed, 98 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h

diff --git a/drivers/net/ethernet/marvell/octeontx2/Kconfig b/drivers/net/ethernet/marvell/octeontx2/Kconfig
index 543a1d047567..16caa02095fe 100644
--- a/drivers/net/ethernet/marvell/octeontx2/Kconfig
+++ b/drivers/net/ethernet/marvell/octeontx2/Kconfig
@@ -9,6 +9,7 @@ config OCTEONTX2_MBOX
 config OCTEONTX2_AF
 	tristate "Marvell OcteonTX2 RVU Admin Function driver"
 	select OCTEONTX2_MBOX
+	select NET_DEVLINK
 	depends on (64BIT && COMPILE_TEST) || ARM64
 	depends on PCI
 	help
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/Makefile b/drivers/net/ethernet/marvell/octeontx2/af/Makefile
index 7100d1dd856e..eb535c98ca38 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/Makefile
+++ b/drivers/net/ethernet/marvell/octeontx2/af/Makefile
@@ -10,4 +10,4 @@ obj-$(CONFIG_OCTEONTX2_AF) += octeontx2_af.o
 octeontx2_mbox-y := mbox.o rvu_trace.o
 octeontx2_af-y := cgx.o rvu.o rvu_cgx.o rvu_npa.o rvu_nix.o \
 		  rvu_reg.o rvu_npc.o rvu_debugfs.o ptp.o rvu_npc_fs.o \
-		  rvu_cpt.o
+		  rvu_cpt.o rvu_devlink.o
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
index 9f901c0edcbb..e8fd712860a1 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
@@ -2826,17 +2826,23 @@ static int rvu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (err)
 		goto err_flr;
 
+	err = rvu_register_dl(rvu);
+	if (err)
+		goto err_irq;
+
 	rvu_setup_rvum_blk_revid(rvu);
 
 	/* Enable AF's VFs (if any) */
 	err = rvu_enable_sriov(rvu);
 	if (err)
-		goto err_irq;
+		goto err_dl;
 
 	/* Initialize debugfs */
 	rvu_dbg_init(rvu);
 
 	return 0;
+err_dl:
+	rvu_unregister_dl(rvu);
 err_irq:
 	rvu_unregister_interrupts(rvu);
 err_flr:
@@ -2868,6 +2874,7 @@ static void rvu_remove(struct pci_dev *pdev)
 
 	rvu_dbg_exit(rvu);
 	rvu_unregister_interrupts(rvu);
+	rvu_unregister_dl(rvu);
 	rvu_flr_wq_destroy(rvu);
 	rvu_cgx_exit(rvu);
 	rvu_fwdata_exit(rvu);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index b6c0977499ab..b1a6ecfd563e 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -12,7 +12,10 @@
 #define RVU_H
 
 #include <linux/pci.h>
+#include <net/devlink.h>
+
 #include "rvu_struct.h"
+#include "rvu_devlink.h"
 #include "common.h"
 #include "mbox.h"
 #include "npc.h"
@@ -422,6 +425,7 @@ struct rvu {
 #ifdef CONFIG_DEBUG_FS
 	struct rvu_debugfs	rvu_dbg;
 #endif
+	struct rvu_devlink	*rvu_dl;
 };
 
 static inline void rvu_write64(struct rvu *rvu, u64 block, u64 offset, u64 val)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
new file mode 100644
index 000000000000..5dabca04a34b
--- /dev/null
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Marvell OcteonTx2 RVU Devlink
+ *
+ * Copyright (C) 2020 Marvell.
+ *
+ */
+
+#include "rvu.h"
+
+#define DRV_NAME "octeontx2-af"
+
+static int rvu_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
+				struct netlink_ext_ack *extack)
+{
+	return devlink_info_driver_name_put(req, DRV_NAME);
+}
+
+static const struct devlink_ops rvu_devlink_ops = {
+	.info_get = rvu_devlink_info_get,
+};
+
+int rvu_register_dl(struct rvu *rvu)
+{
+	struct rvu_devlink *rvu_dl;
+	struct devlink *dl;
+	int err;
+
+	rvu_dl = kzalloc(sizeof(*rvu_dl), GFP_KERNEL);
+	if (!rvu_dl)
+		return -ENOMEM;
+
+	dl = devlink_alloc(&rvu_devlink_ops, sizeof(struct rvu_devlink));
+	if (!dl) {
+		dev_warn(rvu->dev, "devlink_alloc failed\n");
+		kfree(rvu_dl);
+		return -ENOMEM;
+	}
+
+	err = devlink_register(dl, rvu->dev);
+	if (err) {
+		dev_err(rvu->dev, "devlink register failed with error %d\n", err);
+		devlink_free(dl);
+		kfree(rvu_dl);
+		return err;
+	}
+
+	rvu_dl->dl = dl;
+	rvu_dl->rvu = rvu;
+	rvu->rvu_dl = rvu_dl;
+	return 0;
+}
+
+void rvu_unregister_dl(struct rvu *rvu)
+{
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	struct devlink *dl = rvu_dl->dl;
+
+	if (!dl)
+		return;
+
+	devlink_unregister(dl);
+	devlink_free(dl);
+	kfree(rvu_dl);
+}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h
new file mode 100644
index 000000000000..1ed6dde79a4e
--- /dev/null
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*  Marvell OcteonTx2 RVU Devlink
+ *
+ * Copyright (C) 2020 Marvell.
+ *
+ */
+
+#ifndef RVU_DEVLINK_H
+#define  RVU_DEVLINK_H
+
+struct rvu_devlink {
+	struct devlink *dl;
+	struct rvu *rvu;
+};
+
+/* Devlink APIs */
+int rvu_register_dl(struct rvu *rvu);
+void rvu_unregister_dl(struct rvu *rvu);
+
+#endif /* RVU_DEVLINK_H */
-- 
2.25.1



* [PATCH 2/3] octeontx2-af: Add devlink health reporters for NPA
  2020-12-11  6:25 [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to George Cherian
  2020-12-11  6:25 ` [PATCHv6 net-next 1/3] octeontx2-af: Add devlink support to af driver George Cherian
@ 2020-12-11  6:25 ` George Cherian
  2020-12-11  6:25 ` [PATCH 3/3] docs: octeontx2: Add Documentation for NPA health reporters George Cherian
  2020-12-15  2:00 ` [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: George Cherian @ 2020-12-11  6:25 UTC
  To: netdev, linux-kernel
  Cc: kuba, davem, sgoutham, lcherian, gakula, george.cherian,
	willemdebruijn.kernel, saeed, jiri

Add health reporters for the RVU NPA block.
The NPA health reporters handle the following HW event groups:
 - GENERAL events
 - ERROR events
 - RAS events
 - RVU events

Output:
 #devlink health
 pci/0002:01:00.0:
   reporter hw_npa_intr
     state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
   reporter hw_npa_gen
     state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
   reporter hw_npa_err
     state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
   reporter hw_npa_ras
     state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true

 #devlink health dump show  pci/0002:01:00.0 reporter hw_npa_err
 NPA_AF_ERR:
        NPA Error Interrupt Reg : 4096
        AQ Doorbell Error
 #devlink health dump show  pci/0002:01:00.0 reporter hw_npa_ras
 NPA_AF_RVU_RAS:
        NPA RAS Interrupt Reg : 0

Each reporter dump shows the register value and a description of the
cause.
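
The mapping from register value to cause text can be sketched in plain
userspace C. This is only an illustration, not driver code: the helper
name npa_err_decode() and the table layout are assumptions made for the
example, while the bit positions (12, 13, 14) and the strings are taken
from the NPA_AF_RVU_ERR case of rvu_npa_report_show() in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions taken from the NPA_AF_RVU_ERR case in rvu_npa_report_show(). */
struct npa_err_cause {
	uint64_t mask;
	const char *desc;
};

static const struct npa_err_cause npa_err_causes[] = {
	{ 1ULL << 14, "Fault on NPA_AQ_INST_S read" },
	{ 1ULL << 13, "Fault on NPA_AQ_RES_S write" },
	{ 1ULL << 12, "AQ Doorbell Error" },
};

/*
 * Hypothetical helper, not part of the driver: store a description for
 * each cause bit set in reg into out[] (at most max entries) and return
 * how many descriptions were stored.
 */
size_t npa_err_decode(uint64_t reg, const char **out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(npa_err_causes) / sizeof(npa_err_causes[0]); i++) {
		if ((reg & npa_err_causes[i].mask) && n < max)
			out[n++] = npa_err_causes[i].desc;
	}
	return n;
}
```

For the value 4096 shown in the hw_npa_err dump above, only bit 12 is
set, so the helper returns a single entry, "AQ Doorbell Error". A value
of 0 yields no entries, matching the hw_npa_ras dump, which prints the
register and no cause line.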

Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
---
 .../marvell/octeontx2/af/rvu_devlink.c        | 708 +++++++++++++++++-
 .../marvell/octeontx2/af/rvu_devlink.h        |  35 +
 .../marvell/octeontx2/af/rvu_struct.h         |  23 +
 3 files changed, 765 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
index 5dabca04a34b..3f9d0ab6d5ae 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -5,10 +5,714 @@
  *
  */
 
+#include <linux/bitfield.h>
+
 #include "rvu.h"
+#include "rvu_reg.h"
+#include "rvu_struct.h"
 
 #define DRV_NAME "octeontx2-af"
 
+static int rvu_report_pair_start(struct devlink_fmsg *fmsg, const char *name)
+{
+	int err;
+
+	err = devlink_fmsg_pair_nest_start(fmsg, name);
+	if (err)
+		return err;
+
+	return  devlink_fmsg_obj_nest_start(fmsg);
+}
+
+static int rvu_report_pair_end(struct devlink_fmsg *fmsg)
+{
+	int err;
+
+	err = devlink_fmsg_obj_nest_end(fmsg);
+	if (err)
+		return err;
+
+	return devlink_fmsg_pair_nest_end(fmsg);
+}
+
+static bool rvu_common_request_irq(struct rvu *rvu, int offset,
+				   const char *name, irq_handler_t fn)
+{
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	int rc;
+
+	sprintf(&rvu->irq_name[offset * NAME_SIZE], "%s", name);
+	rc = request_irq(pci_irq_vector(rvu->pdev, offset), fn, 0,
+			 &rvu->irq_name[offset * NAME_SIZE], rvu_dl);
+	if (rc)
+		dev_warn(rvu->dev, "Failed to register %s irq\n", name);
+	else
+		rvu->irq_allocated[offset] = true;
+
+	return rvu->irq_allocated[offset];
+}
+
+static void rvu_npa_intr_work(struct work_struct *work)
+{
+	struct rvu_npa_health_reporters *rvu_npa_health_reporter;
+
+	rvu_npa_health_reporter = container_of(work, struct rvu_npa_health_reporters, intr_work);
+	devlink_health_report(rvu_npa_health_reporter->rvu_hw_npa_intr_reporter,
+			      "NPA_AF_RVU Error",
+			      rvu_npa_health_reporter->npa_event_ctx);
+}
+
+static irqreturn_t rvu_npa_af_rvu_intr_handler(int irq, void *rvu_irq)
+{
+	struct rvu_npa_event_ctx *npa_event_context;
+	struct rvu_devlink *rvu_dl = rvu_irq;
+	struct rvu *rvu;
+	int blkaddr;
+	u64 intr;
+
+	rvu = rvu_dl->rvu;
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return IRQ_NONE;
+
+	npa_event_context = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+	intr = rvu_read64(rvu, blkaddr, NPA_AF_RVU_INT);
+	npa_event_context->npa_af_rvu_int = intr;
+
+	/* Clear interrupts */
+	rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT, intr);
+	rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT_ENA_W1C, ~0ULL);
+	queue_work(rvu_dl->devlink_wq, &rvu_dl->rvu_npa_health_reporter->intr_work);
+
+	return IRQ_HANDLED;
+}
+
+static void rvu_npa_gen_work(struct work_struct *work)
+{
+	struct rvu_npa_health_reporters *rvu_npa_health_reporter;
+
+	rvu_npa_health_reporter = container_of(work, struct rvu_npa_health_reporters, gen_work);
+	devlink_health_report(rvu_npa_health_reporter->rvu_hw_npa_gen_reporter,
+			      "NPA_AF_GEN Error",
+			      rvu_npa_health_reporter->npa_event_ctx);
+}
+
+static irqreturn_t rvu_npa_af_gen_intr_handler(int irq, void *rvu_irq)
+{
+	struct rvu_npa_event_ctx *npa_event_context;
+	struct rvu_devlink *rvu_dl = rvu_irq;
+	struct rvu *rvu;
+	int blkaddr;
+	u64 intr;
+
+	rvu = rvu_dl->rvu;
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return IRQ_NONE;
+
+	npa_event_context = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+	intr = rvu_read64(rvu, blkaddr, NPA_AF_GEN_INT);
+	npa_event_context->npa_af_rvu_gen = intr;
+
+	/* Clear interrupts */
+	rvu_write64(rvu, blkaddr, NPA_AF_GEN_INT, intr);
+	rvu_write64(rvu, blkaddr, NPA_AF_GEN_INT_ENA_W1C, ~0ULL);
+	queue_work(rvu_dl->devlink_wq, &rvu_dl->rvu_npa_health_reporter->gen_work);
+
+	return IRQ_HANDLED;
+}
+
+static void rvu_npa_err_work(struct work_struct *work)
+{
+	struct rvu_npa_health_reporters *rvu_npa_health_reporter;
+
+	rvu_npa_health_reporter = container_of(work, struct rvu_npa_health_reporters, err_work);
+	devlink_health_report(rvu_npa_health_reporter->rvu_hw_npa_err_reporter,
+			      "NPA_AF_ERR Error",
+			      rvu_npa_health_reporter->npa_event_ctx);
+}
+
+static irqreturn_t rvu_npa_af_err_intr_handler(int irq, void *rvu_irq)
+{
+	struct rvu_npa_event_ctx *npa_event_context;
+	struct rvu_devlink *rvu_dl = rvu_irq;
+	struct rvu *rvu;
+	int blkaddr;
+	u64 intr;
+
+	rvu = rvu_dl->rvu;
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return IRQ_NONE;
+	npa_event_context = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+	intr = rvu_read64(rvu, blkaddr, NPA_AF_ERR_INT);
+	npa_event_context->npa_af_rvu_err = intr;
+
+	/* Clear interrupts */
+	rvu_write64(rvu, blkaddr, NPA_AF_ERR_INT, intr);
+	rvu_write64(rvu, blkaddr, NPA_AF_ERR_INT_ENA_W1C, ~0ULL);
+	queue_work(rvu_dl->devlink_wq, &rvu_dl->rvu_npa_health_reporter->err_work);
+
+	return IRQ_HANDLED;
+}
+
+static void rvu_npa_ras_work(struct work_struct *work)
+{
+	struct rvu_npa_health_reporters *rvu_npa_health_reporter;
+
+	rvu_npa_health_reporter = container_of(work, struct rvu_npa_health_reporters, ras_work);
+	devlink_health_report(rvu_npa_health_reporter->rvu_hw_npa_ras_reporter,
+			      "HW NPA_AF_RAS Error reported",
+			      rvu_npa_health_reporter->npa_event_ctx);
+}
+
+static irqreturn_t rvu_npa_af_ras_intr_handler(int irq, void *rvu_irq)
+{
+	struct rvu_npa_event_ctx *npa_event_context;
+	struct rvu_devlink *rvu_dl = rvu_irq;
+	struct rvu *rvu;
+	int blkaddr;
+	u64 intr;
+
+	rvu = rvu_dl->rvu;
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return IRQ_NONE;
+
+	npa_event_context = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+	intr = rvu_read64(rvu, blkaddr, NPA_AF_RAS);
+	npa_event_context->npa_af_rvu_ras = intr;
+
+	/* Clear interrupts */
+	rvu_write64(rvu, blkaddr, NPA_AF_RAS, intr);
+	rvu_write64(rvu, blkaddr, NPA_AF_RAS_ENA_W1C, ~0ULL);
+	queue_work(rvu_dl->devlink_wq, &rvu_dl->rvu_npa_health_reporter->ras_work);
+
+	return IRQ_HANDLED;
+}
+
+static void rvu_npa_unregister_interrupts(struct rvu *rvu)
+{
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	int i, offs, blkaddr;
+	u64 reg;
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return;
+
+	reg = rvu_read64(rvu, blkaddr, NPA_PRIV_AF_INT_CFG);
+	offs = reg & 0x3FF;
+
+	rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT_ENA_W1C, ~0ULL);
+	rvu_write64(rvu, blkaddr, NPA_AF_GEN_INT_ENA_W1C, ~0ULL);
+	rvu_write64(rvu, blkaddr, NPA_AF_ERR_INT_ENA_W1C, ~0ULL);
+	rvu_write64(rvu, blkaddr, NPA_AF_RAS_ENA_W1C, ~0ULL);
+
+	for (i = 0; i < NPA_AF_INT_VEC_CNT; i++)
+		if (rvu->irq_allocated[offs + i]) {
+			free_irq(pci_irq_vector(rvu->pdev, offs + i), rvu_dl);
+			rvu->irq_allocated[offs + i] = false;
+		}
+}
+
+static int rvu_npa_register_interrupts(struct rvu *rvu)
+{
+	int blkaddr, base;
+	bool rc;
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return blkaddr;
+
+	/* Get NPA AF MSIX vectors offset. */
+	base = rvu_read64(rvu, blkaddr, NPA_PRIV_AF_INT_CFG) & 0x3ff;
+	if (!base) {
+		dev_warn(rvu->dev,
+			 "Failed to get NPA_AF_INT vector offsets\n");
+		return 0;
+	}
+
+	/* Register and enable NPA_AF_RVU_INT interrupt */
+	rc = rvu_common_request_irq(rvu, base +  NPA_AF_INT_VEC_RVU,
+				    "NPA_AF_RVU_INT",
+				    rvu_npa_af_rvu_intr_handler);
+	if (!rc)
+		goto err;
+	rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT_ENA_W1S, ~0ULL);
+
+	/* Register and enable NPA_AF_GEN_INT interrupt */
+	rc = rvu_common_request_irq(rvu, base + NPA_AF_INT_VEC_GEN,
+				    "NPA_AF_RVU_GEN",
+				    rvu_npa_af_gen_intr_handler);
+	if (!rc)
+		goto err;
+	rvu_write64(rvu, blkaddr, NPA_AF_GEN_INT_ENA_W1S, ~0ULL);
+
+	/* Register and enable NPA_AF_ERR_INT interrupt */
+	rc = rvu_common_request_irq(rvu, base + NPA_AF_INT_VEC_AF_ERR,
+				    "NPA_AF_ERR_INT",
+				    rvu_npa_af_err_intr_handler);
+	if (!rc)
+		goto err;
+	rvu_write64(rvu, blkaddr, NPA_AF_ERR_INT_ENA_W1S, ~0ULL);
+
+	/* Register and enable NPA_AF_RAS interrupt */
+	rc = rvu_common_request_irq(rvu, base + NPA_AF_INT_VEC_POISON,
+				    "NPA_AF_RAS",
+				    rvu_npa_af_ras_intr_handler);
+	if (!rc)
+		goto err;
+	rvu_write64(rvu, blkaddr, NPA_AF_RAS_ENA_W1S, ~0ULL);
+
+	return 0;
+err:
+	rvu_npa_unregister_interrupts(rvu);
+	return rc;
+}
+
+static int rvu_npa_report_show(struct devlink_fmsg *fmsg, void *ctx,
+			       enum npa_af_rvu_health health_reporter)
+{
+	struct rvu_npa_event_ctx *npa_event_context;
+	u64 intr_val, alloc_dis, free_dis;
+	int err;
+
+	npa_event_context = ctx;
+	switch (health_reporter) {
+	case NPA_AF_RVU_GEN:
+		intr_val = npa_event_context->npa_af_rvu_gen;
+		err = rvu_report_pair_start(fmsg, "NPA_AF_GENERAL");
+		if (err)
+			return err;
+		err = devlink_fmsg_u64_pair_put(fmsg, "\tNPA General Interrupt Reg ",
+						npa_event_context->npa_af_rvu_gen);
+		if (err)
+			return err;
+		if (intr_val & BIT_ULL(32)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tUnmap PF Error");
+			if (err)
+				return err;
+		}
+
+		free_dis = FIELD_GET(GENMASK(15, 0), intr_val);
+		if (free_dis & BIT(NPA_INPQ_NIX0_RX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX0: free disabled RX");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_NIX0_TX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX0:free disabled TX");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_NIX1_RX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX1: free disabled RX");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_NIX1_TX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX1:free disabled TX");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_SSO)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tFree Disabled for SSO");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_TIM)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tFree Disabled for TIM");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_DPI)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tFree Disabled for DPI");
+			if (err)
+				return err;
+		}
+		if (free_dis & BIT(NPA_INPQ_AURA_OP)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tFree Disabled for AURA");
+			if (err)
+				return err;
+		}
+
+		alloc_dis = FIELD_GET(GENMASK(31, 16), intr_val);
+		if (alloc_dis & BIT(NPA_INPQ_NIX0_RX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX0: alloc disabled RX");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_NIX0_TX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX0:alloc disabled TX");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_NIX1_RX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX1: alloc disabled RX");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_NIX1_TX)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tNIX1:alloc disabled TX");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_SSO)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tAlloc Disabled for SSO");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_TIM)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tAlloc Disabled for TIM");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_DPI)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tAlloc Disabled for DPI");
+			if (err)
+				return err;
+		}
+		if (alloc_dis & BIT(NPA_INPQ_AURA_OP)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tAlloc Disabled for AURA");
+			if (err)
+				return err;
+		}
+		err = rvu_report_pair_end(fmsg);
+		if (err)
+			return err;
+		break;
+	case NPA_AF_RVU_ERR:
+		err = rvu_report_pair_start(fmsg, "NPA_AF_ERR");
+		if (err)
+			return err;
+		err = devlink_fmsg_u64_pair_put(fmsg, "\tNPA Error Interrupt Reg ",
+						npa_event_context->npa_af_rvu_err);
+		if (err)
+			return err;
+
+		if (npa_event_context->npa_af_rvu_err & BIT_ULL(14)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tFault on NPA_AQ_INST_S read");
+			if (err)
+				return err;
+		}
+		if (npa_event_context->npa_af_rvu_err & BIT_ULL(13)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tFault on NPA_AQ_RES_S write");
+			if (err)
+				return err;
+		}
+		if (npa_event_context->npa_af_rvu_err & BIT_ULL(12)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tAQ Doorbell Error");
+			if (err)
+				return err;
+		}
+		err = rvu_report_pair_end(fmsg);
+		if (err)
+			return err;
+		break;
+	case NPA_AF_RVU_RAS:
+		err = rvu_report_pair_start(fmsg, "NPA_AF_RVU_RAS");
+		if (err)
+			return err;
+		err = devlink_fmsg_u64_pair_put(fmsg, "\tNPA RAS Interrupt Reg ",
+						npa_event_context->npa_af_rvu_ras);
+		if (err)
+			return err;
+		if (npa_event_context->npa_af_rvu_ras & BIT_ULL(34)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tPoison data on NPA_AQ_INST_S");
+			if (err)
+				return err;
+		}
+		if (npa_event_context->npa_af_rvu_ras & BIT_ULL(33)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tPoison data on NPA_AQ_RES_S");
+			if (err)
+				return err;
+		}
+		if (npa_event_context->npa_af_rvu_ras & BIT_ULL(32)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tPoison data on HW context");
+			if (err)
+				return err;
+		}
+		err = rvu_report_pair_end(fmsg);
+		if (err)
+			return err;
+		break;
+	case NPA_AF_RVU_INTR:
+		err = rvu_report_pair_start(fmsg, "NPA_AF_RVU");
+		if (err)
+			return err;
+		err = devlink_fmsg_u64_pair_put(fmsg, "\tNPA RVU Interrupt Reg ",
+						npa_event_context->npa_af_rvu_int);
+		if (err)
+			return err;
+		if (npa_event_context->npa_af_rvu_int & BIT_ULL(0)) {
+			err = devlink_fmsg_string_put(fmsg, "\n\tUnmap Slot Error");
+			if (err)
+				return err;
+		}
+		return rvu_report_pair_end(fmsg);
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int rvu_hw_npa_intr_dump(struct devlink_health_reporter *reporter,
+				struct devlink_fmsg *fmsg, void *ctx,
+				struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	struct rvu_npa_event_ctx *npa_ctx;
+
+	npa_ctx = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+
+	return ctx ? rvu_npa_report_show(fmsg, ctx, NPA_AF_RVU_INTR) :
+		     rvu_npa_report_show(fmsg, npa_ctx, NPA_AF_RVU_INTR);
+}
+
+static int rvu_hw_npa_intr_recover(struct devlink_health_reporter *reporter,
+				   void *ctx, struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_npa_event_ctx *npa_event_ctx = ctx;
+	int blkaddr;
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return blkaddr;
+
+	if (npa_event_ctx->npa_af_rvu_int)
+		rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT_ENA_W1S, ~0ULL);
+
+	return 0;
+}
+
+static int rvu_hw_npa_gen_dump(struct devlink_health_reporter *reporter,
+			       struct devlink_fmsg *fmsg, void *ctx,
+			       struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	struct rvu_npa_event_ctx *npa_ctx;
+
+	npa_ctx = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+
+	return ctx ? rvu_npa_report_show(fmsg, ctx, NPA_AF_RVU_GEN) :
+		     rvu_npa_report_show(fmsg, npa_ctx, NPA_AF_RVU_GEN);
+}
+
+static int rvu_hw_npa_gen_recover(struct devlink_health_reporter *reporter,
+				  void *ctx, struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_npa_event_ctx *npa_event_ctx = ctx;
+	int blkaddr;
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return blkaddr;
+
+	if (npa_event_ctx->npa_af_rvu_gen)
+		rvu_write64(rvu, blkaddr, NPA_AF_GEN_INT_ENA_W1S, ~0ULL);
+
+	return 0;
+}
+
+static int rvu_hw_npa_err_dump(struct devlink_health_reporter *reporter,
+			       struct devlink_fmsg *fmsg, void *ctx,
+			       struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	struct rvu_npa_event_ctx *npa_ctx;
+
+	npa_ctx = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+
+	return ctx ? rvu_npa_report_show(fmsg, ctx, NPA_AF_RVU_ERR) :
+		     rvu_npa_report_show(fmsg, npa_ctx, NPA_AF_RVU_ERR);
+}
+
+static int rvu_hw_npa_err_recover(struct devlink_health_reporter *reporter,
+				  void *ctx, struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_npa_event_ctx *npa_event_ctx = ctx;
+	int blkaddr;
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return blkaddr;
+
+	if (npa_event_ctx->npa_af_rvu_err)
+		rvu_write64(rvu, blkaddr, NPA_AF_ERR_INT_ENA_W1S, ~0ULL);
+
+	return 0;
+}
+
+static int rvu_hw_npa_ras_dump(struct devlink_health_reporter *reporter,
+			       struct devlink_fmsg *fmsg, void *ctx,
+			       struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+	struct rvu_npa_event_ctx *npa_ctx;
+
+	npa_ctx = rvu_dl->rvu_npa_health_reporter->npa_event_ctx;
+
+	return ctx ? rvu_npa_report_show(fmsg, ctx, NPA_AF_RVU_RAS) :
+		     rvu_npa_report_show(fmsg, npa_ctx, NPA_AF_RVU_RAS);
+}
+
+static int rvu_hw_npa_ras_recover(struct devlink_health_reporter *reporter,
+				  void *ctx, struct netlink_ext_ack *netlink_extack)
+{
+	struct rvu *rvu = devlink_health_reporter_priv(reporter);
+	struct rvu_npa_event_ctx *npa_event_ctx = ctx;
+	int blkaddr;
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (blkaddr < 0)
+		return blkaddr;
+
+	if (npa_event_ctx->npa_af_rvu_ras)
+		rvu_write64(rvu, blkaddr, NPA_AF_RAS_ENA_W1S, ~0ULL);
+
+	return 0;
+}
+
+RVU_REPORTERS(hw_npa_intr);
+RVU_REPORTERS(hw_npa_gen);
+RVU_REPORTERS(hw_npa_err);
+RVU_REPORTERS(hw_npa_ras);
+
+static void rvu_npa_health_reporters_destroy(struct rvu_devlink *rvu_dl);
+
+static int rvu_npa_register_reporters(struct rvu_devlink *rvu_dl)
+{
+	struct rvu_npa_health_reporters *rvu_reporters;
+	struct rvu_npa_event_ctx *npa_event_context;
+	struct rvu *rvu = rvu_dl->rvu;
+
+	rvu_reporters = kzalloc(sizeof(*rvu_reporters), GFP_KERNEL);
+	if (!rvu_reporters)
+		return -ENOMEM;
+
+	rvu_dl->rvu_npa_health_reporter = rvu_reporters;
+	npa_event_context = kzalloc(sizeof(*npa_event_context), GFP_KERNEL);
+	if (!npa_event_context)
+		return -ENOMEM;
+
+	rvu_reporters->npa_event_ctx = npa_event_context;
+	rvu_reporters->rvu_hw_npa_intr_reporter =
+		devlink_health_reporter_create(rvu_dl->dl, &rvu_hw_npa_intr_reporter_ops, 0, rvu);
+	if (IS_ERR(rvu_reporters->rvu_hw_npa_intr_reporter)) {
+		dev_warn(rvu->dev, "Failed to create hw_npa_intr reporter, err=%ld\n",
+			 PTR_ERR(rvu_reporters->rvu_hw_npa_intr_reporter));
+		return PTR_ERR(rvu_reporters->rvu_hw_npa_intr_reporter);
+	}
+
+	rvu_reporters->rvu_hw_npa_gen_reporter =
+		devlink_health_reporter_create(rvu_dl->dl, &rvu_hw_npa_gen_reporter_ops, 0, rvu);
+	if (IS_ERR(rvu_reporters->rvu_hw_npa_gen_reporter)) {
+		dev_warn(rvu->dev, "Failed to create hw_npa_gen reporter, err=%ld\n",
+			 PTR_ERR(rvu_reporters->rvu_hw_npa_gen_reporter));
+		return PTR_ERR(rvu_reporters->rvu_hw_npa_gen_reporter);
+	}
+
+	rvu_reporters->rvu_hw_npa_err_reporter =
+		devlink_health_reporter_create(rvu_dl->dl, &rvu_hw_npa_err_reporter_ops, 0, rvu);
+	if (IS_ERR(rvu_reporters->rvu_hw_npa_err_reporter)) {
+		dev_warn(rvu->dev, "Failed to create hw_npa_err reporter, err=%ld\n",
+			 PTR_ERR(rvu_reporters->rvu_hw_npa_err_reporter));
+		return PTR_ERR(rvu_reporters->rvu_hw_npa_err_reporter);
+	}
+
+	rvu_reporters->rvu_hw_npa_ras_reporter =
+		devlink_health_reporter_create(rvu_dl->dl, &rvu_hw_npa_ras_reporter_ops, 0, rvu);
+	if (IS_ERR(rvu_reporters->rvu_hw_npa_ras_reporter)) {
+		dev_warn(rvu->dev, "Failed to create hw_npa_ras reporter, err=%ld\n",
+			 PTR_ERR(rvu_reporters->rvu_hw_npa_ras_reporter));
+		return PTR_ERR(rvu_reporters->rvu_hw_npa_ras_reporter);
+	}
+
+	rvu_dl->devlink_wq = create_workqueue("rvu_devlink_wq");
+	if (!rvu_dl->devlink_wq)
+		goto err;
+
+	INIT_WORK(&rvu_reporters->intr_work, rvu_npa_intr_work);
+	INIT_WORK(&rvu_reporters->err_work, rvu_npa_err_work);
+	INIT_WORK(&rvu_reporters->gen_work, rvu_npa_gen_work);
+	INIT_WORK(&rvu_reporters->ras_work, rvu_npa_ras_work);
+
+	return 0;
+err:
+	rvu_npa_health_reporters_destroy(rvu_dl);
+	return -ENOMEM;
+}
+
+static int rvu_npa_health_reporters_create(struct rvu_devlink *rvu_dl)
+{
+	struct rvu *rvu = rvu_dl->rvu;
+	int err;
+
+	err = rvu_npa_register_reporters(rvu_dl);
+	if (err) {
+		dev_warn(rvu->dev, "Failed to create npa reporter, err =%d\n",
+			 err);
+		return err;
+	}
+	rvu_npa_register_interrupts(rvu);
+
+	return 0;
+}
+
+static void rvu_npa_health_reporters_destroy(struct rvu_devlink *rvu_dl)
+{
+	struct rvu_npa_health_reporters *npa_reporters;
+	struct rvu *rvu = rvu_dl->rvu;
+
+	npa_reporters = rvu_dl->rvu_npa_health_reporter;
+
+	if (!npa_reporters->rvu_hw_npa_ras_reporter)
+		return;
+	if (!IS_ERR_OR_NULL(npa_reporters->rvu_hw_npa_intr_reporter))
+		devlink_health_reporter_destroy(npa_reporters->rvu_hw_npa_intr_reporter);
+
+	if (!IS_ERR_OR_NULL(npa_reporters->rvu_hw_npa_gen_reporter))
+		devlink_health_reporter_destroy(npa_reporters->rvu_hw_npa_gen_reporter);
+
+	if (!IS_ERR_OR_NULL(npa_reporters->rvu_hw_npa_err_reporter))
+		devlink_health_reporter_destroy(npa_reporters->rvu_hw_npa_err_reporter);
+
+	if (!IS_ERR_OR_NULL(npa_reporters->rvu_hw_npa_ras_reporter))
+		devlink_health_reporter_destroy(npa_reporters->rvu_hw_npa_ras_reporter);
+
+	rvu_npa_unregister_interrupts(rvu);
+	kfree(rvu_dl->rvu_npa_health_reporter->npa_event_ctx);
+	kfree(rvu_dl->rvu_npa_health_reporter);
+}
+
+static int rvu_health_reporters_create(struct rvu *rvu)
+{
+	struct rvu_devlink *rvu_dl;
+
+	rvu_dl = rvu->rvu_dl;
+	return rvu_npa_health_reporters_create(rvu_dl);
+}
+
+static void rvu_health_reporters_destroy(struct rvu *rvu)
+{
+	struct rvu_devlink *rvu_dl;
+
+	if (!rvu->rvu_dl)
+		return;
+
+	rvu_dl = rvu->rvu_dl;
+	rvu_npa_health_reporters_destroy(rvu_dl);
+}
+
 static int rvu_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 				struct netlink_ext_ack *extack)
 {
@@ -47,7 +751,8 @@ int rvu_register_dl(struct rvu *rvu)
 	rvu_dl->dl = dl;
 	rvu_dl->rvu = rvu;
 	rvu->rvu_dl = rvu_dl;
-	return 0;
+
+	return rvu_health_reporters_create(rvu);
 }
 
 void rvu_unregister_dl(struct rvu *rvu)
@@ -58,6 +763,7 @@ void rvu_unregister_dl(struct rvu *rvu)
 	if (!dl)
 		return;
 
+	rvu_health_reporters_destroy(rvu);
 	devlink_unregister(dl);
 	devlink_free(dl);
 	kfree(rvu_dl);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h
index 1ed6dde79a4e..d7578fa92ac1 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h
@@ -8,9 +8,44 @@
 #ifndef RVU_DEVLINK_H
 #define  RVU_DEVLINK_H
 
+#define RVU_REPORTERS(_name)  \
+static const struct devlink_health_reporter_ops  rvu_ ## _name ## _reporter_ops =  { \
+	.name = #_name, \
+	.recover = rvu_ ## _name ## _recover, \
+	.dump = rvu_ ## _name ## _dump, \
+}
+
+enum npa_af_rvu_health {
+	NPA_AF_RVU_INTR,
+	NPA_AF_RVU_GEN,
+	NPA_AF_RVU_ERR,
+	NPA_AF_RVU_RAS,
+};
+
+struct rvu_npa_event_ctx {
+	u64 npa_af_rvu_int;
+	u64 npa_af_rvu_gen;
+	u64 npa_af_rvu_err;
+	u64 npa_af_rvu_ras;
+};
+
+struct rvu_npa_health_reporters {
+	struct rvu_npa_event_ctx *npa_event_ctx;
+	struct devlink_health_reporter *rvu_hw_npa_intr_reporter;
+	struct work_struct              intr_work;
+	struct devlink_health_reporter *rvu_hw_npa_gen_reporter;
+	struct work_struct              gen_work;
+	struct devlink_health_reporter *rvu_hw_npa_err_reporter;
+	struct work_struct              err_work;
+	struct devlink_health_reporter *rvu_hw_npa_ras_reporter;
+	struct work_struct              ras_work;
+};
+
 struct rvu_devlink {
 	struct devlink *dl;
 	struct rvu *rvu;
+	struct workqueue_struct *devlink_wq;
+	struct rvu_npa_health_reporters *rvu_npa_health_reporter;
 };
 
 /* Devlink APIs */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
index 723643868589..e2153d47c373 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
@@ -64,6 +64,16 @@ enum rvu_af_int_vec_e {
 	RVU_AF_INT_VEC_CNT    = 0x5,
 };
 
+/* NPA Admin function Interrupt Vector Enumeration */
+enum npa_af_int_vec_e {
+	NPA_AF_INT_VEC_RVU	= 0x0,
+	NPA_AF_INT_VEC_GEN	= 0x1,
+	NPA_AF_INT_VEC_AQ_DONE	= 0x2,
+	NPA_AF_INT_VEC_AF_ERR	= 0x3,
+	NPA_AF_INT_VEC_POISON	= 0x4,
+	NPA_AF_INT_VEC_CNT	= 0x5,
+};
+
 /**
  * RVU PF Interrupt Vector Enumeration
  */
@@ -104,6 +114,19 @@ enum npa_aq_instop {
 	NPA_AQ_INSTOP_UNLOCK = 0x5,
 };
 
+/* ALLOC/FREE input queues Enumeration from coprocessors */
+enum npa_inpq {
+	NPA_INPQ_NIX0_RX       = 0x0,
+	NPA_INPQ_NIX0_TX       = 0x1,
+	NPA_INPQ_NIX1_RX       = 0x2,
+	NPA_INPQ_NIX1_TX       = 0x3,
+	NPA_INPQ_SSO           = 0x4,
+	NPA_INPQ_TIM           = 0x5,
+	NPA_INPQ_DPI           = 0x6,
+	NPA_INPQ_AURA_OP       = 0xe,
+	NPA_INPQ_INTERNAL_RSV  = 0xf,
+};
+
 /* NPA admin queue instruction structure */
 struct npa_aq_inst_s {
 #if defined(__BIG_ENDIAN_BITFIELD)
-- 
2.25.1

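Note on the RVU_REPORTERS() helper added to rvu_devlink.h above: it relies on
preprocessor stringification (#) and token pasting (##) to stamp out one ops
table per reporter, so that RVU_REPORTERS(hw_npa_gen) yields a table named
rvu_hw_npa_gen_reporter_ops whose .name is "hw_npa_gen". The following is a
minimal user-space sketch of that idiom only. struct demo_reporter_ops and
every demo_* name are hypothetical stand-ins, not the kernel's
struct devlink_health_reporter_ops or the driver's callbacks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct devlink_health_reporter_ops. */
struct demo_reporter_ops {
	const char *name;
	int (*recover)(void *priv);
	int (*dump)(void *priv);
};

/* Same shape as RVU_REPORTERS(): "#" stringifies, "##" pastes tokens. */
#define DEMO_REPORTERS(_name) \
static const struct demo_reporter_ops demo_ ## _name ## _reporter_ops = { \
	.name = #_name, \
	.recover = demo_ ## _name ## _recover, \
	.dump = demo_ ## _name ## _dump, \
}

/* The callbacks must exist before the macro is expanded. */
static int demo_hw_npa_gen_recover(void *priv) { (void)priv; return 0; }
static int demo_hw_npa_gen_dump(void *priv) { (void)priv; return 1; }

/* Expands to: static const struct demo_reporter_ops
 * demo_hw_npa_gen_reporter_ops = { .name = "hw_npa_gen", ... };
 */
DEMO_REPORTERS(hw_npa_gen);

/* Usage: the generated table carries the stringified name and callbacks. */
void demo_macro_self_check(void)
{
	assert(strcmp(demo_hw_npa_gen_reporter_ops.name, "hw_npa_gen") == 0);
	assert(demo_hw_npa_gen_reporter_ops.recover(NULL) == 0);
	assert(demo_hw_npa_gen_reporter_ops.dump(NULL) == 1);
}
```

Because the reporter name is derived from the macro argument, the callback
names and the string shown by "devlink health" cannot drift apart.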

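Note on the IS_ERR()/PTR_ERR() checks in rvu_npa_register_reporters() above:
devlink_health_reporter_create() reports failure through an error pointer
rather than NULL, which is why each result is tested with IS_ERR(). Below is a
user-space sketch of that encoding, assuming the usual convention that the top
4095 addresses carry a negative errno. All demo_* names are hypothetical. The
sketch also unwinds earlier allocations when a later one fails; that unwinding
is a choice made for the sketch and is not a description of what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095
#define DEMO_NR 4

/* User-space imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_reporter { int id; };

static int demo_live;	/* reporters currently allocated */

/* Fails with an error pointer when id == fail_at (pass -1 to never fail). */
static struct demo_reporter *demo_reporter_create(int id, int fail_at)
{
	struct demo_reporter *r;

	if (id == fail_at)
		return demo_err_ptr(-ENOMEM);
	r = malloc(sizeof(*r));
	if (!r)
		return demo_err_ptr(-ENOMEM);
	r->id = id;
	demo_live++;
	return r;
}

static void demo_reporter_destroy(struct demo_reporter *r)
{
	free(r);
	demo_live--;
}

/* Create DEMO_NR reporters; on failure destroy the ones already created. */
static int demo_register_reporters(struct demo_reporter *rep[DEMO_NR],
				   int fail_at)
{
	int i;

	for (i = 0; i < DEMO_NR; i++) {
		rep[i] = demo_reporter_create(i, fail_at);
		if (demo_is_err(rep[i])) {
			long err = demo_ptr_err(rep[i]);

			rep[i] = NULL;
			while (--i >= 0) {	/* reverse order of creation */
				demo_reporter_destroy(rep[i]);
				rep[i] = NULL;
			}
			return (int)err;
		}
	}
	return 0;
}

/* Usage: a failure at index 2 leaves nothing allocated behind. */
void demo_unwind_self_check(void)
{
	struct demo_reporter *rep[DEMO_NR];
	int i;

	assert(demo_register_reporters(rep, 2) == -ENOMEM);
	assert(demo_live == 0);
	assert(demo_register_reporters(rep, -1) == 0);
	assert(demo_live == DEMO_NR);
	for (i = 0; i < DEMO_NR; i++)
		demo_reporter_destroy(rep[i]);
}
```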

* [PATCH 3/3] docs: octeontx2: Add Documentation for NPA health reporters
  2020-12-11  6:25 [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to George Cherian
  2020-12-11  6:25 ` [PATCHv6 net-next 1/3] octeontx2-af: Add devlink suppoort to af driver George Cherian
  2020-12-11  6:25 ` [PATCH 2/3] octeontx2-af: Add devlink health reporters for NPA George Cherian
@ 2020-12-11  6:25 ` George Cherian
  2020-12-15  2:00 ` [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: George Cherian @ 2020-12-11  6:25 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: kuba, davem, sgoutham, lcherian, gakula, george.cherian,
	willemdebruijn.kernel, saeed, jiri

Add Documentation for devlink health reporters for NPA block.

Signed-off-by: George Cherian <george.cherian@marvell.com>
---
 .../ethernet/marvell/octeontx2.rst            | 50 +++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
index 88f508338c5f..d3fcf536d14e 100644
--- a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
+++ b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
@@ -12,6 +12,7 @@ Contents
 - `Overview`_
 - `Drivers`_
 - `Basic packet flow`_
+- `Devlink health reporters`_
 
 Overview
 ========
@@ -157,3 +158,52 @@ Egress
 3. The SQ descriptor ring is maintained in buffers allocated from SQ mapped pool of NPA block LF.
 4. NIX block transmits the pkt on the designated channel.
 5. NPC MCAM entries can be installed to divert pkt onto a different channel.
+
+Devlink health reporters
+========================
+
+NPA Reporters
+-------------
+The NPA reporters are responsible for reporting and recovering the following groups of errors:
+1. GENERAL events
+   - Error due to operation of unmapped PF.
+   - Error due to disabled alloc/free for other HW blocks (NIX, SSO, TIM, DPI and AURA).
+2. ERROR events
+   - Fault due to NPA_AQ_INST_S read or NPA_AQ_RES_S write.
+   - AQ Doorbell Error.
+3. RAS events
+   - RAS Error Reporting for NPA_AQ_INST_S/NPA_AQ_RES_S.
+4. RVU events
+   - Error due to unmapped slot.
+
+Sample Output
+-------------
+~# devlink health
+pci/0002:01:00.0:
+  reporter hw_npa_intr
+      state healthy error 2872 recover 2872 last_dump_date 2020-12-10 last_dump_time 09:39:09 grace_period 0 auto_recover true auto_dump true
+  reporter hw_npa_gen
+      state healthy error 2872 recover 2872 last_dump_date 2020-12-11 last_dump_time 04:43:04 grace_period 0 auto_recover true auto_dump true
+  reporter hw_npa_err
+      state healthy error 2871 recover 2871 last_dump_date 2020-12-10 last_dump_time 09:39:17 grace_period 0 auto_recover true auto_dump true
+  reporter hw_npa_ras
+      state healthy error 0 recover 0 last_dump_date 2020-12-10 last_dump_time 09:32:40 grace_period 0 auto_recover true auto_dump true
+
+Each reporter dumps the following:
+ - Error Type
+ - Error Register value
+ - Reason in words
+
+For example:
+~# devlink health dump show  pci/0002:01:00.0 reporter hw_npa_gen
+ NPA_AF_GENERAL:
+         NPA General Interrupt Reg : 1
+         NIX0: free disabled RX
+~# devlink health dump show  pci/0002:01:00.0 reporter hw_npa_intr
+ NPA_AF_RVU:
+         NPA RVU Interrupt Reg : 1
+         Unmap Slot Error
+~# devlink health dump show  pci/0002:01:00.0 reporter hw_npa_err
+ NPA_AF_ERR:
+        NPA Error Interrupt Reg : 4096
+        AQ Doorbell Error
-- 
2.25.1

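Note on the sample dumps in the documentation above: each dump pairs a raw
register value with a reason in words, for example 4096 with "AQ Doorbell
Error" and 1 with "NIX0: free disabled RX". The following user-space sketch
shows how such a value could be turned back into a reason. It rests on two
assumptions taken only from those samples: that bit 12 (4096) of the error
register flags the AQ doorbell error, and that the low 16 bits of the general
register flag "free disabled" per input queue, indexed by enum npa_inpq from
patch 2/3. Only bit 0 of the latter is confirmed by the sample. The demo_*
names and label strings are hypothetical; the driver's dump callbacks remain
the authority on the real layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from enum npa_inpq in rvu_struct.h (patch 2/3). */
enum npa_inpq {
	NPA_INPQ_NIX0_RX       = 0x0,
	NPA_INPQ_NIX0_TX       = 0x1,
	NPA_INPQ_NIX1_RX       = 0x2,
	NPA_INPQ_NIX1_TX       = 0x3,
	NPA_INPQ_SSO           = 0x4,
	NPA_INPQ_TIM           = 0x5,
	NPA_INPQ_DPI           = 0x6,
	NPA_INPQ_AURA_OP       = 0xe,
	NPA_INPQ_INTERNAL_RSV  = 0xf,
};

/* ASSUMPTION from the sample dump: 4096 == bit 12 == AQ Doorbell Error. */
#define DEMO_ERR_AQ_DOORBELL (UINT64_C(1) << 12)

/* Hypothetical labels; not the strings the driver prints. */
static const char *demo_inpq_label(unsigned int q)
{
	switch (q) {
	case NPA_INPQ_NIX0_RX:      return "NIX0 RX";
	case NPA_INPQ_NIX0_TX:      return "NIX0 TX";
	case NPA_INPQ_NIX1_RX:      return "NIX1 RX";
	case NPA_INPQ_NIX1_TX:      return "NIX1 TX";
	case NPA_INPQ_SSO:          return "SSO";
	case NPA_INPQ_TIM:          return "TIM";
	case NPA_INPQ_DPI:          return "DPI";
	case NPA_INPQ_AURA_OP:      return "AURA";
	case NPA_INPQ_INTERNAL_RSV: return "internal/reserved";
	default:                    return NULL;	/* gap in the enum */
	}
}

/* ASSUMPTION: bits 15:0 are per-queue "free disabled" flags.
 * Returns the label of the lowest set flag, or NULL if none is known.
 */
static const char *demo_first_free_disabled(uint64_t gen_reg)
{
	unsigned int q;

	for (q = 0; q < 16; q++)
		if (gen_reg & (UINT64_C(1) << q))
			return demo_inpq_label(q);
	return NULL;
}

static int demo_is_aq_doorbell_err(uint64_t err_reg)
{
	return (err_reg & DEMO_ERR_AQ_DOORBELL) != 0;
}

/* Usage: decode the two register values shown in the sample output. */
void demo_decode_self_check(void)
{
	assert(strcmp(demo_first_free_disabled(1), "NIX0 RX") == 0);
	assert(demo_first_free_disabled(0) == NULL);
	assert(demo_is_aq_doorbell_err(4096));
	assert(!demo_is_aq_doorbell_err(1));
}
```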


* Re: [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to
  2020-12-11  6:25 [PATCHv6 net-next 0/3] Add devlink and devlink health reporters to George Cherian
                   ` (2 preceding siblings ...)
  2020-12-11  6:25 ` [PATCH 3/3] docs: octeontx2: Add Documentation for NPA health reporters George Cherian
@ 2020-12-15  2:00 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2020-12-15  2:00 UTC (permalink / raw)
  To: George Cherian
  Cc: netdev, linux-kernel, kuba, davem, sgoutham, lcherian, gakula,
	willemdebruijn.kernel, saeed, jiri

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Fri, 11 Dec 2020 11:55:23 +0530 you wrote:
> Add basic devlink and devlink health reporters.
> Devlink health reporters are added for NPA block.
> 
> Address Jakub's comment to add devlink support for error reporting.
> https://www.spinics.net/lists/netdev/msg670712.html
> 
> For now, I have dropped the NIX block health reporters.
> This series attempts to add health reporters only for the NPA block.
> As per Jakub's suggestion separate reporters per event is used and also
> got rid of the counters.
> 
> [...]

Here is the summary with links:
  - [PATCHv6,net-next,1/3] octeontx2-af: Add devlink suppoort to af driver
    https://git.kernel.org/netdev/net-next/c/fae06da4f261
  - [2/3] octeontx2-af: Add devlink health reporters for NPA
    https://git.kernel.org/netdev/net-next/c/f1168d1e207c
  - [3/3] docs: octeontx2: Add Documentation for NPA health reporters
    https://git.kernel.org/netdev/net-next/c/80b9414832a1

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



