* [PATCH 0/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
@ 2015-05-29 22:04 ` Loc Ho
  0 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-05-29 22:04 UTC (permalink / raw)
  To: dougthompson-aS9lmoZGLiVWk0Htik3J/w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	mchehab-JPH+aEBZ4P+UEJcrhfAQsw, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg
  Cc: linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	jcm-H+wXaHxf7aLQT0dZR+AlfA, patches-qTEPVZfXA3Y, Loc Ho

v1:
* Add L3/SoC support to the APM X-Gene SoC EDAC driver

---
Loc Ho (3):
  Documentation: Update the APM X-Gene SoC EDAC DTS binding for L3/SoC
    subnodes
  edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  arm64: Add L3/SoC DT subnodes to the APM X-Gene SoC EDAC node

 .../devicetree/bindings/edac/apm-xgene-edac.txt    |   18 +
 arch/arm64/boot/dts/apm/apm-storm.dtsi             |   10 +
 drivers/edac/xgene_edac.c                          |  746 +++++++++++++++++++-
 3 files changed, 773 insertions(+), 1 deletions(-)


* [PATCH 1/3] Documentation: Update the APM X-Gene SoC EDAC DTS binding for L3/SoC subnodes
  2015-05-29 22:04 ` Loc Ho
@ 2015-05-29 22:04     ` Loc Ho
  -1 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-05-29 22:04 UTC (permalink / raw)
  To: dougthompson-aS9lmoZGLiVWk0Htik3J/w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	mchehab-JPH+aEBZ4P+UEJcrhfAQsw, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg
  Cc: linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	jcm-H+wXaHxf7aLQT0dZR+AlfA, patches-qTEPVZfXA3Y, Loc Ho

This patch updates documentation for the APM X-Gene SoC EDAC DTS binding
for L3/SoC subnodes.

Signed-off-by: Loc Ho <lho@apm.com>
---
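Note on the `reg` values in the example below: each entry has four cells, which matches a parent node with `#address-cells = <2>` and `#size-cells = <2>`. That is an assumption here, since the parent's cell counts are not shown in this hunk. Under that assumption, a minimal userspace C sketch of how the four cells combine into a 64-bit base and size could look like this. The names `dt_cells_to_u64`, `struct reg_range` and `decode_reg` are hypothetical and exist only for illustration; they are not kernel or libfdt APIs.

```c
#include <stdint.h>

/* Hypothetical helper: join two 32-bit devicetree cells (high, low). */
static uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical container for one decoded "reg" entry. */
struct reg_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Decode one <addr-hi addr-lo size-hi size-lo> entry.
 * Assumes two address cells and two size cells in the parent node.
 */
static struct reg_range decode_reg(const uint32_t cells[4])
{
	struct reg_range r;

	r.base = dt_cells_to_u64(cells[0], cells[1]);
	r.size = dt_cells_to_u64(cells[2], cells[3]);
	return r;
}
```

For the L3 entry `<0x0 0x7e600000 0x0 0x1000>` this gives base 0x7e600000 and size 0x1000, that is, a single 4 KiB register page.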
 .../devicetree/bindings/edac/apm-xgene-edac.txt    |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/edac/apm-xgene-edac.txt b/Documentation/devicetree/bindings/edac/apm-xgene-edac.txt
index 480911c..0aa4bd3 100644
--- a/Documentation/devicetree/bindings/edac/apm-xgene-edac.txt
+++ b/Documentation/devicetree/bindings/edac/apm-xgene-edac.txt
@@ -29,6 +29,14 @@ Required properties for PMD subnode:
 - reg			: First resource shall be the PMD resource.
 - pmd-controller	: Instance number of the PMD controller.
 
+Required properties for L3 subnode:
+- compatible		: Shall be "apm,xgene-edac-l3".
+- reg			: First resource shall be the L3 resource.
+
+Required properties for SoC subnode:
+- compatible		: Shall be "apm,xgene-edac-soc".
+- reg			: First resource shall be the SoC resource.
+
 Example:
 	csw: csw@7e200000 {
 		compatible = "apm,xgene-csw", "syscon";
@@ -75,4 +83,14 @@ Example:
 			reg = <0x0 0x7c000000 0x0 0x200000>;
 			pmd-controller = <0>;
 		};
+
+		edacl3@7e600000 {
+			compatible = "apm,xgene-edac-l3";
+			reg = <0x0 0x7e600000 0x0 0x1000>;
+		};
+
+		edacsoc@7e930000 {
+			compatible = "apm,xgene-edac-soc";
+			reg = <0x0 0x7e930000 0x0 0x1000>;
+		};
 	};
-- 
1.7.1


* [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-05-29 22:04     ` Loc Ho
@ 2015-05-29 22:04         ` Loc Ho
  -1 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-05-29 22:04 UTC (permalink / raw)
  To: dougthompson-aS9lmoZGLiVWk0Htik3J/w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	mchehab-JPH+aEBZ4P+UEJcrhfAQsw, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg
  Cc: linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	jcm-H+wXaHxf7aLQT0dZR+AlfA, patches-qTEPVZfXA3Y, Loc Ho

This patch adds EDAC support for the L3 and SoC components.

Signed-off-by: Loc Ho <lho@apm.com>
---
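Note on the L3C_ELR layout: the field extraction and the address split used by xgene_edac_l3_check() can be tried outside the kernel. The sketch below copies the L3C_ELR macros from this patch (with the argument parenthesized and unsigned constants) and adds a hypothetical helper, `l3c_error_addr()`, that builds the physical address as described in the driver comment: bits [41:38] from L3C_ELR and bits [37:6] from L3C_AELR, with the low 6 bits zero. The helper is not part of the driver; it is only an illustration under those stated assumptions.

```c
#include <stdint.h>

/* Field extractors, following the definitions in this patch. */
#define L3C_ELR_ERRSYN(src)	(((src) & 0xFF800000u) >> 23)
#define L3C_ELR_ERRWAY(src)	(((src) & 0x007E0000u) >> 17)
#define L3C_ELR_AGENTID(src)	(((src) & 0x0001E000u) >> 13)
#define L3C_ELR_ERRGRP(src)	(((src) & 0x00000F00u) >> 8)
#define L3C_ELR_OPTYPE(src)	(((src) & 0x000000F0u) >> 4)
#define L3C_ELR_PADDRHIGH(src)	((src) & 0x0000000Fu)

/*
 * Hypothetical helper: rebuild the error address.
 * Address [41:38] comes from L3C_ELR, address [37:6] from L3C_AELR.
 */
static uint64_t l3c_error_addr(uint32_t elr, uint32_t aelr)
{
	uint64_t addr;

	addr = (uint64_t)L3C_ELR_PADDRHIGH(elr) << 38;
	addr |= (uint64_t)aelr << 6;
	return addr;
}
```

With elr = 0x3 and aelr = 0x80000001 the helper returns 0xE000000040, which agrees with the two 32-bit words the driver prints for the same inputs (0x000000E0.00000040).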
 drivers/edac/xgene_edac.c |  746 ++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 745 insertions(+), 1 deletions(-)

diff --git a/drivers/edac/xgene_edac.c b/drivers/edac/xgene_edac.c
index b515857..934f6c5 100644
--- a/drivers/edac/xgene_edac.c
+++ b/drivers/edac/xgene_edac.c
@@ -66,6 +66,8 @@ struct xgene_edac {
 
 	struct list_head	mcus;
 	struct list_head	pmds;
+	struct list_head	l3s;
+	struct list_head	socs;
 
 	struct mutex		mc_lock;
 	int			mc_active_mask;
@@ -1057,10 +1059,727 @@ static int xgene_edac_pmd_remove(struct xgene_edac_pmd_ctx *pmd)
 	return 0;
 }
 
+/* L3 Error device */
+#define L3C_ESR				(0x0A * 4)
+#define  L3C_ESR_DATATAG_MASK		0x00000200
+#define  L3C_ESR_MULTIHIT_MASK		0x00000100
+#define  L3C_ESR_UCEVICT_MASK		0x00000040
+#define  L3C_ESR_MULTIUCERR_MASK	0x00000020
+#define  L3C_ESR_MULTICERR_MASK		0x00000010
+#define  L3C_ESR_UCERR_MASK		0x00000008
+#define  L3C_ESR_CERR_MASK		0x00000004
+#define  L3C_ESR_UCERRINTR_MASK		0x00000002
+#define  L3C_ESR_CERRINTR_MASK		0x00000001
+#define L3C_ECR				(0x0B * 4)
+#define  L3C_ECR_UCINTREN		0x00000008
+#define  L3C_ECR_CINTREN		0x00000004
+#define  L3C_UCERREN			0x00000002
+#define  L3C_CERREN			0x00000001
+#define L3C_ELR				(0x0C * 4)
+#define  L3C_ELR_ERRSYN(src)		((src & 0xFF800000) >> 23)
+#define  L3C_ELR_ERRWAY(src)		((src & 0x007E0000) >> 17)
+#define  L3C_ELR_AGENTID(src)		((src & 0x0001E000) >> 13)
+#define  L3C_ELR_ERRGRP(src)		((src & 0x00000F00) >> 8)
+#define  L3C_ELR_OPTYPE(src)		((src & 0x000000F0) >> 4)
+#define  L3C_ELR_PADDRHIGH(src)		(src & 0x0000000F)
+#define L3C_AELR			(0x0D * 4)
+#define L3C_BELR			(0x0E * 4)
+#define  L3C_BELR_BANK(src)		(src & 0x0000000F)
+
+struct xgene_edac_dev_ctx {
+	struct list_head	next;
+	struct device		ddev;
+	char			*name;
+	struct xgene_edac	*edac;
+	struct edac_device_ctl_info *edac_dev;
+	int			edac_idx;
+	void __iomem		*dev_csr;
+};
+
+static void xgene_edac_l3_check(struct edac_device_ctl_info *edac_dev)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	u32 l3cesr;
+	u32 l3celr;
+	u32 l3caelr;
+	u32 l3cbelr;
+
+	l3cesr = readl(ctx->dev_csr + L3C_ESR);
+	if (!(l3cesr & (L3C_ESR_UCERR_MASK | L3C_ESR_CERR_MASK)))
+		return;
+
+	if (l3cesr & L3C_ESR_UCERR_MASK)
+		dev_err(edac_dev->dev, "L3C uncorrectable error\n");
+	if (l3cesr & L3C_ESR_CERR_MASK)
+		dev_warn(edac_dev->dev, "L3C correctable error\n");
+
+	l3celr = readl(ctx->dev_csr + L3C_ELR);
+	l3caelr = readl(ctx->dev_csr + L3C_AELR);
+	l3cbelr = readl(ctx->dev_csr + L3C_BELR);
+	if (l3cesr & L3C_ESR_MULTIHIT_MASK)
+		dev_err(edac_dev->dev, "L3C multiple hit error\n");
+	if (l3cesr & L3C_ESR_UCEVICT_MASK)
+		dev_err(edac_dev->dev,
+			"L3C dropped eviction of line with error\n");
+	if (l3cesr & L3C_ESR_MULTIUCERR_MASK)
+		dev_err(edac_dev->dev, "L3C multiple uncorrectable error\n");
+	if (l3cesr & L3C_ESR_DATATAG_MASK)
+		dev_err(edac_dev->dev,
+			"L3C data error syndrome 0x%X group 0x%X\n",
+			L3C_ELR_ERRSYN(l3celr), L3C_ELR_ERRGRP(l3celr));
+	else
+		dev_err(edac_dev->dev,
+			"L3C tag error syndrome 0x%X Way of Tag 0x%X Agent ID 0x%X Operation type 0x%X\n",
+			L3C_ELR_ERRSYN(l3celr), L3C_ELR_ERRWAY(l3celr),
+			L3C_ELR_AGENTID(l3celr), L3C_ELR_OPTYPE(l3celr));
+	/*
+	 * NOTE: Address [41:38] in L3C_ELR_PADDRHIGH(l3celr).
+	 *       Address [37:6] in l3caelr. Lower 6 bits are zero.
+	 */
+	dev_err(edac_dev->dev, "L3C error address 0x%08X.%08X bank %d\n",
+		L3C_ELR_PADDRHIGH(l3celr) << 6 | (l3caelr >> 26),
+		(l3caelr & 0x3FFFFFFF) << 6, L3C_BELR_BANK(l3cbelr));
+	dev_err(edac_dev->dev,
+		"L3C error status register value 0x%X\n", l3cesr);
+
+	/* Clear L3C error interrupt */
+	writel(0, ctx->dev_csr + L3C_ESR);
+
+	if (l3cesr & L3C_ESR_CERR_MASK)
+		edac_device_handle_ce(edac_dev, 0, 0, edac_dev->ctl_name);
+	if (l3cesr & L3C_ESR_UCERR_MASK)
+		edac_device_handle_ue(edac_dev, 0, 0, edac_dev->ctl_name);
+}
+
+static void xgene_edac_l3_hw_init(struct edac_device_ctl_info *edac_dev,
+				  bool enable)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	u32 val;
+
+	val = readl(ctx->dev_csr + L3C_ECR);
+	val |= L3C_UCERREN | L3C_CERREN;
+	/* On disable, we just disable interrupt but keep error enabled */
+	if (edac_dev->op_state == OP_RUNNING_INTERRUPT) {
+		if (enable)
+			val |= L3C_ECR_UCINTREN | L3C_ECR_CINTREN;
+		else
+			val &= ~(L3C_ECR_UCINTREN | L3C_ECR_CINTREN);
+	}
+	writel(val, ctx->dev_csr + L3C_ECR);
+
+	if (edac_dev->op_state == OP_RUNNING_INTERRUPT) {
+		/* Enable/disable L3 error top level interrupt */
+		if (enable) {
+			xgene_edac_pcp_clrbits(ctx->edac, PCPHPERRINTMSK,
+					       L3C_UNCORR_ERR_MASK);
+			xgene_edac_pcp_clrbits(ctx->edac, PCPLPERRINTMSK,
+					       L3C_CORR_ERR_MASK);
+		} else {
+			xgene_edac_pcp_setbits(ctx->edac, PCPHPERRINTMSK,
+					       L3C_UNCORR_ERR_MASK);
+			xgene_edac_pcp_setbits(ctx->edac, PCPLPERRINTMSK,
+					       L3C_CORR_ERR_MASK);
+		}
+	}
+}
+
+static ssize_t xgene_edac_l3_inject_ctrl_show(
+	struct edac_device_ctl_info *edac_dev, char *data)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+
+	return sprintf(data, "0x%08X", readl(ctx->dev_csr + L3C_ESR));
+}
+
+static ssize_t xgene_edac_l3_inject_ctrl_store(
+	struct edac_device_ctl_info *edac_dev, const char *data, size_t count)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	u32 val;
+
+	if (kstrtou32(data, 0, &val))
+		return -EINVAL;
+	writel(val, ctx->dev_csr + L3C_ESR);
+	return count;
+}
+
+static struct edac_dev_sysfs_attribute xgene_edac_l3_sysfs_attributes[] = {
+	{ .attr = {
+		  .name = "inject_ctrl",
+		  .mode = (S_IRUGO | S_IWUSR)
+	  },
+	 .show = xgene_edac_l3_inject_ctrl_show,
+	 .store = xgene_edac_l3_inject_ctrl_store },
+
+	/* End of list */
+	{ .attr = {.name = NULL } }
+};
+
+static int xgene_edac_l3_add(struct xgene_edac *edac, struct device_node *np)
+{
+	struct edac_device_ctl_info *edac_dev;
+	struct xgene_edac_dev_ctx *ctx;
+	struct resource res;
+	int edac_idx;
+	int rc = 0;
+
+	if (!devres_open_group(edac->dev, xgene_edac_l3_add, GFP_KERNEL))
+		return -ENOMEM;
+
+	edac_idx = edac_device_alloc_index();
+	edac_dev = edac_device_alloc_ctl_info(sizeof(*ctx),
+					      "l3c", 1, "l3c", 1, 0, NULL, 0,
+					      edac_idx);
+	if (!edac_dev) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	ctx = edac_dev->pvt_info;
+	ctx->name = "xgene_l3_err";
+	ctx->edac_idx = edac_idx;
+	ctx->edac = edac;
+	ctx->edac_dev = edac_dev;
+	ctx->ddev = *edac->dev;
+	edac_dev->dev = &ctx->ddev;
+	edac_dev->ctl_name = ctx->name;
+	edac_dev->dev_name = ctx->name;
+	edac_dev->mod_name = EDAC_MOD_STR;
+
+	rc = of_address_to_resource(np, 0, &res);
+	if (rc < 0) {
+		dev_err(edac->dev, "no L3 resource address\n");
+		goto err1;
+	}
+	ctx->dev_csr = devm_ioremap_resource(edac->dev, &res);
+	if (IS_ERR(ctx->dev_csr)) {
+		dev_err(edac->dev,
+			"devm_ioremap_resource failed for L3 resource address\n");
+		rc = PTR_ERR(ctx->dev_csr);
+		goto err1;
+	}
+
+	if (edac_op_state == EDAC_OPSTATE_POLL)
+		edac_dev->edac_check = xgene_edac_l3_check;
+
+	edac_dev->sysfs_attributes = xgene_edac_l3_sysfs_attributes;
+
+	rc = edac_device_add_device(edac_dev);
+	if (rc > 0) {
+		dev_err(edac->dev, "failed edac_device_add_device()\n");
+		rc = -ENOMEM;
+		goto err1;
+	}
+
+	if (edac_op_state == EDAC_OPSTATE_INT)
+		edac_dev->op_state = OP_RUNNING_INTERRUPT;
+
+	list_add(&ctx->next, &edac->l3s);
+
+	xgene_edac_l3_hw_init(edac_dev, 1);
+
+	devres_remove_group(edac->dev, xgene_edac_l3_add);
+
+	dev_info(edac->dev, "X-Gene EDAC L3 registered\n");
+	return 0;
+
+err1:
+	edac_device_free_ctl_info(edac_dev);
+err:
+	devres_release_group(edac->dev, xgene_edac_l3_add);
+	return rc;
+}
+
+static int xgene_edac_l3_remove(struct xgene_edac_dev_ctx *l3)
+{
+	struct edac_device_ctl_info *edac_dev = l3->edac_dev;
+
+	xgene_edac_l3_hw_init(edac_dev, 0);
+	edac_device_del_device(l3->edac->dev);
+	edac_device_free_ctl_info(edac_dev);
+	return 0;
+}
+
+/* SoC Error device */
+#define IOBAXIS0TRANSERRINTSTS		0x0000
+#define  IOBAXIS0_M_ILLEGAL_ACCESS_MASK	0x00000002
+#define  IOBAXIS0_ILLEGAL_ACCESS_MASK	0x00000001
+#define IOBAXIS0TRANSERRINTMSK		0x0004
+#define IOBAXIS0TRANSERRREQINFOL	0x0008
+#define IOBAXIS0TRANSERRREQINFOH	0x000c
+#define  REQTYPE_RD(src)		(((src) & 0x00000001))
+#define  ERRADDRH_RD(src)		(((src) & 0xffc00000) >> 22)
+#define IOBAXIS1TRANSERRINTSTS		0x0010
+#define IOBAXIS1TRANSERRINTMSK		0x0014
+#define IOBAXIS1TRANSERRREQINFOL	0x0018
+#define IOBAXIS1TRANSERRREQINFOH	0x001c
+#define IOBPATRANSERRINTSTS		0x0020
+#define  IOBPA_M_REQIDRAM_CORRUPT_MASK	0x00000080
+#define  IOBPA_REQIDRAM_CORRUPT_MASK	0x00000040
+#define  IOBPA_M_TRANS_CORRUPT_MASK	0x00000020
+#define  IOBPA_TRANS_CORRUPT_MASK	0x00000010
+#define  IOBPA_M_WDATA_CORRUPT_MASK	0x00000008
+#define  IOBPA_WDATA_CORRUPT_MASK	0x00000004
+#define  IOBPA_M_RDATA_CORRUPT_MASK	0x00000002
+#define  IOBPA_RDATA_CORRUPT_MASK	0x00000001
+#define IOBBATRANSERRINTSTS		0x0030
+#define  M_ILLEGAL_ACCESS_MASK		0x00008000
+#define  ILLEGAL_ACCESS_MASK		0x00004000
+#define  M_WIDRAM_CORRUPT_MASK		0x00002000
+#define  WIDRAM_CORRUPT_MASK		0x00001000
+#define  M_RIDRAM_CORRUPT_MASK		0x00000800
+#define  RIDRAM_CORRUPT_MASK		0x00000400
+#define  M_TRANS_CORRUPT_MASK		0x00000200
+#define  TRANS_CORRUPT_MASK		0x00000100
+#define  M_WDATA_CORRUPT_MASK		0x00000080
+#define  WDATA_CORRUPT_MASK		0x00000040
+#define  M_RBM_POISONED_REQ_MASK	0x00000020
+#define  RBM_POISONED_REQ_MASK		0x00000010
+#define  M_XGIC_POISONED_REQ_MASK	0x00000008
+#define  XGIC_POISONED_REQ_MASK		0x00000004
+#define  M_WRERR_RESP_MASK		0x00000002
+#define  WRERR_RESP_MASK		0x00000001
+#define IOBBATRANSERRREQINFOL		0x0038
+#define IOBBATRANSERRREQINFOH		0x003c
+#define  REQTYPE_F2_RD(src)		(((src) & 0x00000001))
+#define  ERRADDRH_F2_RD(src)		(((src) & 0xffc00000) >> 22)
+#define IOBBATRANSERRCSWREQID		0x0040
+#define XGICTRANSERRINTSTS		0x0050
+#define  M_WR_ACCESS_ERR_MASK		0x00000008
+#define  WR_ACCESS_ERR_MASK		0x00000004
+#define  M_RD_ACCESS_ERR_MASK		0x00000002
+#define  RD_ACCESS_ERR_MASK		0x00000001
+#define XGICTRANSERRINTMSK		0x0054
+#define XGICTRANSERRREQINFO		0x0058
+#define  REQTYPE_MASK			0x04000000
+#define  ERRADDR_RD(src)		((src) & 0x03ffffff)
+#define GLBL_ERR_STS			0x0800
+#define  MDED_ERR_MASK			0x00000008
+#define  DED_ERR_MASK			0x00000004
+#define  MSEC_ERR_MASK			0x00000002
+#define  SEC_ERR_MASK			0x00000001
+#define GLBL_SEC_ERRL			0x0810
+#define GLBL_SEC_ERRH			0x0818
+#define GLBL_MSEC_ERRL			0x0820
+#define GLBL_MSEC_ERRH			0x0828
+#define GLBL_DED_ERRL			0x0830
+#define GLBL_DED_ERRLMASK		0x0834
+#define GLBL_DED_ERRH			0x0838
+#define GLBL_DED_ERRHMASK		0x083c
+#define GLBL_MDED_ERRL			0x0840
+#define GLBL_MDED_ERRLMASK		0x0844
+#define GLBL_MDED_ERRH			0x0848
+#define GLBL_MDED_ERRHMASK		0x084c
+
+static void xgene_edac_iob_gic_report(struct edac_device_ctl_info *edac_dev)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	u32 err_addr_lo;
+	u32 err_addr_hi;
+	u32 reg;
+	u32 info;
+
+	/* GIC transaction error interrupt */
+	reg = readl(ctx->dev_csr + XGICTRANSERRINTSTS);
+	if (reg) {
+		dev_err(edac_dev->dev, "XGIC transaction error\n");
+		if (reg & RD_ACCESS_ERR_MASK)
+			dev_err(edac_dev->dev, "XGIC read size error\n");
+		if (reg & M_RD_ACCESS_ERR_MASK)
+			dev_err(edac_dev->dev,
+				"Multiple XGIC read size error\n");
+		if (reg & WR_ACCESS_ERR_MASK)
+			dev_err(edac_dev->dev, "XGIC write size error\n");
+		if (reg & M_WR_ACCESS_ERR_MASK)
+			dev_err(edac_dev->dev,
+				"Multiple XGIC write size error\n");
+		info = readl(ctx->dev_csr + XGICTRANSERRREQINFO);
+		dev_err(edac_dev->dev, "XGIC %s access @ 0x%08X (0x%08X)\n",
+			info & REQTYPE_MASK ? "read" : "write",
+			ERRADDR_RD(info), info);
+		writel(reg, ctx->dev_csr + XGICTRANSERRINTSTS);
+	}
+
+	/* IOB memory error */
+	reg = readl(ctx->dev_csr + GLBL_ERR_STS);
+	if (reg) {
+		if (reg & SEC_ERR_MASK) {
+			err_addr_lo = readl(ctx->dev_csr + GLBL_SEC_ERRL);
+			err_addr_hi = readl(ctx->dev_csr + GLBL_SEC_ERRH);
+			dev_err(edac_dev->dev,
+				"IOB single-bit correctable memory at 0x%08X.%08X error\n",
+				err_addr_lo, err_addr_hi);
+			writel(err_addr_lo, ctx->dev_csr + GLBL_SEC_ERRL);
+			writel(err_addr_hi, ctx->dev_csr + GLBL_SEC_ERRH);
+		}
+		if (reg & MSEC_ERR_MASK) {
+			err_addr_lo = readl(ctx->dev_csr + GLBL_MSEC_ERRL);
+			err_addr_hi = readl(ctx->dev_csr + GLBL_MSEC_ERRH);
+			dev_err(edac_dev->dev,
+				"IOB multiple single-bit correctable memory at 0x%08X.%08X error\n",
+				err_addr_lo, err_addr_hi);
+			writel(err_addr_lo, ctx->dev_csr + GLBL_MSEC_ERRL);
+			writel(err_addr_hi, ctx->dev_csr + GLBL_MSEC_ERRH);
+		}
+		if (reg & (SEC_ERR_MASK | MSEC_ERR_MASK))
+			edac_device_handle_ce(edac_dev, 0, 0,
+					      edac_dev->ctl_name);
+
+		if (reg & DED_ERR_MASK) {
+			err_addr_lo = readl(ctx->dev_csr + GLBL_DED_ERRL);
+			err_addr_hi = readl(ctx->dev_csr + GLBL_DED_ERRH);
+			dev_err(edac_dev->dev,
+				"IOB double-bit uncorrectable memory at 0x%08X.%08X error\n",
+				err_addr_lo, err_addr_hi);
+			writel(err_addr_lo, ctx->dev_csr + GLBL_DED_ERRL);
+			writel(err_addr_hi, ctx->dev_csr + GLBL_DED_ERRH);
+		}
+		if (reg & MDED_ERR_MASK) {
+			err_addr_lo = readl(ctx->dev_csr + GLBL_MDED_ERRL);
+			err_addr_hi = readl(ctx->dev_csr + GLBL_MDED_ERRH);
+			dev_err(edac_dev->dev,
+				"Multiple IOB double-bit uncorrectable memory at 0x%08X.%08X error\n",
+				err_addr_lo, err_addr_hi);
+			writel(err_addr_lo, ctx->dev_csr + GLBL_MDED_ERRL);
+			writel(err_addr_hi, ctx->dev_csr + GLBL_MDED_ERRH);
+		}
+		if (reg & (DED_ERR_MASK | MDED_ERR_MASK))
+			edac_device_handle_ue(edac_dev, 0, 0,
+					      edac_dev->ctl_name);
+	}
+}
+
+static void xgene_edac_rb_report(struct edac_device_ctl_info *edac_dev)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	u32 err_addr_lo;
+	u32 err_addr_hi;
+	u32 reg;
+
+	/* IOB Bridge agent transaction error interrupt */
+	reg = readl(ctx->dev_csr + IOBBATRANSERRINTSTS);
+	if (!reg)
+		return;
+
+	dev_err(edac_dev->dev, "IOB bridge agent (BA) transaction error\n");
+	if (reg & WRERR_RESP_MASK)
+		dev_err(edac_dev->dev, "IOB BA write response error\n");
+	if (reg & M_WRERR_RESP_MASK)
+		dev_err(edac_dev->dev,
+			"Multiple IOB BA write response error\n");
+	if (reg & XGIC_POISONED_REQ_MASK)
+		dev_err(edac_dev->dev, "IOB BA XGIC poisoned write error\n");
+	if (reg & M_XGIC_POISONED_REQ_MASK)
+		dev_err(edac_dev->dev,
+			"Multiple IOB BA XGIC poisoned write error\n");
+	if (reg & RBM_POISONED_REQ_MASK)
+		dev_err(edac_dev->dev, "IOB BA RBM poisoned write error\n");
+	if (reg & M_RBM_POISONED_REQ_MASK)
+		dev_err(edac_dev->dev,
+			"Multiple IOB BA RBM poisoned write error\n");
+	if (reg & WDATA_CORRUPT_MASK)
+		dev_err(edac_dev->dev, "IOB BA write error\n");
+	if (reg & M_WDATA_CORRUPT_MASK)
+		dev_err(edac_dev->dev, "Multiple IOB BA write error\n");
+	if (reg & TRANS_CORRUPT_MASK)
+		dev_err(edac_dev->dev, "IOB BA transaction error\n");
+	if (reg & M_TRANS_CORRUPT_MASK)
+		dev_err(edac_dev->dev, "Multiple IOB BA transaction error\n");
+	if (reg & RIDRAM_CORRUPT_MASK)
+		dev_err(edac_dev->dev,
+			"IOB BA RDIDRAM read transaction ID error\n");
+	if (reg & M_RIDRAM_CORRUPT_MASK)
+		dev_err(edac_dev->dev,
+			"Multiple IOB BA RDIDRAM read transaction ID error\n");
+	if (reg & WIDRAM_CORRUPT_MASK)
+		dev_err(edac_dev->dev,
+			"IOB BA RDIDRAM write transaction ID error\n");
+	if (reg & M_WIDRAM_CORRUPT_MASK)
+		dev_err(edac_dev->dev,
+			"Multiple IOB BA RDIDRAM write transaction ID error\n");
+	if (reg & ILLEGAL_ACCESS_MASK)
+		dev_err(edac_dev->dev,
+			"IOB BA XGIC/RB illegal access error\n");
+	if (reg & M_ILLEGAL_ACCESS_MASK)
+		dev_err(edac_dev->dev,
+			"Multiple IOB BA XGIC/RB illegal access error\n");
+
+	err_addr_lo = readl(ctx->dev_csr + IOBBATRANSERRREQINFOL);
+	err_addr_hi = readl(ctx->dev_csr + IOBBATRANSERRREQINFOH);
+	dev_err(edac_dev->dev, "IOB BA %s access at 0x%02X.%08X (0x%08X)\n",
+		REQTYPE_F2_RD(err_addr_hi) ? "read" : "write",
+		ERRADDRH_F2_RD(err_addr_hi), err_addr_lo, err_addr_hi);
+	if (reg & WRERR_RESP_MASK)
+		dev_err(edac_dev->dev, "IOB BA requestor ID 0x%08X\n",
+			readl(ctx->dev_csr + IOBBATRANSERRCSWREQID));
+	writel(reg, ctx->dev_csr + IOBBATRANSERRINTSTS);
+}
+
+static void xgene_edac_pa_report(struct edac_device_ctl_info *edac_dev)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	u32 err_addr_lo;
+	u32 err_addr_hi;
+	u32 reg;
+
+	/* IOB Processing agent transaction error interrupt */
+	reg = readl(ctx->dev_csr + IOBPATRANSERRINTSTS);
+	if (reg) {
+		dev_err(edac_dev->dev,
+			"IOB processing agent (PA) transaction error\n");
+		if (reg & IOBPA_RDATA_CORRUPT_MASK)
+			dev_err(edac_dev->dev, "IOB PA read data RAM error\n");
+		if (reg & IOBPA_M_RDATA_CORRUPT_MASK)
+			dev_err(edac_dev->dev,
+				"Multiple IOB PA read data RAM error\n");
+		if (reg & IOBPA_WDATA_CORRUPT_MASK)
+			dev_err(edac_dev->dev,
+				"IOB PA write data RAM error\n");
+		if (reg & IOBPA_M_WDATA_CORRUPT_MASK)
+			dev_err(edac_dev->dev,
+				"Multiple IOB PA write data RAM error\n");
+		if (reg & IOBPA_TRANS_CORRUPT_MASK)
+			dev_err(edac_dev->dev, "IOB PA transaction error\n");
+		if (reg & IOBPA_M_TRANS_CORRUPT_MASK)
+			dev_err(edac_dev->dev,
+				"Multiple IOB PA transaction error\n");
+		if (reg & IOBPA_REQIDRAM_CORRUPT_MASK)
+			dev_err(edac_dev->dev,
+				"IOB PA transaction ID RAM error\n");
+		if (reg & IOBPA_M_REQIDRAM_CORRUPT_MASK)
+			dev_err(edac_dev->dev,
+				"Multiple IOB PA transaction ID RAM error\n");
+		writel(reg, ctx->dev_csr + IOBPATRANSERRINTSTS);
+	}
+
+	/* IOB AXI0 Error */
+	reg = readl(ctx->dev_csr + IOBAXIS0TRANSERRINTSTS);
+	if (reg) {
+		err_addr_lo = readl(ctx->dev_csr + IOBAXIS0TRANSERRREQINFOL);
+		err_addr_hi = readl(ctx->dev_csr + IOBAXIS0TRANSERRREQINFOH);
+		dev_err(edac_dev->dev,
+			"%sAXI slave 0 illegal %s access @ 0x%02X.%08X (0x%08X)\n",
+			reg & IOBAXIS0_M_ILLEGAL_ACCESS_MASK ? "Multiple " : "",
+			REQTYPE_RD(err_addr_hi) ? "read" : "write",
+			ERRADDRH_RD(err_addr_hi), err_addr_lo, err_addr_hi);
+		writel(reg, ctx->dev_csr + IOBAXIS0TRANSERRINTSTS);
+	}
+
+	/* IOB AXI1 Error */
+	reg = readl(ctx->dev_csr + IOBAXIS1TRANSERRINTSTS);
+	if (reg) {
+		err_addr_lo = readl(ctx->dev_csr + IOBAXIS1TRANSERRREQINFOL);
+		err_addr_hi = readl(ctx->dev_csr + IOBAXIS1TRANSERRREQINFOH);
+		dev_err(edac_dev->dev,
+			"%sAXI slave 1 illegal %s access @ 0x%02X.%08X (0x%08X)\n",
+			reg & IOBAXIS0_M_ILLEGAL_ACCESS_MASK ? "Multiple " : "",
+			REQTYPE_RD(err_addr_hi) ? "read" : "write",
+			ERRADDRH_RD(err_addr_hi), err_addr_lo, err_addr_hi);
+		writel(reg, ctx->dev_csr + IOBAXIS1TRANSERRINTSTS);
+	}
+}
+
+static void xgene_edac_soc_check(struct edac_device_ctl_info *edac_dev)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+	static const char * const mem_err_ip[] = {
+		"10GbE0",
+		"10GbE1",
+		"Security",
+		"SATA45",
+		"SATA23/ETH23",
+		"SATA01/ETH01",
+		"USB1",
+		"USB0",
+		"QML",
+		"QM0",
+		"QM1 (XGbE01)",
+		"PCIE4",
+		"PCIE3",
+		"PCIE2",
+		"PCIE1",
+		"PCIE0",
+		"CTX Manager",
+		"OCM",
+		"1GbE",
+		"CLE",
+		"AHBC",
+		"PktDMA",
+		"GFC",
+		"MSLIM",
+		"10GbE2",
+		"10GbE3",
+		"QM2 (XGbE23)",
+		"IOB",
+		"unknown",
+		"unknown",
+		"unknown",
+		"unknown",
+	};
+	u32 pcp_hp_stat;
+	u32 pcp_lp_stat;
+	u32 reg;
+	int i;
+
+	xgene_edac_pcp_rd(ctx->edac, PCPHPERRINTSTS, &pcp_hp_stat);
+	xgene_edac_pcp_rd(ctx->edac, PCPLPERRINTSTS, &pcp_lp_stat);
+	xgene_edac_pcp_rd(ctx->edac, MEMERRINTSTS, &reg);
+	if (!((pcp_hp_stat & (IOB_PA_ERR_MASK | IOB_BA_ERR_MASK |
+			     IOB_XGIC_ERR_MASK | IOB_RB_ERR_MASK)) ||
+	      (pcp_lp_stat & CSW_SWITCH_TRACE_ERR_MASK) || reg))
+		return;
+
+	if (pcp_hp_stat & IOB_XGIC_ERR_MASK)
+		xgene_edac_iob_gic_report(edac_dev);
+
+	if (pcp_hp_stat & (IOB_RB_ERR_MASK | IOB_BA_ERR_MASK))
+		xgene_edac_rb_report(edac_dev);
+
+	if (pcp_hp_stat & IOB_PA_ERR_MASK)
+		xgene_edac_pa_report(edac_dev);
+
+	if (pcp_lp_stat & CSW_SWITCH_TRACE_ERR_MASK) {
+		dev_info(edac_dev->dev,
+			 "CSW switch trace correctable memory parity error\n");
+		edac_device_handle_ce(edac_dev, 0, 0, edac_dev->ctl_name);
+	}
+
+	for (i = 0; i < 31; i++) {
+		if (reg & (1 << i)) {
+			dev_err(edac_dev->dev, "%s memory parity error\n",
+				mem_err_ip[i]);
+			edac_device_handle_ue(edac_dev, 0, 0,
+					      edac_dev->ctl_name);
+		}
+	}
+}
+
+static void xgene_edac_soc_hw_init(struct edac_device_ctl_info *edac_dev,
+				   bool enable)
+{
+	struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
+
+	/* Enable SoC IP error interrupt */
+	if (edac_dev->op_state == OP_RUNNING_INTERRUPT) {
+		if (enable) {
+			xgene_edac_pcp_clrbits(ctx->edac, PCPHPERRINTMSK,
+					       IOB_PA_ERR_MASK |
+					       IOB_BA_ERR_MASK |
+					       IOB_XGIC_ERR_MASK |
+					       IOB_RB_ERR_MASK);
+			xgene_edac_pcp_clrbits(ctx->edac, PCPLPERRINTMSK,
+					       CSW_SWITCH_TRACE_ERR_MASK);
+		} else {
+			xgene_edac_pcp_setbits(ctx->edac, PCPHPERRINTMSK,
+					       IOB_PA_ERR_MASK |
+					       IOB_BA_ERR_MASK |
+					       IOB_XGIC_ERR_MASK |
+					       IOB_RB_ERR_MASK);
+			xgene_edac_pcp_setbits(ctx->edac, PCPLPERRINTMSK,
+					       CSW_SWITCH_TRACE_ERR_MASK);
+		}
+
+		writel(enable ? 0x0 : 0xFFFFFFFF,
+		       ctx->dev_csr + IOBAXIS0TRANSERRINTMSK);
+		writel(enable ? 0x0 : 0xFFFFFFFF,
+		       ctx->dev_csr + IOBAXIS1TRANSERRINTMSK);
+		writel(enable ? 0x0 : 0xFFFFFFFF,
+		       ctx->dev_csr + XGICTRANSERRINTMSK);
+
+		xgene_edac_pcp_setbits(ctx->edac, MEMERRINTMSK,
+				       enable ? 0x0 : 0xFFFFFFFF);
+	}
+}
+
+static int xgene_edac_soc_add(struct xgene_edac *edac, struct device_node *np)
+{
+	struct edac_device_ctl_info *edac_dev;
+	struct xgene_edac_dev_ctx *ctx;
+	struct resource res;
+	int edac_idx;
+	int rc = 0;
+
+	if (!devres_open_group(edac->dev, xgene_edac_soc_add, GFP_KERNEL))
+		return -ENOMEM;
+
+	edac_idx = edac_device_alloc_index();
+	edac_dev = edac_device_alloc_ctl_info(sizeof(*ctx),
+					      "SOC", 1, "SOC", 1, 2, NULL, 0,
+					      edac_idx);
+	if (!edac_dev) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	ctx = edac_dev->pvt_info;
+	ctx->name = "xgene_soc_err";
+	ctx->edac_idx = edac_idx;
+	ctx->edac = edac;
+	ctx->edac_dev = edac_dev;
+	ctx->ddev = *edac->dev;
+	edac_dev->dev = &ctx->ddev;
+	edac_dev->ctl_name = ctx->name;
+	edac_dev->dev_name = ctx->name;
+	edac_dev->mod_name = EDAC_MOD_STR;
+
+	rc = of_address_to_resource(np, 0, &res);
+	if (rc < 0) {
+		dev_err(edac->dev, "no SoC resource address\n");
+		goto err1;
+	}
+	ctx->dev_csr = devm_ioremap_resource(edac->dev, &res);
+	if (IS_ERR(ctx->dev_csr)) {
+		dev_err(edac->dev,
+			"devm_ioremap_resource failed for soc resource address\n");
+		rc = PTR_ERR(ctx->dev_csr);
+		goto err1;
+	}
+
+	if (edac_op_state == EDAC_OPSTATE_POLL)
+		edac_dev->edac_check = xgene_edac_soc_check;
+
+	rc = edac_device_add_device(edac_dev);
+	if (rc > 0) {
+		dev_err(edac->dev, "failed edac_device_add_device()\n");
+		rc = -ENOMEM;
+		goto err1;
+	}
+
+	if (edac_op_state == EDAC_OPSTATE_INT)
+		edac_dev->op_state = OP_RUNNING_INTERRUPT;
+
+	list_add(&ctx->next, &edac->socs);
+
+	xgene_edac_soc_hw_init(edac_dev, 1);
+
+	devres_remove_group(edac->dev, xgene_edac_soc_add);
+
+	dev_info(edac->dev, "X-Gene EDAC SoC registered\n");
+	return 0;
+
+err1:
+	edac_device_free_ctl_info(edac_dev);
+err:
+	devres_release_group(edac->dev, xgene_edac_soc_add);
+	return rc;
+}
+
+static int xgene_edac_soc_remove(struct xgene_edac_dev_ctx *soc)
+{
+	struct edac_device_ctl_info *edac_dev = soc->edac_dev;
+
+	xgene_edac_soc_hw_init(edac_dev, 0);
+	edac_device_del_device(soc->edac->dev);
+	edac_device_free_ctl_info(edac_dev);
+	return 0;
+}
+
 static irqreturn_t xgene_edac_isr(int irq, void *dev_id)
 {
 	struct xgene_edac *ctx = dev_id;
 	struct xgene_edac_pmd_ctx *pmd;
+	struct xgene_edac_dev_ctx *node;
 	unsigned int pcp_hp_stat;
 	unsigned int pcp_lp_stat;
 
@@ -1081,6 +1800,14 @@ static irqreturn_t xgene_edac_isr(int irq, void *dev_id)
 			xgene_edac_pmd_check(pmd->edac_dev);
 	}
 
+	list_for_each_entry(node, &ctx->l3s, next) {
+		xgene_edac_l3_check(node->edac_dev);
+	}
+
+	list_for_each_entry(node, &ctx->socs, next) {
+		xgene_edac_soc_check(node->edac_dev);
+	}
+
 	return IRQ_HANDLED;
 }
 
@@ -1099,6 +1826,8 @@ static int xgene_edac_probe(struct platform_device *pdev)
 	platform_set_drvdata(pdev, edac);
 	INIT_LIST_HEAD(&edac->mcus);
 	INIT_LIST_HEAD(&edac->pmds);
+	INIT_LIST_HEAD(&edac->l3s);
+	INIT_LIST_HEAD(&edac->socs);
 	spin_lock_init(&edac->lock);
 	mutex_init(&edac->mc_lock);
 
@@ -1168,8 +1897,12 @@ static int xgene_edac_probe(struct platform_device *pdev)
 			continue;
 		if (of_device_is_compatible(child, "apm,xgene-edac-mc"))
 			xgene_edac_mc_add(edac, child);
-		if (of_device_is_compatible(child, "apm,xgene-edac-pmd"))
+		else if (of_device_is_compatible(child, "apm,xgene-edac-pmd"))
 			xgene_edac_pmd_add(edac, child);
+		else if (of_device_is_compatible(child, "apm,xgene-edac-l3"))
+			xgene_edac_l3_add(edac, child);
+		else if (of_device_is_compatible(child, "apm,xgene-edac-soc"))
+			xgene_edac_soc_add(edac, child);
 	}
 
 	return 0;
@@ -1185,6 +1918,8 @@ static int xgene_edac_remove(struct platform_device *pdev)
 	struct xgene_edac_mc_ctx *temp_mcu;
 	struct xgene_edac_pmd_ctx *pmd;
 	struct xgene_edac_pmd_ctx *temp_pmd;
+	struct xgene_edac_dev_ctx *node;
+	struct xgene_edac_dev_ctx *temp_node;
 
 	list_for_each_entry_safe(mcu, temp_mcu, &edac->mcus, next) {
 		xgene_edac_mc_remove(mcu);
@@ -1193,6 +1928,15 @@ static int xgene_edac_remove(struct platform_device *pdev)
 	list_for_each_entry_safe(pmd, temp_pmd, &edac->pmds, next) {
 		xgene_edac_pmd_remove(pmd);
 	}
+
+	list_for_each_entry_safe(node, temp_node, &edac->l3s, next) {
+		xgene_edac_l3_remove(node);
+	}
+
+	list_for_each_entry_safe(node, temp_node, &edac->socs, next) {
+		xgene_edac_soc_remove(node);
+	}
+
 	return 0;
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3] arm64: Add L3/SoC DT subnodes to the APM X-Gene SoC EDAC node
  2015-05-29 22:04         ` Loc Ho
@ 2015-05-29 22:04             ` Loc Ho
  -1 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-05-29 22:04 UTC (permalink / raw)
  To: dougthompson-aS9lmoZGLiVWk0Htik3J/w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	mchehab-JPH+aEBZ4P+UEJcrhfAQsw, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg
  Cc: linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	jcm-H+wXaHxf7aLQT0dZR+AlfA, patches-qTEPVZfXA3Y, Loc Ho

This patch adds L3/SoC DT subnodes to the APM X-Gene SoC EDAC node.

Signed-off-by: Loc Ho <lho-qTEPVZfXA3Y@public.gmane.org>
---
 arch/arm64/boot/dts/apm/apm-storm.dtsi |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
index 577799f..9ebc0d4 100644
--- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
@@ -455,6 +455,16 @@
 				reg = <0x0 0x7c600000 0x0 0x200000>;
 				pmd-controller = <3>;
 			};
+
+			edacl3@7e600000 {
+				compatible = "apm,xgene-edac-l3";
+				reg = <0x0 0x7e600000 0x0 0x1000>;
+			};
+
+			edacsoc@7e930000 {
+				compatible = "apm,xgene-edac-soc";
+				reg = <0x0 0x7e930000 0x0 0x1000>;
+			};
 		};
 
 		pcie0: pcie@1f2b0000 {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-05-29 22:04         ` Loc Ho
@ 2015-06-01 11:43             ` Borislav Petkov
  -1 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-06-01 11:43 UTC (permalink / raw)
  To: Loc Ho
  Cc: dougthompson-aS9lmoZGLiVWk0Htik3J/w,
	mchehab-JPH+aEBZ4P+UEJcrhfAQsw, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	jcm-H+wXaHxf7aLQT0dZR+AlfA, patches-qTEPVZfXA3Y

On Fri, May 29, 2015 at 04:04:34PM -0600, Loc Ho wrote:
> This patch adds EDAC support for the L3 and SoC components.

So what was the reason for that split now? None, AFAICT.

> +/* L3 Error device */
> +#define L3C_ESR				(0x0A * 4)
> +#define  L3C_ESR_DATATAG_MASK		0x00000200
> +#define  L3C_ESR_MULTIHIT_MASK		0x00000100
> +#define  L3C_ESR_UCEVICT_MASK		0x00000040
> +#define  L3C_ESR_MULTIUCERR_MASK	0x00000020
> +#define  L3C_ESR_MULTICERR_MASK		0x00000010
> +#define  L3C_ESR_UCERR_MASK		0x00000008
> +#define  L3C_ESR_CERR_MASK		0x00000004
> +#define  L3C_ESR_UCERRINTR_MASK		0x00000002
> +#define  L3C_ESR_CERRINTR_MASK		0x00000001
> +#define L3C_ECR				(0x0B * 4)
> +#define  L3C_ECR_UCINTREN		0x00000008
> +#define  L3C_ECR_CINTREN		0x00000004
> +#define  L3C_UCERREN			0x00000002
> +#define  L3C_CERREN			0x00000001
> +#define L3C_ELR				(0x0C * 4)
> +#define  L3C_ELR_ERRSYN(src)		((src & 0xFF800000) >> 23)
> +#define  L3C_ELR_ERRWAY(src)		((src & 0x007E0000) >> 17)
> +#define  L3C_ELR_AGENTID(src)		((src & 0x0001E000) >> 13)
> +#define  L3C_ELR_ERRGRP(src)		((src & 0x00000F00) >> 8)
> +#define  L3C_ELR_OPTYPE(src)		((src & 0x000000F0) >> 4)
> +#define  L3C_ELR_PADDRHIGH(src)		(src & 0x0000000F)
> +#define L3C_AELR			(0x0D * 4)
> +#define L3C_BELR			(0x0E * 4)
> +#define  L3C_BELR_BANK(src)		(src & 0x0000000F)

Use BIT() for all those single bits, as before.
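
For illustration, a minimal userspace sketch of what the BIT() conversion
could look like for the L3C_ESR flags. The local BIT() definition is a
stand-in for the kernel helper, and l3c_has_uncorrectable() is a
hypothetical helper that is not part of the patch; the bit positions are
derived directly from the hex masks quoted above.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() helper (assumption: in the
 * driver this would come from the kernel headers instead). */
#ifndef BIT
#define BIT(nr)			(1UL << (nr))
#endif

/* Single-bit L3C_ESR flags, same values as the hex masks in the patch */
#define L3C_ESR_DATATAG_MASK	BIT(9)	/* 0x00000200 */
#define L3C_ESR_MULTIHIT_MASK	BIT(8)	/* 0x00000100 */
#define L3C_ESR_UCEVICT_MASK	BIT(6)	/* 0x00000040 */
#define L3C_ESR_MULTIUCERR_MASK	BIT(5)	/* 0x00000020 */
#define L3C_ESR_MULTICERR_MASK	BIT(4)	/* 0x00000010 */
#define L3C_ESR_UCERR_MASK	BIT(3)	/* 0x00000008 */
#define L3C_ESR_CERR_MASK	BIT(2)	/* 0x00000004 */
#define L3C_ESR_UCERRINTR_MASK	BIT(1)	/* 0x00000002 */
#define L3C_ESR_CERRINTR_MASK	BIT(0)	/* 0x00000001 */

/* Multi-bit fields stay mask-and-shift; the argument is parenthesized */
#define L3C_ELR_ERRWAY(src)	(((src) & 0x007E0000) >> 17)

/* Hypothetical helper: true if any uncorrectable-error flag is set */
static inline int l3c_has_uncorrectable(uint32_t esr)
{
	return (esr & (L3C_ESR_UCERR_MASK | L3C_ESR_MULTIUCERR_MASK)) != 0;
}
```

Only the single-bit flags change; the field extractors keep their masks
because they cover more than one bit.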

...

> +static struct edac_dev_sysfs_attribute xgene_edac_l3_sysfs_attributes[] = {
> +	{ .attr = {
> +		  .name = "inject_ctrl",
> +		  .mode = (S_IRUGO | S_IWUSR)
> +	  },
> +	 .show = xgene_edac_l3_inject_ctrl_show,
> +	 .store = xgene_edac_l3_inject_ctrl_store },
> +
> +	/* End of list */
> +	{ .attr = {.name = NULL } }
> +};

Why are those sysfs nodes? Didn't we say that inject nodes should be in
debugfs?

So I'm going to stop looking here and wait for you to do the same
changes to L3 and SOC as for the rest of the driver. Go through what
just went upstream and do the same changes to that patch instead of
blindly resending the original version.

It is not a competition who gets their stuff upstream first, ok?!

> +static int xgene_edac_l3_add(struct xgene_edac *edac, struct device_node *np)
> +{
> +	struct edac_device_ctl_info *edac_dev;
> +	struct xgene_edac_dev_ctx *ctx;
> +	struct resource res;
> +	int edac_idx;
> +	int rc = 0;
> +
> +	if (!devres_open_group(edac->dev, xgene_edac_l3_add, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	edac_idx = edac_device_alloc_index();
> +	edac_dev = edac_device_alloc_ctl_info(sizeof(*ctx),
> +					      "l3c", 1, "l3c", 1, 0, NULL, 0,
> +					      edac_idx);
> +	if (!edac_dev) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	ctx = edac_dev->pvt_info;
> +	ctx->name = "xgene_l3_err";
> +	ctx->edac_idx = edac_idx;
> +	ctx->edac = edac;
> +	ctx->edac_dev = edac_dev;
> +	ctx->ddev = *edac->dev;
> +	edac_dev->dev = &ctx->ddev;
> +	edac_dev->ctl_name = ctx->name;
> +	edac_dev->dev_name = ctx->name;
> +	edac_dev->mod_name = EDAC_MOD_STR;

As before, do the allocation and preparation of edac_dev only after ...

> +
> +	rc = of_address_to_resource(np, 0, &res);
> +	if (rc < 0) {
> +		dev_err(edac->dev, "no L3 resource address\n");
> +		goto err1;
> +	}
> +	ctx->dev_csr = devm_ioremap_resource(edac->dev, &res);
> +	if (IS_ERR(ctx->dev_csr)) {
> +		dev_err(edac->dev,
> +			"devm_ioremap_resource failed for L3 resource address\n");
> +		rc = PTR_ERR(ctx->dev_csr);
> +		goto err1;
> +	}

... those above have succeeded.
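
The ordering asked for here can be sketched in plain userspace C. This is
a toy model, not driver code: map_registers() is a hypothetical stand-in
for the of_address_to_resource() and devm_ioremap_resource() steps, and
struct ctl_info stands in for the EDAC control structure. The point it
shows is that when the fallible lookup runs first, an early failure leaves
nothing to free.

```c
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for the EDAC control structure */
struct ctl_info {
	void *csr;
};

static int allocations;	/* number of ctl_info objects allocated so far */

/* Hypothetical stand-in for the resource lookup and ioremap steps */
static void *map_registers(int have_resource)
{
	static char fake_regs[0x1000];

	return have_resource ? fake_regs : NULL;
}

static int l3_add(int have_resource, struct ctl_info **out)
{
	struct ctl_info *ci;
	void *csr;

	/* 1. Do the steps that can fail without leaving state behind */
	csr = map_registers(have_resource);
	if (!csr)
		return -ENODEV;	/* nothing allocated yet, nothing to unwind */

	/* 2. Only now allocate and fill in the control structure */
	ci = calloc(1, sizeof(*ci));
	if (!ci)
		return -ENOMEM;
	allocations++;

	ci->csr = csr;
	*out = ci;
	return 0;
}
```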

> +
> +	if (edac_op_state == EDAC_OPSTATE_POLL)
> +		edac_dev->edac_check = xgene_edac_l3_check;
> +
> +	edac_dev->sysfs_attributes = xgene_edac_l3_sysfs_attributes;
> +
> +	rc = edac_device_add_device(edac_dev);
> +	if (rc > 0) {
> +		dev_err(edac->dev, "failed edac_device_add_device()\n");
> +		rc = -ENOMEM;
> +		goto err1;
> +	}
> +
> +	if (edac_op_state == EDAC_OPSTATE_INT)
> +		edac_dev->op_state = OP_RUNNING_INTERRUPT;
> +
> +	list_add(&ctx->next, &edac->l3s);
> +
> +	xgene_edac_l3_hw_init(edac_dev, 1);

Shouldn't you init the hw regs *before* you add it to the list of l3s
and *not* after?

> +
> +	devres_remove_group(edac->dev, xgene_edac_l3_add);
> +
> +	dev_info(edac->dev, "X-Gene EDAC L3 registered\n");
> +	return 0;
> +
> +err1:
> +	edac_device_free_ctl_info(edac_dev);
> +err:
> +	devres_release_group(edac->dev, xgene_edac_l3_add);
> +	return rc;
> +}
> +
> +static int xgene_edac_l3_remove(struct xgene_edac_dev_ctx *l3)
> +{
> +	struct edac_device_ctl_info *edac_dev = l3->edac_dev;
> +
> +	xgene_edac_l3_hw_init(edac_dev, 0);
> +	edac_device_del_device(l3->edac->dev);
> +	edac_device_free_ctl_info(edac_dev);
> +	return 0;
> +}
> +
> +/* SoC Error device */
> +#define IOBAXIS0TRANSERRINTSTS		0x0000
> +#define  IOBAXIS0_M_ILLEGAL_ACCESS_MASK	0x00000002
> +#define  IOBAXIS0_ILLEGAL_ACCESS_MASK	0x00000001
> +#define IOBAXIS0TRANSERRINTMSK		0x0004
> +#define IOBAXIS0TRANSERRREQINFOL	0x0008
> +#define IOBAXIS0TRANSERRREQINFOH	0x000c
> +#define  REQTYPE_RD(src)		(((src) & 0x00000001))
> +#define  ERRADDRH_RD(src)		(((src) & 0xffc00000) >> 22)
> +#define IOBAXIS1TRANSERRINTSTS		0x0010
> +#define IOBAXIS1TRANSERRINTMSK		0x0014
> +#define IOBAXIS1TRANSERRREQINFOL	0x0018
> +#define IOBAXIS1TRANSERRREQINFOH	0x001c
> +#define IOBPATRANSERRINTSTS		0x0020
> +#define  IOBPA_M_REQIDRAM_CORRUPT_MASK	0x00000080
> +#define  IOBPA_REQIDRAM_CORRUPT_MASK	0x00000040
> +#define  IOBPA_M_TRANS_CORRUPT_MASK	0x00000020
> +#define  IOBPA_TRANS_CORRUPT_MASK	0x00000010
> +#define  IOBPA_M_WDATA_CORRUPT_MASK	0x00000008
> +#define  IOBPA_WDATA_CORRUPT_MASK	0x00000004
> +#define  IOBPA_M_RDATA_CORRUPT_MASK	0x00000002
> +#define  IOBPA_RDATA_CORRUPT_MASK	0x00000001
> +#define IOBBATRANSERRINTSTS		0x0030
> +#define  M_ILLEGAL_ACCESS_MASK		0x00008000
> +#define  ILLEGAL_ACCESS_MASK		0x00004000
> +#define  M_WIDRAM_CORRUPT_MASK		0x00002000
> +#define  WIDRAM_CORRUPT_MASK		0x00001000
> +#define  M_RIDRAM_CORRUPT_MASK		0x00000800
> +#define  RIDRAM_CORRUPT_MASK		0x00000400
> +#define  M_TRANS_CORRUPT_MASK		0x00000200
> +#define  TRANS_CORRUPT_MASK		0x00000100
> +#define  M_WDATA_CORRUPT_MASK		0x00000080
> +#define  WDATA_CORRUPT_MASK		0x00000040
> +#define  M_RBM_POISONED_REQ_MASK	0x00000020
> +#define  RBM_POISONED_REQ_MASK		0x00000010
> +#define  M_XGIC_POISONED_REQ_MASK	0x00000008
> +#define  XGIC_POISONED_REQ_MASK		0x00000004
> +#define  M_WRERR_RESP_MASK		0x00000002
> +#define  WRERR_RESP_MASK		0x00000001
> +#define IOBBATRANSERRREQINFOL		0x0038
> +#define IOBBATRANSERRREQINFOH		0x003c
> +#define  REQTYPE_F2_RD(src)		(((src) & 0x00000001))
> +#define  ERRADDRH_F2_RD(src)		(((src) & 0xffc00000) >> 22)
> +#define IOBBATRANSERRCSWREQID		0x0040
> +#define XGICTRANSERRINTSTS		0x0050
> +#define  M_WR_ACCESS_ERR_MASK		0x00000008
> +#define  WR_ACCESS_ERR_MASK		0x00000004
> +#define  M_RD_ACCESS_ERR_MASK		0x00000002
> +#define  RD_ACCESS_ERR_MASK		0x00000001
> +#define XGICTRANSERRINTMSK		0x0054
> +#define XGICTRANSERRREQINFO		0x0058
> +#define  REQTYPE_MASK			0x04000000
> +#define  ERRADDR_RD(src)		((src) & 0x03ffffff)
> +#define GLBL_ERR_STS			0x0800
> +#define  MDED_ERR_MASK			0x00000008
> +#define  DED_ERR_MASK			0x00000004
> +#define  MSEC_ERR_MASK			0x00000002
> +#define  SEC_ERR_MASK			0x00000001
> +#define GLBL_SEC_ERRL			0x0810
> +#define GLBL_SEC_ERRH			0x0818
> +#define GLBL_MSEC_ERRL			0x0820
> +#define GLBL_MSEC_ERRH			0x0828
> +#define GLBL_DED_ERRL			0x0830
> +#define GLBL_DED_ERRLMASK		0x0834
> +#define GLBL_DED_ERRH			0x0838
> +#define GLBL_DED_ERRHMASK		0x083c
> +#define GLBL_MDED_ERRL			0x0840
> +#define GLBL_MDED_ERRLMASK		0x0844
> +#define GLBL_MDED_ERRH			0x0848
> +#define GLBL_MDED_ERRHMASK		0x084c

Use BIT() for all those single bits, as before.

And so on and so on...

Please take the time and do it right.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-05-29 22:04         ` Loc Ho
@ 2015-06-01 14:39             ` Arnd Bergmann
  -1 siblings, 0 replies; 24+ messages in thread
From: Arnd Bergmann @ 2015-06-01 14:39 UTC (permalink / raw)
  To: Loc Ho
  Cc: dougthompson-aS9lmoZGLiVWk0Htik3J/w, bp-Gina5bIWoIWzQB+pC5nmwQ,
	mchehab-JPH+aEBZ4P+UEJcrhfAQsw, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	jcm-H+wXaHxf7aLQT0dZR+AlfA, patches-qTEPVZfXA3Y

On Friday 29 May 2015 16:04:34 Loc Ho wrote:
> +static void xgene_edac_soc_check(struct edac_device_ctl_info *edac_dev)
> +{
> +       struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
> +       static const char * const mem_err_ip[] = {
> +               "10GbE0",
> +               "10GbE1",
> +               "Security",
> +               "SATA45",
> +               "SATA23/ETH23",
> +               "SATA01/ETH01",
> +               "USB1",
> +               "USB0",
> +               "QML",
> +               "QM0",
> +               "QM1 (XGbE01)",
> +               "PCIE4",

This list seems a little too hardware specific, I'd assume that the numbers
get a different meaning with the xgene2.
	Arnd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] Documentation: Update the APM X-Gene SoC EDAC DTS binding for L3/SoC subnodes
  2015-05-29 22:04     ` Loc Ho
@ 2015-06-01 14:39         ` Arnd Bergmann
  -1 siblings, 0 replies; 24+ messages in thread
From: Arnd Bergmann @ 2015-06-01 14:39 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: Loc Ho, dougthompson-aS9lmoZGLiVWk0Htik3J/w,
	bp-Gina5bIWoIWzQB+pC5nmwQ, mchehab-JPH+aEBZ4P+UEJcrhfAQsw,
	robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	devicetree-u79uwXL29TY76Z2rM5mHXA, jcm-H+wXaHxf7aLQT0dZR+AlfA,
	patches-qTEPVZfXA3Y, linux-edac-u79uwXL29TY76Z2rM5mHXA

On Friday 29 May 2015 16:04:33 Loc Ho wrote:
> This patch updates documentation for the APM X-Gene SoC EDAC DTS binding
> for L3/SoC subnodes.
> 
> Signed-off-by: Loc Ho <lho-qTEPVZfXA3Y@public.gmane.org>
> 

Acked-by: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>

The binding seems fine.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-06-01 11:43             ` Borislav Petkov
@ 2015-06-01 17:26                 ` Loc Ho
  -1 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-06-01 17:26 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Doug Thompson, Mauro Carvalho Chehab, Rob Herring, Mark Rutland,
	Ian Campbell, linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Jon Masters,
	patches-qTEPVZfXA3Y

Hi,

>> +static struct edac_dev_sysfs_attribute xgene_edac_l3_sysfs_attributes[] = {
>> +     { .attr = {
>> +               .name = "inject_ctrl",
>> +               .mode = (S_IRUGO | S_IWUSR)
>> +       },
>> +      .show = xgene_edac_l3_inject_ctrl_show,
>> +      .store = xgene_edac_l3_inject_ctrl_store },
>> +
>> +     /* End of list */
>> +     { .attr = {.name = NULL } }
>> +};
>
> Why are those sysfs nodes? Didn't we say that inject nodes should be in
> debugfs?
>
> So I'm going to stop looking here and wait for you to do the same
> changes to L3 and SOC as for the rest of the driver. Go through what
> just went upstream and do the same changes to that patch instead of
> blindly resending the original version.
>
> It is not a competition who gets their stuff upstream first, ok?!

Sorry about this; I was a bit careless here.

>
>> +
>> +     if (edac_op_state == EDAC_OPSTATE_POLL)
>> +             edac_dev->edac_check = xgene_edac_l3_check;
>> +
>> +     edac_dev->sysfs_attributes = xgene_edac_l3_sysfs_attributes;
>> +
>> +     rc = edac_device_add_device(edac_dev);
>> +     if (rc > 0) {
>> +             dev_err(edac->dev, "failed edac_device_add_device()\n");
>> +             rc = -ENOMEM;
>> +             goto err1;
>> +     }
>> +
>> +     if (edac_op_state == EDAC_OPSTATE_INT)
>> +             edac_dev->op_state = OP_RUNNING_INTERRUPT;
>> +
>> +     list_add(&ctx->next, &edac->l3s);
>> +
>> +     xgene_edac_l3_hw_init(edac_dev, 1);
>
> Shouldn't you init the hw regs *before* you add it to the list of l3s
> and *not* after?

I want to make sure that the node is added in case there is a pending
interrupt. Otherwise, it will not get cleared and will continuously
generate interrupts.
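
A toy model of that reasoning, assuming the interrupt handler reaches
each L3 node only by walking the list (edac->l3s in the patch). All names
below are hypothetical and nothing here touches real hardware; it only
shows that a pending flag on a node that is not yet on the list is never
acknowledged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy L3 node: irq_pending models an error latched in hardware */
struct l3_node {
	bool irq_pending;
	struct l3_node *next;
};

static struct l3_node *l3_list;	/* stand-in for the list of L3 nodes */

/* Toy handler: it can only acknowledge nodes reachable via the list */
static int toy_isr(void)
{
	struct l3_node *n;
	int handled = 0;

	for (n = l3_list; n; n = n->next) {
		if (n->irq_pending) {
			n->irq_pending = false;
			handled++;
		}
	}
	return handled;
}

static void toy_list_add(struct l3_node *n)
{
	n->next = l3_list;
	l3_list = n;
}
```

In this model, calling toy_isr() before toy_list_add() leaves the flag
set, which corresponds to the interrupt firing again and again.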

-Loc

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-06-01 14:39             ` Arnd Bergmann
@ 2015-06-01 17:46               ` Loc Ho
  -1 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-06-01 17:46 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Doug Thompson, Borislav Petkov, Mauro Carvalho Chehab,
	Rob Herring, Mark Rutland, Ian Campbell,
	linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Jon Masters,
	patches-qTEPVZfXA3Y

Hi,

>> +static void xgene_edac_soc_check(struct edac_device_ctl_info *edac_dev)
>> +{
>> +       struct xgene_edac_dev_ctx *ctx = edac_dev->pvt_info;
>> +       static const char * const mem_err_ip[] = {
>> +               "10GbE0",
>> +               "10GbE1",
>> +               "Security",
>> +               "SATA45",
>> +               "SATA23/ETH23",
>> +               "SATA01/ETH01",
>> +               "USB1",
>> +               "USB0",
>> +               "QML",
>> +               "QM0",
>> +               "QM1 (XGbE01)",
>> +               "PCIE4",
>
> This list seems a little too hardware specific, I'd assume that the numbers
> get a different meaning with the xgene2.

You are right... Let me just print the error code instead. Anyone who
cares will have to do post-processing. Future or existing
next-generation parts may redefine them.

-Loc

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-06-01 17:46               ` Loc Ho
@ 2015-06-01 18:31                   ` Borislav Petkov
  -1 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2015-06-01 18:31 UTC (permalink / raw)
  To: Loc Ho
  Cc: Arnd Bergmann, Doug Thompson, Mauro Carvalho Chehab, Rob Herring,
	Mark Rutland, Ian Campbell, linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Jon Masters,
	patches-qTEPVZfXA3Y

On Mon, Jun 01, 2015 at 10:46:55AM -0700, Loc Ho wrote:
> You are right... Let me just print the error code instead. Anyone who
> cares will have to do post-processing.

Don't forget about the usability of the driver. If a user has to go
open manuals when an error happens, you could just as well report naked
register values and have a tool decode them.

And this strategy (mcelog) turned out to be a real PITA, IMO.

> Future or existing next-generation parts may redefine them.

You could easily have a strings array of per-family or per-soc error
descriptions. For an example, take a look at drivers/edac/mce_amd.c
which decodes MCEs on all relevant AMD machines. This is much more
user-friendly than dumping register values which most people have no
idea of where to start looking (and they don't really need to).
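
A minimal sketch of such per-SoC string arrays, in plain C so it can be
compiled outside the kernel. The first table reuses a few names from the
mem_err_ip[] list in the patch; the second table, the enum soc_gen values
and decode_ip() are made-up placeholders, not anything from the driver or
from mce_amd.c.

```c
#include <stddef.h>

#define ARRAY_LEN(a)	(sizeof(a) / sizeof((a)[0]))

/* Hypothetical SoC generations */
enum soc_gen { SOC_GEN1, SOC_GEN2, SOC_GEN_MAX };

/* First entries of the list quoted from the patch */
static const char * const gen1_ip[] = {
	"10GbE0", "10GbE1", "Security", "SATA45",
};

/* Placeholder names: the real next-generation mapping is not known here */
static const char * const gen2_ip[] = {
	"placeholder-ip0", "placeholder-ip1",
};

struct ip_table {
	const char * const *names;
	size_t count;
};

static const struct ip_table ip_tables[SOC_GEN_MAX] = {
	[SOC_GEN1] = { gen1_ip, ARRAY_LEN(gen1_ip) },
	[SOC_GEN2] = { gen2_ip, ARRAY_LEN(gen2_ip) },
};

/* Translate an error code into a name, bounds-checked per generation */
static const char *decode_ip(enum soc_gen gen, unsigned int code)
{
	if ((unsigned int)gen >= SOC_GEN_MAX || code >= ip_tables[gen].count)
		return "unknown";
	return ip_tables[gen].names[code];
}
```

With this layout, a new SoC only needs one more table and one more enum
entry, and out-of-range codes degrade to "unknown" instead of indexing
past the end of an array.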

HTH.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-06-01 18:31                   ` Borislav Petkov
@ 2015-06-01 18:37                     ` Arnd Bergmann
  -1 siblings, 0 replies; 24+ messages in thread
From: Arnd Bergmann @ 2015-06-01 18:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mark Rutland, devicetree, Ian Campbell, Jon Masters,
	Mauro Carvalho Chehab, patches, Rob Herring, Loc Ho,
	Doug Thompson, linux-arm-kernel, linux-edac

On Monday 01 June 2015 20:31:14 Borislav Petkov wrote:
> On Mon, Jun 01, 2015 at 10:46:55AM -0700, Loc Ho wrote:
> > You are right... Let me just print the error code instead. Anyone who
> > cares will have to do post-processing.
> 
> Don't forget about the usability of the driver. If a user has to go
> open manuals when an error happens, you could just as well report naked
> register values and have a tool decode them.
> 
> And this strategy (mcelog) turned out to be a real PITA, IMO.
> 
> > Future or existing next-generation parts may redefine them.
> 
> You could easily have a strings array of per-family or per-soc error
> descriptions. For an example, take a look at drivers/edac/mce_amd.c
> which decodes MCEs on all relevant AMD machines. This is much more
> user-friendly than dumping register values which most people have no
> idea of where to start looking (and they don't really need to).

That would require having a way to identify the SoC with a distinct
compatible string. I don't know if a name exists for X-Gene that could
be used here.
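
As a rough userspace sketch of that idea: a match table keyed by the
compatible string selects the description table. "apm,xgene-edac-soc" is
the string from patch 3/3; "example,next-gen-edac-soc" and its
placeholder names are purely hypothetical, as are struct soc_match and
soc_match_compatible(). In the kernel the lookup would presumably go
through the OF match machinery rather than a hand-written loop.

```c
#include <stddef.h>
#include <string.h>

/* Pairs a compatible string with the names valid for that SoC */
struct soc_match {
	const char *compatible;
	const char * const *ip_names;
	size_t count;
};

/* First entries of the list quoted from the patch */
static const char * const gen1_names[] = { "10GbE0", "10GbE1", "Security" };

/* Placeholder for a different SoC; real names are not known here */
static const char * const nextgen_names[] = { "placeholder-ip0" };

static const struct soc_match soc_matches[] = {
	{ "apm,xgene-edac-soc",        gen1_names,    3 },
	{ "example,next-gen-edac-soc", nextgen_names, 1 },	/* hypothetical */
};

/* Return the entry for a compatible string, or NULL if none matches */
static const struct soc_match *soc_match_compatible(const char *compat)
{
	size_t i;

	for (i = 0; i < sizeof(soc_matches) / sizeof(soc_matches[0]); i++) {
		if (strcmp(soc_matches[i].compatible, compat) == 0)
			return &soc_matches[i];
	}
	return NULL;
}
```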

	Arnd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] edac: Add L3/SoC support to the APM X-Gene SoC EDAC driver
  2015-06-01 18:37                     ` Arnd Bergmann
@ 2015-06-01 18:59                       ` Loc Ho
  -1 siblings, 0 replies; 24+ messages in thread
From: Loc Ho @ 2015-06-01 18:59 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Borislav Petkov, Doug Thompson, Mauro Carvalho Chehab,
	Rob Herring, Mark Rutland, Ian Campbell,
	linux-edac-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Jon Masters,
	patches-qTEPVZfXA3Y

Hi,

>> > You are right... Let me just print the error code instead. Anyone who
>> > cares will have to do post-processing.
>>
>> Don't forget about the usability of the driver. If a user has to go
>> open manuals when an error happens, you could just as well report naked
>> register values and have a tool decode them.
>>
>> And this strategy (mcelog) turned out to be a real PITA, IMO.
>>
>> > Future or existing next-generation parts may redefine them.
>>
>> You could easily have a strings array of per-family or per-soc error
>> descriptions. For an example, take a look at drivers/edac/mce_amd.c
>> which decodes MCEs on all relevant AMD machines. This is much more
>> user-friendly than dumping register values which most people have no
>> idea of where to start looking (and they don't really need to).
>
> That would require having a way to identify the SoC with a distinct
> compatible string. I don't know if a name exists for X-Gene that could
> be used here.
>

I am not sure either, but I will try to figure this out. The issue with
a compatible string is that firmware would have to patch it up, since we
do not have a separate DT for each minor version of the chip.

-Loc

