* [PATCH v6 0/2] EDAC: Add support for Xilinx ZynqMP OCM EDAC @ 2022-11-02 7:06 Sai Krishna Potthuri 2022-11-02 7:06 ` [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM Sai Krishna Potthuri 2022-11-02 7:06 ` [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support " Sai Krishna Potthuri 0 siblings, 2 replies; 14+ messages in thread From: Sai Krishna Potthuri @ 2022-11-02 7:06 UTC (permalink / raw) To: Rob Herring, Krzysztof Kozlowski, Michal Simek, Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter Cc: devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Sai Krishna Potthuri Add dt-binding and driver for Xilinx ZynqMP OCM controller. changes in v6: -> 2/2 - Updated subject prefix and commit description. -> 2/2 - Used Debugfs instead of sysfs for error injection and placed the injection logic under CONFIG_EDAC_DEBUG. -> 2/2 - Dropped zynqmp_ocm prefix for all static APIs and structures. -> 2/2 - Fixed few more comments related to using caps for acronyms, dealing error info, UE logic simplification, using BIT() definitions. changes in v5: -> 1/2, 2/2 - Added 'Co-developed-by' tag. -> 2/2 - Updated the driver hep text to be more clear about the hardware this driver is targeted. -> 2/2 - Fixed the warning reported by kernel test robot. changes in v4: -> 2/2 - Replaced \n\r with \n. changes in v3: -> 1/2 - Moved the binding from edac to memory-controllers directory. -> 1/2 - Changed the file name to match with the compatible. -> 1/2 - Used additionalProperties instead of unevaluatedProperties. -> 1/2 - Used macro instead of constant value. changes in v2: -> 1/2 - Used define for interrupt flag. -> 1/2 - Updated the description and title. -> 2/2 - Removed Kernel doc for probe and remove. -> 2/2 - Used COMPILE_TEST, used wrapper for get and ioremap resource. -> 2/2 - Fixed few comments related to static variable declaration and print statements. Sai Krishna Potthuri (1): EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM Shubhrajyoti Datta (1): dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM .../xlnx,zynqmp-ocmc-1.0.yaml | 45 ++ MAINTAINERS | 7 + drivers/edac/Kconfig | 9 + drivers/edac/Makefile | 1 + drivers/edac/zynqmp_ocm_edac.c | 485 ++++++++++++++++++ 5 files changed, 547 insertions(+) create mode 100644 Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml create mode 100644 drivers/edac/zynqmp_ocm_edac.c -- 2.25.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM 2022-11-02 7:06 [PATCH v6 0/2] EDAC: Add support for Xilinx ZynqMP OCM EDAC Sai Krishna Potthuri @ 2022-11-02 7:06 ` Sai Krishna Potthuri 2022-11-10 9:10 ` (subset) " Krzysztof Kozlowski ` (2 more replies) 2022-11-02 7:06 ` [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support " Sai Krishna Potthuri 1 sibling, 3 replies; 14+ messages in thread From: Sai Krishna Potthuri @ 2022-11-02 7:06 UTC (permalink / raw) To: Rob Herring, Krzysztof Kozlowski, Michal Simek, Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter Cc: devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Shubhrajyoti Datta, Sai Krishna Potthuri, Krzysztof Kozlowski From: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> Add bindings for Xilinx ZynqMP OCM controller. Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> Co-developed-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> --- .../xlnx,zynqmp-ocmc-1.0.yaml | 45 +++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml diff --git a/Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml b/Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml new file mode 100644 index 000000000000..ca9fc747bf4f --- /dev/null +++ b/Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml @@ -0,0 +1,45 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Xilinx Zynqmp OCM(On-Chip Memory) Controller + +maintainers: + - Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> + - Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> + +description: | + The OCM supports 64-bit wide ECC functionality to detect multi-bit errors + and recover from a single-bit memory fault.On a write, if all bytes are + being written, the ECC is generated and written into the ECC RAM along with + the write-data that is written into the data RAM. If one or more bytes are + not written, then the read operation results in an correctable error or + uncorrectable error. + +properties: + compatible: + const: xlnx,zynqmp-ocmc-1.0 + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + +required: + - compatible + - reg + - interrupts + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/arm-gic.h> + memory-controller@ff960000 { + compatible = "xlnx,zynqmp-ocmc-1.0"; + reg = <0xff960000 0x1000>; + interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>; + }; -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: (subset) [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM 2022-11-02 7:06 ` [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM Sai Krishna Potthuri @ 2022-11-10 9:10 ` Krzysztof Kozlowski 2022-11-10 9:12 ` Krzysztof Kozlowski 2022-11-10 10:06 ` Krzysztof Kozlowski 2 siblings, 0 replies; 14+ messages in thread From: Krzysztof Kozlowski @ 2022-11-10 9:10 UTC (permalink / raw) To: Robert Richter, Mauro Carvalho Chehab, Krzysztof Kozlowski, Borislav Petkov, Tony Luck, Rob Herring, Michal Simek, James Morse, Sai Krishna Potthuri Cc: Krzysztof Kozlowski, Shubhrajyoti Datta, devicetree, saikrishna12468, git, linux-edac, linux-kernel, linux-arm-kernel On Wed, 2 Nov 2022 12:36:54 +0530, Sai Krishna Potthuri wrote: > From: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> > > Add bindings for Xilinx ZynqMP OCM controller. > > Applied, thanks! [1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM https://git.kernel.org/krzk/linux-mem-ctrl/c/efd1547170c3b153f161c293fc11b9fe2f1ece01 Best regards, -- Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM 2022-11-02 7:06 ` [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM Sai Krishna Potthuri 2022-11-10 9:10 ` (subset) " Krzysztof Kozlowski @ 2022-11-10 9:12 ` Krzysztof Kozlowski 2022-11-10 10:06 ` Krzysztof Kozlowski 2 siblings, 0 replies; 14+ messages in thread From: Krzysztof Kozlowski @ 2022-11-10 9:12 UTC (permalink / raw) To: Sai Krishna Potthuri, Rob Herring, Krzysztof Kozlowski, Michal Simek, Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter Cc: devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Shubhrajyoti Datta On 02/11/2022 08:06, Sai Krishna Potthuri wrote: > From: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> > > Add bindings for Xilinx ZynqMP OCM controller. > Applied with fixes in subject - please use prefixes matching the subsystem (memory-controllers) and do not put twice "bindings" (one is enough). Best regards, Krzysztof ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM 2022-11-02 7:06 ` [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM Sai Krishna Potthuri 2022-11-10 9:10 ` (subset) " Krzysztof Kozlowski 2022-11-10 9:12 ` Krzysztof Kozlowski @ 2022-11-10 10:06 ` Krzysztof Kozlowski 2 siblings, 0 replies; 14+ messages in thread From: Krzysztof Kozlowski @ 2022-11-10 10:06 UTC (permalink / raw) To: Sai Krishna Potthuri, Rob Herring, Krzysztof Kozlowski, Michal Simek, Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter Cc: devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Shubhrajyoti Datta On 02/11/2022 08:06, Sai Krishna Potthuri wrote: > From: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> > > Add bindings for Xilinx ZynqMP OCM controller. > > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com> > Co-developed-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> > Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> > Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Use subject prefixes matching the subsystem (git log --oneline -- ...). Subsystem is memory-controllers. Drop redundant, second "bindings" (so "bindings for"). With changes above my tag can stay. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-11-02 7:06 [PATCH v6 0/2] EDAC: Add support for Xilinx ZynqMP OCM EDAC Sai Krishna Potthuri 2022-11-02 7:06 ` [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM Sai Krishna Potthuri @ 2022-11-02 7:06 ` Sai Krishna Potthuri 2022-11-08 18:39 ` Borislav Petkov 2022-11-25 15:12 ` Borislav Petkov 1 sibling, 2 replies; 14+ messages in thread From: Sai Krishna Potthuri @ 2022-11-02 7:06 UTC (permalink / raw) To: Rob Herring, Krzysztof Kozlowski, Michal Simek, Borislav Petkov, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter Cc: devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Sai Krishna Potthuri, Shubhrajyoti Datta, kernel test robot Add EDAC support for Xilinx ZynqMP OCM Controller, this driver reports CE and UE errors upon interrupt generation, and also creates UE/CE debugfs entries for error injection when EDAC_DEBUG config is enabled. Co-developed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Reported-by: kernel test robot <lkp@intel.com> --- MAINTAINERS | 7 + drivers/edac/Kconfig | 9 + drivers/edac/Makefile | 1 + drivers/edac/zynqmp_ocm_edac.c | 485 +++++++++++++++++++++++++++++++++ 4 files changed, 502 insertions(+) create mode 100644 drivers/edac/zynqmp_ocm_edac.c diff --git a/MAINTAINERS b/MAINTAINERS index edc96cdb85e8..7a40caf536c2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -21692,6 +21692,13 @@ F: Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml F: drivers/dma/xilinx/xilinx_dpdma.c F: include/dt-bindings/dma/xlnx-zynqmp-dpdma.h +XILINX ZYNQMP OCM EDAC DRIVER +M: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> +M: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> +S: Maintained +F: Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml +F: drivers/edac/zynqmp_ocm_edac.c + XILINX ZYNQMP PSGTR PHY DRIVER M: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> M: Laurent Pinchart <laurent.pinchart@ideasonboard.com> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index 58ab63642e72..bc30b7d02062 100644 --- a/drivers/edac/Kconfig +++ b/drivers/edac/Kconfig @@ -539,4 +539,13 @@ config EDAC_DMC520 Support for error detection and correction on the SoCs with ARM DMC-520 DRAM controller. +config EDAC_ZYNQMP_OCM + tristate "Xilinx ZynqMP OCM Controller" + depends on ARCH_ZYNQMP || COMPILE_TEST + help + This driver is targeted for Xilinx ZynqMP OCM (On Chip Memory) + controller to support for error detection and correction. + This driver can also be built as a module. If so, the module + will be called zynqmp_ocm_edac. + endif # EDAC diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 2d1641a27a28..634c1cec1588 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -84,3 +84,4 @@ obj-$(CONFIG_EDAC_QCOM) += qcom_edac.o obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o +obj-$(CONFIG_EDAC_ZYNQMP_OCM) += zynqmp_ocm_edac.o diff --git a/drivers/edac/zynqmp_ocm_edac.c b/drivers/edac/zynqmp_ocm_edac.c new file mode 100644 index 000000000000..32f025b72380 --- /dev/null +++ b/drivers/edac/zynqmp_ocm_edac.c @@ -0,0 +1,485 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Xilinx ZynqMP OCM ECC Driver + * + * Copyright (C) 2022 Advanced Micro Devices, Inc. + */ + +#include <linux/edac.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/of_platform.h> +#include <linux/platform_device.h> + +#include "edac_module.h" + +#define ZYNQMP_OCM_EDAC_MSG_SIZE 256 + +#define ZYNQMP_OCM_EDAC_STRING "zynqmp_ocm" + +/* Controller registers */ +#define CTRL_OFST 0x0 +#define OCM_ISR_OFST 0x04 +#define OCM_IMR_OFST 0x08 +#define OCM_IEN_OFST 0x0C +#define OCM_IDS_OFST 0x10 + +/* ECC control register */ +#define ECC_CTRL_OFST 0x14 + +/* Correctable error info registers */ +#define CE_FFA_OFST 0x1C +#define CE_FFD0_OFST 0x20 +#define CE_FFD1_OFST 0x24 +#define CE_FFD2_OFST 0x28 +#define CE_FFD3_OFST 0x2C +#define CE_FFE_OFST 0x30 + +/* Uncorrectable error info registers */ +#define UE_FFA_OFST 0x34 +#define UE_FFD0_OFST 0x38 +#define UE_FFD1_OFST 0x3C +#define UE_FFD2_OFST 0x40 +#define UE_FFD3_OFST 0x44 +#define UE_FFE_OFST 0x48 + +/* ECC control register bit field definitions */ +#define ECC_CTRL_CLR_CE_ERR 0x40 +#define ECC_CTRL_CLR_UE_ERR 0x80 + +/* Fault injection data and count registers */ +#define OCM_FID0_OFST 0x4C +#define OCM_FID1_OFST 0x50 +#define OCM_FID2_OFST 0x54 +#define OCM_FID3_OFST 0x58 +#define OCM_FIC_OFST 0x74 + +#define UE_MAX_BITPOS_LOWER 31 +#define UE_MIN_BITPOS_UPPER 32 +#define UE_MAX_BITPOS_UPPER 63 + +/* Interrupt masks */ +#define OCM_CEINTR_MASK BIT(6) +#define OCM_UEINTR_MASK BIT(7) +#define OCM_ECC_ENABLE_MASK BIT(0) +#define OCM_CEUE_MASK GENMASK(7, 6) + +#define OCM_FICOUNT_MASK GENMASK(23, 0) +#define OCM_NUM_UE_BITPOS 2 +#define OCM_BASEVAL 0xFFFC0000 +#define EDAC_DEVICE "ZynqMP-OCM" + +/** + * struct ecc_error_info - ECC error log information + * @addr: Fault generated at this address + * @data0: Generated fault data (lower 32-bit) + * @data1: Generated fault data (upper 32-bit) + */ +struct ecc_error_info { + u32 addr; + u32 data0; + u32 data1; +}; + +/** + * struct ecc_status - ECC status information to report + * @ce_cnt: Correctable error count + * @ue_cnt: Uncorrectable error count + * @ceinfo: Correctable error log information + * @ueinfo: Uncorrectable error log information + */ +struct ecc_status { + u32 ce_cnt; + u32 ue_cnt; + struct ecc_error_info ceinfo; + struct ecc_error_info ueinfo; +}; + +/** + * struct edac_priv - OCM controller private instance data + * @baseaddr: Base address of the OCM controller + * @message: Buffer for framing the event specific info + * @stat: ECC status information + * @ce_cnt: Correctable Error count + * @ue_cnt: Uncorrectable Error count + * @debugfs_dir: Directory entry for debugfs + * @ce_bitpos: Bit position for Correctable Error + * @ue_bitpos: Array to store UnCorrectable Error bit positions + * @fault_injection_cnt: Fault Injection Counter value + */ +struct edac_priv { + void __iomem *baseaddr; + char message[ZYNQMP_OCM_EDAC_MSG_SIZE]; + struct ecc_status stat; + u32 ce_cnt; + u32 ue_cnt; +#ifdef CONFIG_EDAC_DEBUG + struct dentry *debugfs_dir; + u8 ce_bitpos; + u8 ue_bitpos[OCM_NUM_UE_BITPOS]; + u32 fault_injection_cnt; +#endif +}; + +/** + * get_error_info - Get the current ECC error info + * @base: Pointer to the base address of the OCM memory controller + * @p: Pointer to the OCM ECC status structure + * @mask: Status register mask value + * + * Determines there is any ECC error or not + * + */ +static void get_error_info(void __iomem *base, struct ecc_status *p, int mask) +{ + if (mask & OCM_CEINTR_MASK) { + p->ce_cnt++; + p->ceinfo.data0 = readl(base + CE_FFD0_OFST); + p->ceinfo.data1 = readl(base + CE_FFD1_OFST); + p->ceinfo.addr = (OCM_BASEVAL | readl(base + CE_FFA_OFST)); + writel(ECC_CTRL_CLR_CE_ERR, base + OCM_ISR_OFST); + } else { + p->ue_cnt++; + p->ueinfo.data0 = readl(base + UE_FFD0_OFST); + p->ueinfo.data1 = readl(base + UE_FFD1_OFST); + p->ueinfo.addr = (OCM_BASEVAL | readl(base + UE_FFA_OFST)); + writel(ECC_CTRL_CLR_UE_ERR, base + OCM_ISR_OFST); + } +} + +/** + * handle_error - Handle controller error types CE and UE + * @dci: Pointer to the EDAC device controller instance + * @p: Pointer to the OCM ECC status structure + * + * Handles the controller ECC correctable and uncorrectable error. + */ +static void handle_error(struct edac_device_ctl_info *dci, struct ecc_status *p) +{ + struct edac_priv *priv = dci->pvt_info; + struct ecc_error_info *pinf; + + if (p->ce_cnt) { + pinf = &p->ceinfo; + snprintf(priv->message, ZYNQMP_OCM_EDAC_MSG_SIZE, + "\nOCM ECC error type :%s\nAddr: [0x%x]\nFault Data[0x%08x%08x]", + "CE", pinf->addr, pinf->data1, pinf->data0); + edac_device_handle_ce(dci, 0, 0, priv->message); + } + + if (p->ue_cnt) { + pinf = &p->ueinfo; + snprintf(priv->message, ZYNQMP_OCM_EDAC_MSG_SIZE, + "\nOCM ECC error type :%s\nAddr: [0x%x]\nFault Data[0x%08x%08x]", + "UE", pinf->addr, pinf->data1, pinf->data0); + edac_device_handle_ue(dci, 0, 0, priv->message); + } + + memset(p, 0, sizeof(*p)); +} + +/** + * intr_handler - ISR routine + * @irq: irq number + * @dev_id: device id pointer + * + * Return: IRQ_NONE, if interrupt not set or IRQ_HANDLED otherwise + */ +static irqreturn_t intr_handler(int irq, void *dev_id) +{ + struct edac_device_ctl_info *dci = dev_id; + struct edac_priv *priv = dci->pvt_info; + int regval; + + regval = readl(priv->baseaddr + OCM_ISR_OFST); + if (!(regval & OCM_CEUE_MASK)) + return IRQ_NONE; + + get_error_info(priv->baseaddr, &priv->stat, regval); + + priv->ce_cnt += priv->stat.ce_cnt; + priv->ue_cnt += priv->stat.ue_cnt; + handle_error(dci, &priv->stat); + + return IRQ_HANDLED; +} + +/** + * get_eccstate - Return the controller ECC status + * @base: Pointer to the OCM memory controller base address + * + * Get the ECC enable/disable status for the controller + * + * Return: ECC status 0/1. + */ +static bool get_eccstate(void __iomem *base) +{ + return readl(base + ECC_CTRL_OFST) & OCM_ECC_ENABLE_MASK; +} + +#ifdef CONFIG_EDAC_DEBUG +/** + * inject_fault_count - write fault injection count + * @priv: Pointer to the EDAC private struct + * + * Update the fault injection count register, once the counter reaches + * zero, it injects errors + */ +static void inject_fault_count(struct edac_priv *priv) +{ + u32 ficount = priv->fault_injection_cnt; + + ficount &= OCM_FICOUNT_MASK; + if (ficount != priv->fault_injection_cnt) + edac_printk(KERN_INFO, EDAC_DEVICE, + "Value truncated to 24-bits\n"); + + writel(ficount, priv->baseaddr + OCM_FIC_OFST); +} + +/** + * inject_cebitpos - Set CE bit position + * @priv: Pointer to the EDAC private struct + * + * Set any one bit to inject CE error + */ +static void inject_cebitpos(struct edac_priv *priv) +{ + if (priv->ce_bitpos <= UE_MAX_BITPOS_LOWER) { + writel(BIT(priv->ce_bitpos), priv->baseaddr + OCM_FID0_OFST); + writel(0, priv->baseaddr + OCM_FID1_OFST); + } else { + writel(BIT(priv->ce_bitpos - UE_MIN_BITPOS_UPPER), + priv->baseaddr + OCM_FID1_OFST); + writel(0, priv->baseaddr + OCM_FID0_OFST); + } +} + +/** + * inject_uebitpos - set UE bit position0 + * @priv: Pointer to the EDAC private struct + * + * Set the first and second bit positions for UE Error generation + * Return: 0 on success else error code. + */ +static ssize_t inject_uebitpos(struct edac_priv *priv) +{ + u64 ue_bitpos = BIT(priv->ue_bitpos[0]) | BIT(priv->ue_bitpos[1]); + + if (priv->ue_bitpos[0] == priv->ue_bitpos[1]) { + edac_printk(KERN_ERR, EDAC_DEVICE, + "Bit positions should not be equal\n"); + return -EINVAL; + } + + if (priv->ue_bitpos[0] > UE_MAX_BITPOS_UPPER || + priv->ue_bitpos[1] > UE_MAX_BITPOS_UPPER) + return -EINVAL; + + writel((u32)ue_bitpos, priv->baseaddr + OCM_FID0_OFST); + writel((u32)(ue_bitpos >> 32), priv->baseaddr + OCM_FID1_OFST); + + return 0; +} + +static ssize_t inject_ce_write(struct file *file, const char __user *data, + size_t count, loff_t *ppos) +{ + struct edac_device_ctl_info *edac_dev = file->private_data; + struct edac_priv *priv = edac_dev->pvt_info; + int ret; + + if (!data) + return -EFAULT; + + ret = kstrtou8_from_user(data, count, 0, &priv->ce_bitpos); + if (ret) + return ret; + + if (priv->ce_bitpos > UE_MAX_BITPOS_UPPER) + return -EINVAL; + + inject_fault_count(priv); + inject_cebitpos(priv); + + return count; +} + +static const struct file_operations inject_ce_fops = { + .open = simple_open, + .write = inject_ce_write, + .llseek = generic_file_llseek, +}; + +static ssize_t inject_ue_write(struct file *file, const char __user *data, + size_t count, loff_t *ppos) +{ + struct edac_device_ctl_info *edac_dev = file->private_data; + struct edac_priv *priv = edac_dev->pvt_info; + char buf[6]; + u8 pos0, pos1, len; + int ret; + + if (!data) + return -EFAULT; + + len = min_t(size_t, count, sizeof(buf)); + if (copy_from_user(buf, data, len)) + return -EFAULT; + + for (pos0 = 0; buf[pos0] != ' ' && pos0 <= len; pos0++) + ; + + if (pos0 > len) + return -EINVAL; + + ret = kstrtou8_from_user(data, pos0, 0, &priv->ue_bitpos[0]); + if (ret) + return ret; + + for (pos1 = pos0 + 1; buf[pos1] != '\n' && pos1 <= len; pos1++) + ; + + if (pos1 > count) + return -EINVAL; + + ret = kstrtou8_from_user(&data[pos0 + 1], pos1 - pos0 - 1, 0, + &priv->ue_bitpos[1]); + if (ret) + return ret; + + inject_fault_count(priv); + ret = inject_uebitpos(priv); + if (ret) + return ret; + + return count; +} + +static const struct file_operations inject_ue_fops = { + .open = simple_open, + .write = inject_ue_write, + .llseek = generic_file_llseek, +}; + +static void setup_debugfs(struct edac_device_ctl_info *edac_dev) +{ + struct edac_priv *priv = edac_dev->pvt_info; + + priv->debugfs_dir = edac_debugfs_create_dir(edac_dev->dev_name); + if (!priv->debugfs_dir) + return; + + edac_debugfs_create_x32("inject_fault_count", 0644, priv->debugfs_dir, + &priv->fault_injection_cnt); + edac_debugfs_create_file("inject_ue_bitpos", 0644, priv->debugfs_dir, + edac_dev, &inject_ue_fops); + edac_debugfs_create_file("inject_ce_bitpos", 0644, priv->debugfs_dir, + edac_dev, &inject_ce_fops); +} +#endif + +static int edac_probe(struct platform_device *pdev) +{ + struct edac_priv *priv; + struct edac_device_ctl_info *dci; + void __iomem *baseaddr; + struct resource *res; + int irq, ret; + + baseaddr = devm_platform_get_and_ioremap_resource(pdev, 0, &res); + if (IS_ERR(baseaddr)) + return PTR_ERR(baseaddr); + + if (!get_eccstate(baseaddr)) { + edac_printk(KERN_INFO, EDAC_DEVICE, + "ECC not enabled\n"); + return -ENXIO; + } + + dci = edac_device_alloc_ctl_info(sizeof(*priv), ZYNQMP_OCM_EDAC_STRING, + 1, ZYNQMP_OCM_EDAC_STRING, 1, 0, NULL, 0, + edac_device_alloc_index()); + if (!dci) + return -ENOMEM; + + priv = dci->pvt_info; + platform_set_drvdata(pdev, dci); + dci->dev = &pdev->dev; + priv->baseaddr = baseaddr; + dci->mod_name = pdev->dev.driver->name; + dci->ctl_name = ZYNQMP_OCM_EDAC_STRING; + dci->dev_name = dev_name(&pdev->dev); + + irq = platform_get_irq(pdev, 0); + if (irq < 0) { + ret = irq; + goto free_dev_ctl; + } + + ret = devm_request_irq(&pdev->dev, irq, intr_handler, 0, + dev_name(&pdev->dev), dci); + if (ret) { + edac_printk(KERN_ERR, EDAC_DEVICE, "Failed to request Irq\n"); + goto free_dev_ctl; + } + + /* Enable UE, CE interrupts */ + writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IEN_OFST); + +#ifdef CONFIG_EDAC_DEBUG + setup_debugfs(dci); +#endif + + ret = edac_device_add_device(dci); + if (ret) + goto free_dev_ctl; + + return 0; + +free_dev_ctl: + edac_device_free_ctl_info(dci); + + return ret; +} + +static int edac_remove(struct platform_device *pdev) +{ + struct edac_device_ctl_info *dci = platform_get_drvdata(pdev); + struct edac_priv *priv = dci->pvt_info; + + /* Disable UE, CE interrupts */ + writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IDS_OFST); + +#ifdef CONFIG_EDAC_DEBUG + debugfs_remove_recursive(priv->debugfs_dir); +#endif + + edac_device_del_device(&pdev->dev); + edac_device_free_ctl_info(dci); + + return 0; +} + +static const struct of_device_id zynqmp_ocm_edac_match[] = { + { .compatible = "xlnx,zynqmp-ocmc-1.0"}, + { /* end of table */ } +}; + +MODULE_DEVICE_TABLE(of, zynqmp_ocm_edac_match); + +static struct platform_driver zynqmp_ocm_edac_driver = { + .driver = { + .name = "zynqmp-ocm-edac", + .of_match_table = zynqmp_ocm_edac_match, + }, + .probe = edac_probe, + .remove = edac_remove, +}; + +module_platform_driver(zynqmp_ocm_edac_driver); + +MODULE_AUTHOR("Advanced Micro Devices, Inc"); +MODULE_DESCRIPTION("Xilinx ZynqMP OCM ECC driver"); +MODULE_LICENSE("GPL"); -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-11-02 7:06 ` [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support " Sai Krishna Potthuri @ 2022-11-08 18:39 ` Borislav Petkov 2022-11-09 11:21 ` Potthuri, Sai Krishna 2022-11-25 15:12 ` Borislav Petkov 1 sibling, 1 reply; 14+ messages in thread From: Borislav Petkov @ 2022-11-08 18:39 UTC (permalink / raw) To: Sai Krishna Potthuri Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Shubhrajyoti Datta, kernel test robot On Wed, Nov 02, 2022 at 12:36:55PM +0530, Sai Krishna Potthuri wrote: > Add EDAC support for Xilinx ZynqMP OCM Controller, this driver So a while ago you said that this driver is for the on chip memory controller. Is it possible for such a system to have another memory controller too for which another EDAC driver gets loaded? Because the EDAC core - at least on x86 - assumes that a single driver runs on the system and I don't think I've ever had the case where we need multiple drivers. And in such case to audit it for concurrency issues. So I guess the question is, can a system have zynqmp_ocm_edac and say, synopsys_edac or some other EDAC driver loaded at the same time? Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-11-08 18:39 ` Borislav Petkov @ 2022-11-09 11:21 ` Potthuri, Sai Krishna 2022-11-09 18:08 ` Borislav Petkov 0 siblings, 1 reply; 14+ messages in thread From: Potthuri, Sai Krishna @ 2022-11-09 11:21 UTC (permalink / raw) To: Borislav Petkov Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git (AMD-Xilinx), Datta, Shubhrajyoti, kernel test robot Hi Boris, > -----Original Message----- > From: Borislav Petkov <bp@alien8.de> > Sent: Wednesday, November 9, 2022 12:09 AM > To: Potthuri, Sai Krishna <sai.krishna.potthuri@amd.com> > Cc: Rob Herring <robh+dt@kernel.org>; Krzysztof Kozlowski > <krzysztof.kozlowski+dt@linaro.org>; Michal Simek > <michal.simek@xilinx.com>; Mauro Carvalho Chehab > <mchehab@kernel.org>; Tony Luck <tony.luck@intel.com>; James Morse > <james.morse@arm.com>; Robert Richter <rric@kernel.org>; > devicetree@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux- > kernel@vger.kernel.org; linux-edac@vger.kernel.org; > saikrishna12468@gmail.com; git (AMD-Xilinx) <git@amd.com>; Datta, > Shubhrajyoti <shubhrajyoti.datta@amd.com>; kernel test robot > <lkp@intel.com> > Subject: Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx > ZynqMP OCM > > On Wed, Nov 02, 2022 at 12:36:55PM +0530, Sai Krishna Potthuri wrote: > > Add EDAC support for Xilinx ZynqMP OCM Controller, this driver > > So a while ago you said that this driver is for the on chip memory controller. > Is it possible for such a system to have another memory controller too for > which another EDAC driver gets loaded? > > Because the EDAC core - at least on x86 - assumes that a single driver runs on > the system and I don't think I've ever had the case where we need multiple > drivers. And in such case to audit it for concurrency issues. > > So I guess the question is, can a system have zynqmp_ocm_edac and say, > synopsys_edac or some other EDAC driver loaded at the same time? Yes, we have this scenario on Xilinx ZynqMP platform where we have both the drivers (zynqmp_ocm_edac - OCM Controller and synopsys_edac - DDR Memory Controller) probed at the same time. We tested this scenario on our platform (arm based), and we see both the controllers getting probed and tested by injecting errors. Probe log for both the controllers: xilinx-zcu102-20222:~$ dmesg | grep edac [ 1.642225] EDAC DEBUG: edac_mc_sysfs_init: device mc created [ 2.151781] EDAC DEBUG: edac_mc_alloc: allocating 2272 bytes for mci data (1 ranks, 1 csrows/channels) [ 2.151862] EDAC DEBUG: edac_mc_add_mc_with_groups: [ 2.151912] EDAC DEBUG: edac_create_sysfs_mci_device: device mc0 created [ 2.151945] EDAC DEBUG: edac_create_dimm_object: device rank0 created at location csrow 0 channel 0 [ 2.151979] EDAC DEBUG: edac_create_csrow_object: device csrow0 created [ 2.152020] EDAC MC0: Giving out device to module 1 controller synps_ddr_controller: DEV synps_edac (INTERRUPT) [ 2.161952] EDAC DEBUG: edac_device_register_sysfs_main_kobj: [ 2.162035] EDAC DEBUG: edac_device_add_device: [ 2.162039] EDAC DEBUG: find_edac_device_by_dev: [ 2.162043] EDAC DEBUG: edac_device_create_sysfs: idx=0 [ 2.162050] EDAC DEBUG: edac_device_create_instances: [ 2.162065] EDAC DEVICE0: Giving out device to module zynqmp-ocm-edac controller zynqmp_ocm: DEV ff960000.memory-controller (INTERRUPT) Regards Sai Krishna ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-11-09 11:21 ` Potthuri, Sai Krishna @ 2022-11-09 18:08 ` Borislav Petkov 0 siblings, 0 replies; 14+ messages in thread From: Borislav Petkov @ 2022-11-09 18:08 UTC (permalink / raw) To: Potthuri, Sai Krishna, Datta, Shubhrajyoti Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git (AMD-Xilinx), kernel test robot On Wed, Nov 09, 2022 at 11:21:41AM +0000, Potthuri, Sai Krishna wrote: > > On Wed, Nov 02, 2022 at 12:36:55PM +0530, Sai Krishna Potthuri wrote: > > > Add EDAC support for Xilinx ZynqMP OCM Controller, this driver > > > > So a while ago you said that this driver is for the on chip memory controller. > > Is it possible for such a system to have another memory controller too for > > which another EDAC driver gets loaded? > > > > Because the EDAC core - at least on x86 - assumes that a single driver runs on > > the system and I don't think I've ever had the case where we need multiple > > drivers. And in such case to audit it for concurrency issues. > > > > So I guess the question is, can a system have zynqmp_ocm_edac and say, > > synopsys_edac or some other EDAC driver loaded at the same time? > Yes, we have this scenario on Xilinx ZynqMP platform where we have both > the drivers (zynqmp_ocm_edac - OCM Controller and synopsys_edac - DDR > Memory Controller) probed at the same time. Ok, Shubhrajyoti is on Cc too. I asked him the same question - what the possible drivers configuration would be and he gave me: Platform | Drivers / Controllers | ------------------------------------------------------------ ZynqMP | Synopsys and OCM | Versal | DDRMC and OCM | The ZynqMP platform needs Synopsys which is, let's say for simplicity, the main EDAC driver using edac_mc* etc. Looking at the patches, Versal is similar and uses the same APIs. OCM uses the edac_device* stuff so that should be ok. I say "should" because, again, I haven't played with multiple EDAC drivers system. But we'll see where that gets us. Ok, I guess architecture-wise this looks ok, I'll review the drivers later and we'll see. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-11-02 7:06 ` [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support " Sai Krishna Potthuri 2022-11-08 18:39 ` Borislav Petkov @ 2022-11-25 15:12 ` Borislav Petkov 2022-12-05 10:20 ` Potthuri, Sai Krishna 1 sibling, 1 reply; 14+ messages in thread From: Borislav Petkov @ 2022-11-25 15:12 UTC (permalink / raw) To: Sai Krishna Potthuri Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git, Shubhrajyoti Datta, kernel test robot On Wed, Nov 02, 2022 at 12:36:55PM +0530, Sai Krishna Potthuri wrote: > Add EDAC support for Xilinx ZynqMP OCM Controller, So this: > this driver reports CE and UE errors upon interrupt generation, and > also creates UE/CE debugfs entries for error injection when EDAC_DEBUG > config is enabled. I can see in the patch itself. Do not talk about what your patch does - that should hopefully be visible from the diff. Rather, talk about *why* you're doing what you're doing. Like, for example, you can explain here how this driver is going to co-exist with the other EDAC driver, i.e., the Synopsys one or the DDRMC. > Co-developed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> > Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> > Reported-by: kernel test robot <lkp@intel.com> What exactly was reported by the robot? Pls put that in the commit message as "Fix issue <BLA> as reported by the robot." so that it is clear what that Reported-by: refers to. > --- > MAINTAINERS | 7 + > drivers/edac/Kconfig | 9 + > drivers/edac/Makefile | 1 + > drivers/edac/zynqmp_ocm_edac.c | 485 +++++++++++++++++++++++++++++++++ > 4 files changed, 502 insertions(+) > create mode 100644 drivers/edac/zynqmp_ocm_edac.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index edc96cdb85e8..7a40caf536c2 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -21692,6 +21692,13 @@ F: Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.yaml > F: drivers/dma/xilinx/xilinx_dpdma.c > F: include/dt-bindings/dma/xlnx-zynqmp-dpdma.h > > +XILINX ZYNQMP OCM EDAC DRIVER > +M: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> > +M: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> > +S: Maintained > +F: Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml > +F: drivers/edac/zynqmp_ocm_edac.c > + > XILINX ZYNQMP PSGTR PHY DRIVER > M: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> > M: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig > index 58ab63642e72..bc30b7d02062 100644 > --- a/drivers/edac/Kconfig > +++ b/drivers/edac/Kconfig > @@ -539,4 +539,13 @@ config EDAC_DMC520 > Support for error detection and correction on the > SoCs with ARM DMC-520 DRAM controller. > > +config EDAC_ZYNQMP_OCM > + tristate "Xilinx ZynqMP OCM Controller" > + depends on ARCH_ZYNQMP || COMPILE_TEST > + help > + This driver is targeted for Xilinx ZynqMP OCM (On Chip Memory) "This driver supports ...." > + controller to support for error detection and correction. > + This driver can also be built as a module. If so, the module > + will be called zynqmp_ocm_edac. > + > endif # EDAC > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile > index 2d1641a27a28..634c1cec1588 100644 > --- a/drivers/edac/Makefile > +++ b/drivers/edac/Makefile > @@ -84,3 +84,4 @@ obj-$(CONFIG_EDAC_QCOM) += qcom_edac.o > obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o > obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o > obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o > +obj-$(CONFIG_EDAC_ZYNQMP_OCM) += zynqmp_ocm_edac.o Is there going to be another ZynqMP EDAC driver? If not, this one could be simply called zynqmp_edac.c or? > diff --git a/drivers/edac/zynqmp_ocm_edac.c b/drivers/edac/zynqmp_ocm_edac.c > new file mode 100644 > index 000000000000..32f025b72380 > --- /dev/null > +++ b/drivers/edac/zynqmp_ocm_edac.c > +/* Interrupt masks */ > +#define OCM_CEINTR_MASK BIT(6) > +#define OCM_UEINTR_MASK BIT(7) > +#define OCM_ECC_ENABLE_MASK BIT(0) > +#define OCM_CEUE_MASK GENMASK(7, 6) Drop that one and use OCM_CEINTR_MASK | OCM_UEINTR_MASK everywhere. > +#define OCM_FICOUNT_MASK GENMASK(23, 0) > +#define OCM_NUM_UE_BITPOS 2 > +#define OCM_BASEVAL 0xFFFC0000 > +#define EDAC_DEVICE "ZynqMP-OCM" > + > +/** > + * struct ecc_error_info - ECC error log information > + * @addr: Fault generated at this address > + * @data0: Generated fault data (lower 32-bit) > + * @data1: Generated fault data (upper 32-bit) > + */ > +struct ecc_error_info { > + u32 addr; > + u32 data0; > + u32 data1; What's wrong with calling those fault_lo and fault_hi then? > +/** > + * get_error_info - Get the current ECC error info > + * @base: Pointer to the base address of the OCM memory controller > + * @p: Pointer to the OCM ECC status structure > + * @mask: Status register mask value > + * > + * Determines there is any ECC error or not > + * > + */ > +static void get_error_info(void __iomem *base, struct ecc_status *p, int mask) > +{ > + if (mask & OCM_CEINTR_MASK) { > + p->ce_cnt++; > + p->ceinfo.data0 = readl(base + CE_FFD0_OFST); > + p->ceinfo.data1 = readl(base + CE_FFD1_OFST); > + p->ceinfo.addr = (OCM_BASEVAL | readl(base + CE_FFA_OFST)); > + writel(ECC_CTRL_CLR_CE_ERR, base + OCM_ISR_OFST); > + } else { I guess you need to check OCM_UEINTR_MASK here? > + p->ue_cnt++; > + p->ueinfo.data0 = readl(base + UE_FFD0_OFST); > + p->ueinfo.data1 = readl(base + UE_FFD1_OFST); > + p->ueinfo.addr = (OCM_BASEVAL | readl(base + UE_FFA_OFST)); > + writel(ECC_CTRL_CLR_UE_ERR, base + OCM_ISR_OFST); > + } No, I didn't mean for you to drop that block. See comment in intr_handler() below. > +/** > + * handle_error - Handle controller error types CE and UE > + * @dci: Pointer to the EDAC device controller instance > + * @p: Pointer to the OCM ECC status structure > + * > + * Handles the controller ECC correctable and uncorrectable error. s/controller// - we know it is the controller. You probably should go through all text in here and tone down all the "controller" mentions. > + */ > +static void handle_error(struct edac_device_ctl_info *dci, struct ecc_status *p) > +{ > + struct edac_priv *priv = dci->pvt_info; > + struct ecc_error_info *pinf; > + > + if (p->ce_cnt) { > + pinf = &p->ceinfo; > + snprintf(priv->message, ZYNQMP_OCM_EDAC_MSG_SIZE, > + "\nOCM ECC error type :%s\nAddr: [0x%x]\nFault Data[0x%08x%08x]", > + "CE", pinf->addr, pinf->data1, pinf->data0); > + edac_device_handle_ce(dci, 0, 0, priv->message); > + } > + > + if (p->ue_cnt) { > + pinf = &p->ueinfo; > + snprintf(priv->message, ZYNQMP_OCM_EDAC_MSG_SIZE, > + "\nOCM ECC error type :%s\nAddr: [0x%x]\nFault Data[0x%08x%08x]", > + "UE", pinf->addr, pinf->data1, pinf->data0); > + edac_device_handle_ue(dci, 0, 0, priv->message); > + } > + > + memset(p, 0, sizeof(*p)); > +} > + > +/** > + * intr_handler - ISR routine > + * @irq: irq number > + * @dev_id: device id pointer > + * > + * Return: IRQ_NONE, if interrupt not set or IRQ_HANDLED otherwise > + */ > +static irqreturn_t intr_handler(int irq, void *dev_id) > +{ > + struct edac_device_ctl_info *dci = dev_id; > + struct edac_priv *priv = dci->pvt_info; > + int regval; > + > + regval = readl(priv->baseaddr + OCM_ISR_OFST); > + if (!(regval & OCM_CEUE_MASK)) > + return IRQ_NONE; What is that check for? If neither of the masks are set but the device still raises an error interrupt, then you need to WARN_ONCE() here so that people look at this and debug it as to why it still raised an interrupt. > + get_error_info(priv->baseaddr, &priv->stat, regval); > + > + priv->ce_cnt += priv->stat.ce_cnt; > + priv->ue_cnt += priv->stat.ue_cnt; > + handle_error(dci, &priv->stat); > + > + return IRQ_HANDLED; > +} > + > +/** > + * get_eccstate - Return the controller ECC status > + * @base: Pointer to the OCM memory controller base address > + * > + * Get the ECC enable/disable status for the controller > + * > + * Return: ECC status 0/1. > + */ > +static bool get_eccstate(void __iomem *base) > +{ > + return readl(base + ECC_CTRL_OFST) & OCM_ECC_ENABLE_MASK; > +} > + > +#ifdef CONFIG_EDAC_DEBUG > +/** > + * inject_fault_count - write fault injection count > + * @priv: Pointer to the EDAC private struct > + * > + * Update the fault injection count register, once the counter reaches > + * zero, it injects errors > + */ > +static void inject_fault_count(struct edac_priv *priv) > +{ > + u32 ficount = priv->fault_injection_cnt; > + > + ficount &= OCM_FICOUNT_MASK; > + if (ficount != priv->fault_injection_cnt) Do this: if (ficount & ~OCM_FICOUNT_MASK) { ficount &= OCM_FICOUNT_MASK; edac_printk(KERN_INFO, EDAC_DEVICE, "Fault injection count value truncated to: %d\n", ficount); } i.e., mask it *only* when it is larger. > + > + writel(ficount, priv->baseaddr + OCM_FIC_OFST); > +} > + > +/** > + * inject_cebitpos - Set CE bit position > + * @priv: Pointer to the EDAC private struct > + * > + * Set any one bit to inject CE error > + */ > +static void inject_cebitpos(struct edac_priv *priv) > +{ > + if (priv->ce_bitpos <= UE_MAX_BITPOS_LOWER) { > + writel(BIT(priv->ce_bitpos), priv->baseaddr + OCM_FID0_OFST); > + writel(0, priv->baseaddr + OCM_FID1_OFST); > + } else { > + writel(BIT(priv->ce_bitpos - UE_MIN_BITPOS_UPPER), > + priv->baseaddr + OCM_FID1_OFST); > + writel(0, priv->baseaddr + OCM_FID0_OFST); I had to stare at this a bit to figure out what you're doing: the injection registers are two 32-bit quantities and depending on where you inject, you need to select into which offset to write it. But looking more at this, all the proper range checks should happen in the debugfs callbacks, i.e., inject_ce_write() in this case. The actual injection function should only inject - that's it. And come to think of it, you don't need inject_cebitpos() or inject_uebitpos(). Your debugfs callbacks should: 1. check the range, error out and print a warning if range wrong 2. inject otherwise. and that's it. ... > +static ssize_t inject_ue_write(struct file *file, const char __user *data, > + size_t count, loff_t *ppos) > +{ > + struct edac_device_ctl_info *edac_dev = file->private_data; > + struct edac_priv *priv = edac_dev->pvt_info; > + char buf[6]; > + u8 pos0, pos1, len; > + int ret; The EDAC tree preferred ordering of variable declarations at the beginning of a function is reverse fir tree order:: struct long_struct_name *descriptive_name; unsigned long foo, bar; unsigned int tmp; int ret; The above is faster to parse than the reverse ordering:: int ret; unsigned int tmp; unsigned long foo, bar; struct long_struct_name *descriptive_name; And even more so than random ordering:: unsigned long foo, bar; int ret; struct long_struct_name *descriptive_name; unsigned int tmp; You're pretty much doing it but some functions' local vars still need re-sorting. > + > + if (!data) > + return -EFAULT; > + > + len = min_t(size_t, count, sizeof(buf)); > + if (copy_from_user(buf, data, len)) > + return -EFAULT; > + > + for (pos0 = 0; buf[pos0] != ' ' && pos0 <= len; pos0++) > + ; > + > + if (pos0 > len) > + return -EINVAL; > + > + ret = kstrtou8_from_user(data, pos0, 0, &priv->ue_bitpos[0]); > + if (ret) > + return ret; > + > + for (pos1 = pos0 + 1; buf[pos1] != '\n' && pos1 <= len; pos1++) > + ; > + > + if (pos1 > count) > + return -EINVAL; > + > + ret = kstrtou8_from_user(&data[pos0 + 1], pos1 - pos0 - 1, 0, > + &priv->ue_bitpos[1]); This looks like it is composing multiple values. I guess the two bits for an UE since UE is a two-bit error. No documentation? IOW, you need to document how this injection works. In a comment here somewhere, pls explain what the user is supposed to do when she wants to inject errors. Example: Documentation/firmware-guide/acpi/apei/einj.rst You don't have to do a separate file and go too much into detail but at least a simple injection recipe/example would be prudent to have. > +static int edac_probe(struct platform_device *pdev) > +{ > + struct edac_priv *priv; > + struct edac_device_ctl_info *dci; > + void __iomem *baseaddr; > + struct resource *res; > + int irq, ret; > + > + baseaddr = devm_platform_get_and_ioremap_resource(pdev, 0, &res); > + if (IS_ERR(baseaddr)) > + return PTR_ERR(baseaddr); > + > + if (!get_eccstate(baseaddr)) { > + edac_printk(KERN_INFO, EDAC_DEVICE, > + "ECC not enabled\n"); No need to break that line. > + return -ENXIO; > + } > + > + dci = edac_device_alloc_ctl_info(sizeof(*priv), ZYNQMP_OCM_EDAC_STRING, > + 1, ZYNQMP_OCM_EDAC_STRING, 1, 0, NULL, 0, > + edac_device_alloc_index()); > + if (!dci) > + return -ENOMEM; > + > + priv = dci->pvt_info; > + platform_set_drvdata(pdev, dci); > + dci->dev = &pdev->dev; > + priv->baseaddr = baseaddr; > + dci->mod_name = pdev->dev.driver->name; > + dci->ctl_name = ZYNQMP_OCM_EDAC_STRING; > + dci->dev_name = dev_name(&pdev->dev); > + > + irq = platform_get_irq(pdev, 0); > + if (irq < 0) { > + ret = irq; > + goto free_dev_ctl; > + } > + > + ret = devm_request_irq(&pdev->dev, irq, intr_handler, 0, > + dev_name(&pdev->dev), dci); > + if (ret) { > + edac_printk(KERN_ERR, EDAC_DEVICE, "Failed to request Irq\n"); > + goto free_dev_ctl; > + } > + > + /* Enable UE, CE interrupts */ > + writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IEN_OFST); > + > +#ifdef CONFIG_EDAC_DEBUG > + setup_debugfs(dci); > +#endif Do this instead: diff --git a/drivers/edac/zynqmp_ocm_edac.c b/drivers/edac/zynqmp_ocm_edac.c index 32f025b72380..a2b8cf1eb986 100644 --- a/drivers/edac/zynqmp_ocm_edac.c +++ b/drivers/edac/zynqmp_ocm_edac.c @@ -428,9 +428,8 @@ static int edac_probe(struct platform_device *pdev) /* Enable UE, CE interrupts */ writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IEN_OFST); -#ifdef CONFIG_EDAC_DEBUG - setup_debugfs(dci); -#endif + if (IS_ENABLED(CONFIG_EDAC_DEBUG)) + setup_debugfs(dci); ret = edac_device_add_device(dci); if (ret) below too. > + > + ret = edac_device_add_device(dci); > + if (ret) > + goto free_dev_ctl; > + > + return 0; > + > +free_dev_ctl: > + edac_device_free_ctl_info(dci); > + > + return ret; > +} > + > +static int edac_remove(struct platform_device *pdev) > +{ > + struct edac_device_ctl_info *dci = platform_get_drvdata(pdev); > + struct edac_priv *priv = dci->pvt_info; > + > + /* Disable UE, CE interrupts */ > + writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IDS_OFST); > + > +#ifdef CONFIG_EDAC_DEBUG > + debugfs_remove_recursive(priv->debugfs_dir); > +#endif > + > + edac_device_del_device(&pdev->dev); > + edac_device_free_ctl_info(dci); > + > + return 0; > +} > + > +static const struct of_device_id zynqmp_ocm_edac_match[] = { > + { .compatible = "xlnx,zynqmp-ocmc-1.0"}, > + { /* end of table */ } > +}; > + > +MODULE_DEVICE_TABLE(of, zynqmp_ocm_edac_match); > + > +static struct platform_driver zynqmp_ocm_edac_driver = { > + .driver = { > + .name = "zynqmp-ocm-edac", > + .of_match_table = zynqmp_ocm_edac_match, > + }, > + .probe = edac_probe, > + .remove = edac_remove, > +}; > + > +module_platform_driver(zynqmp_ocm_edac_driver); > + > +MODULE_AUTHOR("Advanced Micro Devices, Inc"); > +MODULE_DESCRIPTION("Xilinx ZynqMP OCM ECC driver"); > +MODULE_LICENSE("GPL"); > -- > 2.25.1 > -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply related [flat|nested] 14+ messages in thread
* RE: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-11-25 15:12 ` Borislav Petkov @ 2022-12-05 10:20 ` Potthuri, Sai Krishna 2022-12-05 13:16 ` Borislav Petkov 0 siblings, 1 reply; 14+ messages in thread From: Potthuri, Sai Krishna @ 2022-12-05 10:20 UTC (permalink / raw) To: Borislav Petkov Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git (AMD-Xilinx), Datta, Shubhrajyoti, kernel test robot [-- Attachment #1: Type: text/plain, Size: 18338 bytes --] Hi Boris, > -----Original Message----- > From: Borislav Petkov <bp@alien8.de> > Sent: Friday, November 25, 2022 8:42 PM > To: Potthuri, Sai Krishna <sai.krishna.potthuri@amd.com> > Cc: Rob Herring <robh+dt@kernel.org>; Krzysztof Kozlowski > <krzysztof.kozlowski+dt@linaro.org>; Michal Simek > <michal.simek@xilinx.com>; Mauro Carvalho Chehab > <mchehab@kernel.org>; Tony Luck <tony.luck@intel.com>; James Morse > <james.morse@arm.com>; Robert Richter <rric@kernel.org>; > devicetree@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux- > kernel@vger.kernel.org; linux-edac@vger.kernel.org; > saikrishna12468@gmail.com; git (AMD-Xilinx) <git@amd.com>; Datta, > Shubhrajyoti <shubhrajyoti.datta@amd.com>; kernel test robot > <lkp@intel.com> > Subject: Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx > ZynqMP OCM > > On Wed, Nov 02, 2022 at 12:36:55PM +0530, Sai Krishna Potthuri wrote: > > Add EDAC support for Xilinx ZynqMP OCM Controller, > > So this: > > > this driver reports CE and UE errors upon interrupt generation, and > > also creates UE/CE debugfs entries for error injection when EDAC_DEBUG > > config is enabled. > > I can see in the patch itself. > > Do not talk about what your patch does - that should hopefully be visible > from the diff. Rather, talk about *why* you're doing what you're doing. > > Like, for example, you can explain here how this driver is going to co-exist > with the other EDAC driver, i.e., the Synopsys one or the DDRMC. Ok, will update. > > > Co-developed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> > > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> > > Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> > > Reported-by: kernel test robot <lkp@intel.com> > > What exactly was reported by the robot? > > Pls put that in the commit message as > > "Fix issue <BLA> as reported by the robot." > > so that it is clear what that Reported-by: refers to. Ok, will update. > > > --- > > MAINTAINERS | 7 + > > drivers/edac/Kconfig | 9 + > > drivers/edac/Makefile | 1 + > > drivers/edac/zynqmp_ocm_edac.c | 485 > > +++++++++++++++++++++++++++++++++ > > 4 files changed, 502 insertions(+) > > create mode 100644 drivers/edac/zynqmp_ocm_edac.c > > > > diff --git a/MAINTAINERS b/MAINTAINERS index > > edc96cdb85e8..7a40caf536c2 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -21692,6 +21692,13 @@ F: > Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp- > dpdma.yaml > > F: drivers/dma/xilinx/xilinx_dpdma.c > > F: include/dt-bindings/dma/xlnx-zynqmp-dpdma.h > > > > +XILINX ZYNQMP OCM EDAC DRIVER > > +M: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> > > +M: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> > > +S: Maintained > > +F: Documentation/devicetree/bindings/memory- > controllers/xlnx,zynqmp-ocmc-1.0.yaml > > +F: drivers/edac/zynqmp_ocm_edac.c > > + > > XILINX ZYNQMP PSGTR PHY DRIVER > > M: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> > > M: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index > > 58ab63642e72..bc30b7d02062 100644 > > --- a/drivers/edac/Kconfig > > +++ b/drivers/edac/Kconfig > > @@ -539,4 +539,13 @@ config EDAC_DMC520 > > Support for error detection and correction on the > > SoCs with ARM DMC-520 DRAM controller. > > > > +config EDAC_ZYNQMP_OCM > > + tristate "Xilinx ZynqMP OCM Controller" > > + depends on ARCH_ZYNQMP || COMPILE_TEST > > + help > > + This driver is targeted for Xilinx ZynqMP OCM (On Chip Memory) > > "This driver supports ...." Ok, will update. > > > + controller to support for error detection and correction. > > + This driver can also be built as a module. If so, the module > > + will be called zynqmp_ocm_edac. > > + > > endif # EDAC > > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index > > 2d1641a27a28..634c1cec1588 100644 > > --- a/drivers/edac/Makefile > > +++ b/drivers/edac/Makefile > > @@ -84,3 +84,4 @@ obj-$(CONFIG_EDAC_QCOM) += > qcom_edac.o > > obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o > > obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o > > obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o > > +obj-$(CONFIG_EDAC_ZYNQMP_OCM) += > zynqmp_ocm_edac.o > > Is there going to be another ZynqMP EDAC driver? > > If not, this one could be simply called zynqmp_edac.c or? As we communicated earlier for ZynqMP platform we have both Synopsys (for DDRMC) and zynqmp_ocm_edac (for OCM) drivers. Just to be clear about what this driver is targeted for, we used ocm as part of file name. Ok, zynqmp_edac.c looks fine, will update. > > > diff --git a/drivers/edac/zynqmp_ocm_edac.c > > b/drivers/edac/zynqmp_ocm_edac.c new file mode 100644 index > > 000000000000..32f025b72380 > > --- /dev/null > > +++ b/drivers/edac/zynqmp_ocm_edac.c > > > +/* Interrupt masks */ > > +#define OCM_CEINTR_MASK BIT(6) > > +#define OCM_UEINTR_MASK BIT(7) > > +#define OCM_ECC_ENABLE_MASK BIT(0) > > +#define OCM_CEUE_MASK GENMASK(7, 6) > > Drop that one and use > > OCM_CEINTR_MASK | OCM_UEINTR_MASK > > everywhere. Ok, will update. > > > +#define OCM_FICOUNT_MASK GENMASK(23, 0) > > +#define OCM_NUM_UE_BITPOS 2 > > +#define OCM_BASEVAL 0xFFFC0000 > > +#define EDAC_DEVICE "ZynqMP-OCM" > > + > > +/** > > + * struct ecc_error_info - ECC error log information > > + * @addr: Fault generated at this address > > + * @data0: Generated fault data (lower 32-bit) > > + * @data1: Generated fault data (upper 32-bit) > > + */ > > +struct ecc_error_info { > > + u32 addr; > > + u32 data0; > > + u32 data1; > > What's wrong with calling those fault_lo and fault_hi then? Ok, will update. > > > +/** > > + * get_error_info - Get the current ECC error info > > + * @base: Pointer to the base address of the OCM memory controller > > + * @p: Pointer to the OCM ECC status structure > > + * @mask: Status register mask value > > + * > > + * Determines there is any ECC error or not > > + * > > + */ > > +static void get_error_info(void __iomem *base, struct ecc_status *p, > > +int mask) { > > + if (mask & OCM_CEINTR_MASK) { > > + p->ce_cnt++; > > + p->ceinfo.data0 = readl(base + CE_FFD0_OFST); > > + p->ceinfo.data1 = readl(base + CE_FFD1_OFST); > > + p->ceinfo.addr = (OCM_BASEVAL | readl(base + > CE_FFA_OFST)); > > + writel(ECC_CTRL_CLR_CE_ERR, base + OCM_ISR_OFST); > > + } else { > > I guess you need to check OCM_UEINTR_MASK here? Since we are dealing other interrupts in intr_handler(), this can be simple else. > > > + p->ue_cnt++; > > + p->ueinfo.data0 = readl(base + UE_FFD0_OFST); > > + p->ueinfo.data1 = readl(base + UE_FFD1_OFST); > > + p->ueinfo.addr = (OCM_BASEVAL | readl(base + > UE_FFA_OFST)); > > + writel(ECC_CTRL_CLR_UE_ERR, base + OCM_ISR_OFST); > > + } > > No, I didn't mean for you to drop that block. See comment in > intr_handler() below. Ok, will handle the warning in intr_handler() if it raises for other interrupts. > > > +/** > > + * handle_error - Handle controller error types CE and UE > > + * @dci: Pointer to the EDAC device controller instance > > + * @p: Pointer to the OCM ECC status structure > > + * > > + * Handles the controller ECC correctable and uncorrectable error. > > s/controller// - we know it is the controller. You probably should go through > all text in here and tone down all the "controller" mentions. Ok, will update. > > > + */ > > +static void handle_error(struct edac_device_ctl_info *dci, struct > > +ecc_status *p) { > > + struct edac_priv *priv = dci->pvt_info; > > + struct ecc_error_info *pinf; > > + > > + if (p->ce_cnt) { > > + pinf = &p->ceinfo; > > + snprintf(priv->message, ZYNQMP_OCM_EDAC_MSG_SIZE, > > + "\nOCM ECC error type :%s\nAddr: [0x%x]\nFault > Data[0x%08x%08x]", > > + "CE", pinf->addr, pinf->data1, pinf->data0); > > + edac_device_handle_ce(dci, 0, 0, priv->message); > > + } > > + > > + if (p->ue_cnt) { > > + pinf = &p->ueinfo; > > + snprintf(priv->message, ZYNQMP_OCM_EDAC_MSG_SIZE, > > + "\nOCM ECC error type :%s\nAddr: [0x%x]\nFault > Data[0x%08x%08x]", > > + "UE", pinf->addr, pinf->data1, pinf->data0); > > + edac_device_handle_ue(dci, 0, 0, priv->message); > > + } > > + > > + memset(p, 0, sizeof(*p)); > > +} > > + > > +/** > > + * intr_handler - ISR routine > > + * @irq: irq number > > + * @dev_id: device id pointer > > + * > > + * Return: IRQ_NONE, if interrupt not set or IRQ_HANDLED otherwise > > +*/ static irqreturn_t intr_handler(int irq, void *dev_id) { > > + struct edac_device_ctl_info *dci = dev_id; > > + struct edac_priv *priv = dci->pvt_info; > > + int regval; > > + > > + regval = readl(priv->baseaddr + OCM_ISR_OFST); > > + if (!(regval & OCM_CEUE_MASK)) > > + return IRQ_NONE; > > What is that check for? > > If neither of the masks are set but the device still raises an error interrupt, > then you need to WARN_ONCE() here so that people look at this and debug it > as to why it still raised an interrupt. Ok, will update. > > > + get_error_info(priv->baseaddr, &priv->stat, regval); > > + > > + priv->ce_cnt += priv->stat.ce_cnt; > > + priv->ue_cnt += priv->stat.ue_cnt; > > + handle_error(dci, &priv->stat); > > + > > + return IRQ_HANDLED; > > +} > > + > > +/** > > + * get_eccstate - Return the controller ECC status > > + * @base: Pointer to the OCM memory controller base address > > + * > > + * Get the ECC enable/disable status for the controller > > + * > > + * Return: ECC status 0/1. > > + */ > > +static bool get_eccstate(void __iomem *base) { > > + return readl(base + ECC_CTRL_OFST) & OCM_ECC_ENABLE_MASK; } > > + > > +#ifdef CONFIG_EDAC_DEBUG > > +/** > > + * inject_fault_count - write fault injection count > > + * @priv: Pointer to the EDAC private struct > > + * > > + * Update the fault injection count register, once the counter > > +reaches > > + * zero, it injects errors > > + */ > > +static void inject_fault_count(struct edac_priv *priv) { > > + u32 ficount = priv->fault_injection_cnt; > > + > > + ficount &= OCM_FICOUNT_MASK; > > + if (ficount != priv->fault_injection_cnt) > > Do this: > > if (ficount & ~OCM_FICOUNT_MASK) { > ficount &= OCM_FICOUNT_MASK; > edac_printk(KERN_INFO, EDAC_DEVICE, "Fault injection > count value truncated to: %d\n", ficount); > } > > i.e., mask it *only* when it is larger. Ok, will update. > > > + > > + writel(ficount, priv->baseaddr + OCM_FIC_OFST); } > > + > > +/** > > + * inject_cebitpos - Set CE bit position > > + * @priv: Pointer to the EDAC private struct > > + * > > + * Set any one bit to inject CE error */ static void > > +inject_cebitpos(struct edac_priv *priv) { > > + if (priv->ce_bitpos <= UE_MAX_BITPOS_LOWER) { > > + writel(BIT(priv->ce_bitpos), priv->baseaddr + > OCM_FID0_OFST); > > + writel(0, priv->baseaddr + OCM_FID1_OFST); > > + } else { > > + writel(BIT(priv->ce_bitpos - UE_MIN_BITPOS_UPPER), > > + priv->baseaddr + OCM_FID1_OFST); > > + writel(0, priv->baseaddr + OCM_FID0_OFST); > > I had to stare at this a bit to figure out what you're doing: the injection > registers are two 32-bit quantities and depending on where you inject, you > need to select into which offset to write it. > > But looking more at this, all the proper range checks should happen in the > debugfs callbacks, i.e., inject_ce_write() in this case. > > The actual injection function should only inject - that's it. > > And come to think of it, you don't need inject_cebitpos() or inject_uebitpos(). > > Your debugfs callbacks should: > > 1. check the range, error out and print a warning if range wrong 2. inject > otherwise. > > and that's it. Ok, will re-arrange the logic. > > ... > > > +static ssize_t inject_ue_write(struct file *file, const char __user *data, > > + size_t count, loff_t *ppos) { > > + struct edac_device_ctl_info *edac_dev = file->private_data; > > + struct edac_priv *priv = edac_dev->pvt_info; > > + char buf[6]; > > + u8 pos0, pos1, len; > > + int ret; > > The EDAC tree preferred ordering of variable declarations at the beginning of > a function is reverse fir tree order:: > > struct long_struct_name *descriptive_name; > unsigned long foo, bar; > unsigned int tmp; > int ret; > > The above is faster to parse than the reverse ordering:: > > int ret; > unsigned int tmp; > unsigned long foo, bar; > struct long_struct_name *descriptive_name; > > And even more so than random ordering:: > > unsigned long foo, bar; > int ret; > struct long_struct_name *descriptive_name; > unsigned int tmp; > > You're pretty much doing it but some functions' local vars still need re- > sorting. Ok, will update. > > > + > > + if (!data) > > + return -EFAULT; > > + > > + len = min_t(size_t, count, sizeof(buf)); > > + if (copy_from_user(buf, data, len)) > > + return -EFAULT; > > + > > + for (pos0 = 0; buf[pos0] != ' ' && pos0 <= len; pos0++) > > + ; > > + > > + if (pos0 > len) > > + return -EINVAL; > > + > > + ret = kstrtou8_from_user(data, pos0, 0, &priv->ue_bitpos[0]); > > + if (ret) > > + return ret; > > + > > + for (pos1 = pos0 + 1; buf[pos1] != '\n' && pos1 <= len; pos1++) > > + ; > > + > > + if (pos1 > count) > > + return -EINVAL; > > + > > + ret = kstrtou8_from_user(&data[pos0 + 1], pos1 - pos0 - 1, 0, > > + &priv->ue_bitpos[1]); > > This looks like it is composing multiple values. I guess the two bits for an UE > since UE is a two-bit error. > > No documentation? > > IOW, you need to document how this injection works. In a comment here > somewhere, pls explain what the user is supposed to do when she wants to > inject errors. > > Example: > > Documentation/firmware-guide/acpi/apei/einj.rst > > You don't have to do a separate file and go too much into detail but at least a > simple injection recipe/example would be prudent to have. Ok, will update API documentation like below. echo <fault_injection count> > /sys/kernel/debug/edac/ff960000.memory-controller/inject_fault_count echo <bit pos0> <bit pos1> > /sys/kernel/debug/edac/ff960000.memory-controller/inject_ue_bitpos > > > +static int edac_probe(struct platform_device *pdev) { > > + struct edac_priv *priv; > > + struct edac_device_ctl_info *dci; > > + void __iomem *baseaddr; > > + struct resource *res; > > + int irq, ret; > > + > > + baseaddr = devm_platform_get_and_ioremap_resource(pdev, 0, > &res); > > + if (IS_ERR(baseaddr)) > > + return PTR_ERR(baseaddr); > > + > > + if (!get_eccstate(baseaddr)) { > > + edac_printk(KERN_INFO, EDAC_DEVICE, > > + "ECC not enabled\n"); > > No need to break that line. Ok, will update. > > > + return -ENXIO; > > + } > > + > > + dci = edac_device_alloc_ctl_info(sizeof(*priv), > ZYNQMP_OCM_EDAC_STRING, > > + 1, ZYNQMP_OCM_EDAC_STRING, 1, > 0, NULL, 0, > > + edac_device_alloc_index()); > > + if (!dci) > > + return -ENOMEM; > > + > > + priv = dci->pvt_info; > > + platform_set_drvdata(pdev, dci); > > + dci->dev = &pdev->dev; > > + priv->baseaddr = baseaddr; > > + dci->mod_name = pdev->dev.driver->name; > > + dci->ctl_name = ZYNQMP_OCM_EDAC_STRING; > > + dci->dev_name = dev_name(&pdev->dev); > > + > > + irq = platform_get_irq(pdev, 0); > > + if (irq < 0) { > > + ret = irq; > > + goto free_dev_ctl; > > + } > > + > > + ret = devm_request_irq(&pdev->dev, irq, intr_handler, 0, > > + dev_name(&pdev->dev), dci); > > + if (ret) { > > + edac_printk(KERN_ERR, EDAC_DEVICE, "Failed to request > Irq\n"); > > + goto free_dev_ctl; > > + } > > + > > + /* Enable UE, CE interrupts */ > > + writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IEN_OFST); > > + > > +#ifdef CONFIG_EDAC_DEBUG > > + setup_debugfs(dci); > > +#endif > > Do this instead: > > diff --git a/drivers/edac/zynqmp_ocm_edac.c > b/drivers/edac/zynqmp_ocm_edac.c index 32f025b72380..a2b8cf1eb986 > 100644 > --- a/drivers/edac/zynqmp_ocm_edac.c > +++ b/drivers/edac/zynqmp_ocm_edac.c > @@ -428,9 +428,8 @@ static int edac_probe(struct platform_device *pdev) > /* Enable UE, CE interrupts */ > writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IEN_OFST); > > -#ifdef CONFIG_EDAC_DEBUG > - setup_debugfs(dci); > -#endif > + if (IS_ENABLED(CONFIG_EDAC_DEBUG)) > + setup_debugfs(dci); > > ret = edac_device_add_device(dci); > if (ret) > > below too. Ok, will update. > > > + > > + ret = edac_device_add_device(dci); > > + if (ret) > > + goto free_dev_ctl; > > + > > + return 0; > > + > > +free_dev_ctl: > > + edac_device_free_ctl_info(dci); > > + > > + return ret; > > +} > > + > > +static int edac_remove(struct platform_device *pdev) { > > + struct edac_device_ctl_info *dci = platform_get_drvdata(pdev); > > + struct edac_priv *priv = dci->pvt_info; > > + > > + /* Disable UE, CE interrupts */ > > + writel(OCM_CEUE_MASK, priv->baseaddr + OCM_IDS_OFST); > > + > > +#ifdef CONFIG_EDAC_DEBUG > > + debugfs_remove_recursive(priv->debugfs_dir); > > +#endif > > + > > + edac_device_del_device(&pdev->dev); > > + edac_device_free_ctl_info(dci); > > + > > + return 0; > > +} > > + > > +static const struct of_device_id zynqmp_ocm_edac_match[] = { > > + { .compatible = "xlnx,zynqmp-ocmc-1.0"}, > > + { /* end of table */ } > > +}; > > + > > +MODULE_DEVICE_TABLE(of, zynqmp_ocm_edac_match); > > + > > +static struct platform_driver zynqmp_ocm_edac_driver = { > > + .driver = { > > + .name = "zynqmp-ocm-edac", > > + .of_match_table = zynqmp_ocm_edac_match, > > + }, > > + .probe = edac_probe, > > + .remove = edac_remove, > > +}; > > + > > +module_platform_driver(zynqmp_ocm_edac_driver); > > + > > +MODULE_AUTHOR("Advanced Micro Devices, Inc"); > > +MODULE_DESCRIPTION("Xilinx ZynqMP OCM ECC driver"); > > +MODULE_LICENSE("GPL"); > > -- > > 2.25.1 > > > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette [-- Attachment #2: winmail.dat --] [-- Type: application/ms-tnef, Size: 22587 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-12-05 10:20 ` Potthuri, Sai Krishna @ 2022-12-05 13:16 ` Borislav Petkov 2022-12-05 14:49 ` Michal Simek 0 siblings, 1 reply; 14+ messages in thread From: Borislav Petkov @ 2022-12-05 13:16 UTC (permalink / raw) To: Potthuri, Sai Krishna Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git (AMD-Xilinx), Datta, Shubhrajyoti, kernel test robot On Mon, Dec 05, 2022 at 10:20:11AM +0000, Potthuri, Sai Krishna wrote: > As we communicated earlier for ZynqMP platform we have both Synopsys > (for DDRMC) and zynqmp_ocm_edac (for OCM) drivers. Just to be clear > about what this driver is targeted for, we used ocm as part of file > name. Ok, zynqmp_edac.c looks fine, will update. Yeah, we can always rename later, when another driver is needed. For now, let's keep things simple. > Ok, will update API documentation like below. > echo <fault_injection count> > /sys/kernel/debug/edac/ff960000.memory-controller/inject_fault_count ^^^^^^^^^^^^^^^^^^^^^^^^^^ Any particular reason this should not be called simply "mc0" or so? At least this is how we call them on x86... > echo <bit pos0> <bit pos1> > /sys/kernel/debug/edac/ff960000.memory-controller/inject_ue_bitpos echo <bit0>,<bit1> > ... I guess. The ',' or ':' or some other separator which is not blank space would make it more obvious that the two bits belong together and you won't have to scan further for the second value but simply have a single string which you split at the separator. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-12-05 13:16 ` Borislav Petkov @ 2022-12-05 14:49 ` Michal Simek 2022-12-05 15:19 ` Borislav Petkov 0 siblings, 1 reply; 14+ messages in thread From: Michal Simek @ 2022-12-05 14:49 UTC (permalink / raw) To: Borislav Petkov, Potthuri, Sai Krishna Cc: Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git (AMD-Xilinx), Datta, Shubhrajyoti, kernel test robot On 12/5/22 14:16, Borislav Petkov wrote: > On Mon, Dec 05, 2022 at 10:20:11AM +0000, Potthuri, Sai Krishna wrote: >> As we communicated earlier for ZynqMP platform we have both Synopsys >> (for DDRMC) and zynqmp_ocm_edac (for OCM) drivers. Just to be clear >> about what this driver is targeted for, we used ocm as part of file >> name. Ok, zynqmp_edac.c looks fine, will update. > > Yeah, we can always rename later, when another driver is needed. For > now, let's keep things simple. > >> Ok, will update API documentation like below. >> echo <fault_injection count> > /sys/kernel/debug/edac/ff960000.memory-controller/inject_fault_count > ^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Any particular reason this should not be called simply "mc0" or so? > > At least this is how we call them on x86... Some history around this. Based on https://github.com/devicetree-org/devicetree-specification/releases/download/v0.3/devicetree-specification-v0.3.pdf Chapter 2.2.2 Generic Names Recommendation memory-controller name is recommended. Thanks, Michal ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support for Xilinx ZynqMP OCM 2022-12-05 14:49 ` Michal Simek @ 2022-12-05 15:19 ` Borislav Petkov 0 siblings, 0 replies; 14+ messages in thread From: Borislav Petkov @ 2022-12-05 15:19 UTC (permalink / raw) To: Michal Simek Cc: Potthuri, Sai Krishna, Rob Herring, Krzysztof Kozlowski, Michal Simek, Mauro Carvalho Chehab, Tony Luck, James Morse, Robert Richter, devicetree, linux-arm-kernel, linux-kernel, linux-edac, saikrishna12468, git (AMD-Xilinx), Datta, Shubhrajyoti, kernel test robot On Mon, Dec 05, 2022 at 03:49:33PM +0100, Michal Simek wrote: > Some history around this. Based on > https://github.com/devicetree-org/devicetree-specification/releases/download/v0.3/devicetree-specification-v0.3.pdf > Chapter 2.2.2 Generic Names Recommendation > memory-controller name is recommended. Sure, fair enough. Except that this is debugfs so not really user ABI like sysfs. Rather, I'd aim here for simplicity. But your call in the end. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-12-05 15:20 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-11-02 7:06 [PATCH v6 0/2] EDAC: Add support for Xilinx ZynqMP OCM EDAC Sai Krishna Potthuri 2022-11-02 7:06 ` [PATCH v6 1/2] dt-bindings: edac: Add bindings for Xilinx ZynqMP OCM Sai Krishna Potthuri 2022-11-10 9:10 ` (subset) " Krzysztof Kozlowski 2022-11-10 9:12 ` Krzysztof Kozlowski 2022-11-10 10:06 ` Krzysztof Kozlowski 2022-11-02 7:06 ` [PATCH v6 2/2] EDAC/zynqmp: Add EDAC support " Sai Krishna Potthuri 2022-11-08 18:39 ` Borislav Petkov 2022-11-09 11:21 ` Potthuri, Sai Krishna 2022-11-09 18:08 ` Borislav Petkov 2022-11-25 15:12 ` Borislav Petkov 2022-12-05 10:20 ` Potthuri, Sai Krishna 2022-12-05 13:16 ` Borislav Petkov 2022-12-05 14:49 ` Michal Simek 2022-12-05 15:19 ` Borislav Petkov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).