linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Michal Simek <michal.simek@xilinx.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Tony Luck <tony.luck@intel.com>,
	James Morse <james.morse@arm.com>,
	Robert Richter <rric@kernel.org>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Serge Semin <fancer.lancer@gmail.com>,
	Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>,
	Michail Ivanov <Michail.Ivanov@baikalelectronics.ru>,
	Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>,
	Punnaiah Choudary Kalluri  <punnaiah.choudary.kalluri@xilinx.com>,
	Manish Narani <manish.narani@xilinx.com>,
	Rob Herring <robh+dt@kernel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>,
	Rob Herring <robh@kernel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>,
	devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	Sherry Sun <sherry.sun@nxp.com>,
	Shubhrajyoti Datta <Shubhrajyoti.datta@xilinx.com>
Subject: Re: [PATCH RESEND v3 01/17] EDAC/synopsys: Fix native uMCTL2 IRQs handling procedure
Date: Mon, 31 Oct 2022 18:24:52 +0100	[thread overview]
Message-ID: <Y2AE5A74TIZwJ/7Q@zn.tnic> (raw)
In-Reply-To: <20220929232712.12202-2-Sergey.Semin@baikalelectronics.ru>

On Fri, Sep 30, 2022 at 02:26:56AM +0300, Serge Semin wrote:
> The generic DW uMCTL2 DDRC v3.x support was added in commit f7824ded4149
> ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR"). It
> hasn't been done quiet well there with respect to the IRQs handling
> procedure. An attempt to fix that was introduced in the recent commit
> 4bcffe941758 ("EDAC/synopsys: Re-enable the error interrupts on v3 hw").
> Alas again it didn't provide quite complete solution.

Because?

Btw, for the future, you should make sure you add those commit authors
to Cc so that they can get a chance to comment. Adding the folks from
that commit to Cc.

> First of all the commit f7824ded4149 ("EDAC/synopsys: Add support for
> version 3 of the Synopsys EDAC DDR") log says that v3.80a "has UE/CE auto
> cleared". They aren't in none of the IP-core versions.
	    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

What does that sentence mean exactly? The UE/CE auto clearing
functionality is not in that silicon?

> The IRQ status can be cleared by means of setting the ECCCLR/ECCCTL
> register self-cleared flags 0-3.

I'm guessing that's this reg:

/* ECC control register */
#define ECC_CTRL_OFST                   0xC4

?

> The pending IRQ clearance is done in the respective get_error_info()
> method of the driver. Thus defining a quirk flag with the
> "DDR_ECC_INTR_SELF_CLEAR" name was at least very inaccurate if not to
> say misleading.
>
> So was adding the comments about the "ce/ue bits automatically
> cleared".

Aah, you mean that the ->get_error_info() functions are doing the
clearing even if something should be doing self clearing. And that
DDR_ECC_INTR_SELF_CLEAR thing is queried when enabling the error
interrupt which is just bad naming because it looks like that quirk
controls what register to write/read.

> Second, disabling the being handled IRQ in the handler doesn't make sense
> in Linux since the IC line is masked during that procedure anyway. So
> disabling the IRQ in one part of the handler and enabling it at the end of
> the method is simply redundant. (See, the ZynqMP-specific code with the
> QoS IRQ CSR didn't do that originally.)

So what is this commit message of

  4bcffe941758 ("EDAC/synopsys: Re-enable the error interrupts on v3 hw")

then talking about:

"Then the interrupt handler will be called only once."

How is that interrupt supposed to be reenabled?

> Finally calling the zynqmp_get_error_info() method concurrently with the
> enable_irq()/disable_irq() functions causes the IRQs mask state race
> condition. Starting from DW uMCTL2 DDRC IP-core v3.10a [1] the ECCCLR
> register has been renamed to ECCCTL and has been equipped with CE/UE IRQs
> enable/disable flags [2].

Aha, that sounds like the right thing to toggle.

> So the CSR now serves for the IRQ status and control functions used
> concurrently during the IRQ handling and the IRQ disabling/enabling.
> Thus the corresponding critical section must be protected with the
> IRQ-safe spin-lock.

> So let's fix all the problems noted above. First the
> DDR_ECC_INTR_SELF_CLEAR flag is renamed to SYNPS_ZYNQMP_IRQ_REGS.

"Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy to
do frotz", as if you are giving orders to the codebase to change its
behaviour."

In this case, pls formulate it something like this:

"So fix all these problems noted above: rename DDR_ECC_INTR_SELF_CLEAR
to SYNPS_ZYNQMP_IRQ_REGS to denote that, ..."

And so on.

> Its semantic is now the opposite: the quirk means having the ZynqMP
> IRQ CSRs available on the platform.

Yes, that makes more sense.

> Second the DDR_UE_MASK and DDR_CE_MASK macros
> are renamed to imply being used in the framework of the ECCCLR/ECCCTL CSRs
> accesses. Third all the misleading comments are removed. Finally the
> ECC_CLR_OFST register IOs are now protected with the IRQ-safe spin-lock
> taken in order to prevent the IRQ status clearance and IRQ enable/disable
> race condition.
> 
> [1] DesignWare Cores Enhanced Universal DDR Memory and Protocol
> Controllers (uMCTL2/uPCTL2), Release Notes, Version 3.91a, October 2020,
> p. 27.
> [2] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2),
> Databook Version 3.91a, October 2020, p.818-819.

If those are not publicly accessible, then there's no point to put them
in here.

> Fixes: f7824ded4149 ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR")
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>

Does this need to go to stable@ and thus older kernels?

> ---
>  drivers/edac/synopsys_edac.c | 76 +++++++++++++++++++++++-------------
>  1 file changed, 48 insertions(+), 28 deletions(-)

> @@ -300,6 +299,7 @@ struct synps_ecc_status {
>  /**
>   * struct synps_edac_priv - DDR memory controller private instance data.
>   * @baseaddr:		Base address of the DDR controller.
> + * @lock:		Concurrent CSRs access lock.
>   * @message:		Buffer for framing the event specific info.
>   * @stat:		ECC status information.
>   * @p_data:		Platform data.
> @@ -314,6 +314,7 @@ struct synps_ecc_status {
>   */
>  struct synps_edac_priv {
>  	void __iomem *baseaddr;
> +	spinlock_t lock;

That lock needs to be named properly and have a comment above it what it
protects.

>  	char message[SYNPS_EDAC_MSG_SIZE];
>  	struct synps_ecc_status stat;
>  	const struct synps_platform_data *p_data;

...

> @@ -516,24 +523,42 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
>  
>  static void enable_intr(struct synps_edac_priv *priv)
>  {
> +	unsigned long flags;
> +
>  	/* Enable UE/CE Interrupts */
> -	if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
> -		writel(DDR_UE_MASK | DDR_CE_MASK,
> -		       priv->baseaddr + ECC_CLR_OFST);
> -	else
> +	if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
>  		writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
>  		       priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
>  
> +		return;
> +	}
> +
> +	/* IRQs Enable/Disable feature has been available since v3.10a */

How does this comment help here?

If it is available since a version number, why doesn't the below check a
version number? IOW, what is the relevance of that comment here?

In any case, I need to hear from this driver's maintainer too.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2022-10-31 17:25 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-29 23:26 [PATCH RESEND v3 00/17] EDAC/mc/synopsys: Various fixes and cleanups Serge Semin
2022-09-29 23:26 ` [PATCH RESEND v3 01/17] EDAC/synopsys: Fix native uMCTL2 IRQs handling procedure Serge Semin
2022-10-31 17:24   ` Borislav Petkov [this message]
2023-09-01 11:24     ` Serge Semin
2022-09-29 23:26 ` [PATCH RESEND v3 02/17] EDAC/synopsys: Fix generic device type detection procedure Serge Semin
2022-09-29 23:26 ` [PATCH RESEND v3 03/17] EDAC/synopsys: Fix mci->scrub_cap field setting Serge Semin
2022-09-29 23:26 ` [PATCH RESEND v3 04/17] EDAC/synopsys: Drop erroneous ADDRMAP4.addrmap_col_b10 parse Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 05/17] EDAC/synopsys: Fix reading errors count before ECC status Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 06/17] EDAC/synopsys: Use platform device devm ioremap method Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 07/17] EDAC/synopsys: Drop internal CE and UE counters Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 08/17] EDAC/synopsys: Drop local to_mci macro implementation Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 09/17] EDAC/synopsys: Drop struct ecc_error_info.blknr field Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 10/17] EDAC/synopsys: Shorten out struct ecc_error_info.bankgrpnr field name Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 11/17] EDAC/synopsys: Drop redundant info from error message Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 12/17] EDAC/mc: Init DIMM labels in MC registration method Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 13/17] EDAC/mc: Add MC unique index allocation procedure Serge Semin
2022-10-12 17:29   ` Borislav Petkov
2022-10-12 20:01     ` Serge Semin
2022-10-12 20:33       ` Borislav Petkov
2022-10-12 20:44         ` Tony Luck
2022-10-12 21:31           ` Serge Semin
2022-10-12 22:30         ` Serge Semin
2022-10-13  9:38           ` Borislav Petkov
2022-09-29 23:27 ` [PATCH RESEND v3 14/17] EDAC/synopsys: Detach Zynq DDRC controller support Serge Semin
2022-09-30 14:42   ` Borislav Petkov
2022-10-06 12:17     ` Serge Semin
2022-10-06 13:25       ` Borislav Petkov
2022-10-08  0:42         ` Serge Semin
2022-10-12 17:28           ` Borislav Petkov
2022-10-12 19:27             ` Serge Semin
2022-10-12 19:44               ` Borislav Petkov
2022-10-12 20:55                 ` Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 15/17] EDAC/synopsys: Drop unused platform-specific setup API Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 16/17] EDAC/synopsys: Unify the driver entities naming Serge Semin
2022-09-29 23:27 ` [PATCH RESEND v3 17/17] EDAC/synopsys: Convert to using BIT/GENMASK/FIELD_x macros Serge Semin
2022-09-30 14:29 ` [PATCH RESEND v3 00/17] EDAC/mc/synopsys: Various fixes and cleanups Borislav Petkov
2022-10-06  7:13   ` Michal Simek
2023-08-18 23:06 ` Serge Semin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2AE5A74TIZwJ/7Q@zn.tnic \
    --to=bp@alien8.de \
    --cc=Alexey.Malahov@baikalelectronics.ru \
    --cc=Michail.Ivanov@baikalelectronics.ru \
    --cc=Pavel.Parkhomenko@baikalelectronics.ru \
    --cc=Sergey.Semin@baikalelectronics.ru \
    --cc=Shubhrajyoti.datta@xilinx.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dinguyen@kernel.org \
    --cc=fancer.lancer@gmail.com \
    --cc=james.morse@arm.com \
    --cc=krzysztof.kozlowski+dt@linaro.org \
    --cc=krzysztof.kozlowski@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manish.narani@xilinx.com \
    --cc=mchehab@kernel.org \
    --cc=michal.simek@xilinx.com \
    --cc=punnaiah.choudary.kalluri@xilinx.com \
    --cc=robh+dt@kernel.org \
    --cc=robh@kernel.org \
    --cc=rric@kernel.org \
    --cc=sherry.sun@nxp.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).