linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Serge Semin <fancer.lancer@gmail.com>
Cc: Michal Simek <michal.simek@amd.com>,
	Alexander Stein <alexander.stein@ew.tq-group.com>,
	Tony Luck <tony.luck@intel.com>,
	James Morse <james.morse@arm.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Robert Richter <rric@kernel.org>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Punnaiah Choudary Kalluri <punnaiah.choudary.kalluri@xilinx.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-arm-kernel@lists.infradead.org, linux-edac@vger.kernel.org,
	linux-kernel@vger.kernel.org, Sherry Sun <sherry.sun@nxp.com>,
	Borislav Petkov <bp@suse.de>
Subject: Re: [PATCH v5 01/20] EDAC/synopsys: Fix ECC status data and IRQ disable race condition
Date: Sun, 21 Apr 2024 12:07:30 +0200	[thread overview]
Message-ID: <20240421100712.GAZiTlUOm1hrLQvaMi@fat_crate.local> (raw)
In-Reply-To: <szcie4giwjykne4su6uu5wsmtsl3e3jd53rjfiwir6hm3ju7as@6eqh2xmj35ie>

On Tue, Apr 16, 2024 at 01:06:11PM +0300, Serge Semin wrote:
> It looks indeed crazy because the method is called enable_intr() and
> is called in the IRQ handler. Right, re-enabling the IRQ in the handler
> doesn't look good. But under the hood it was just a way to fix the
> problem described in the commit you cited. enable_intr() just gets
> back the IRQ Enable flags cleared a bit before in the
> zynqmp_get_error_info() method.
> 
> The root cause of the problem is that the IRQ status/clear flags:
> ECCCLR.ecc_corrected_err_clr	(R/W1C)
> ECCCLR.ecc_uncorrected_err_clr	(R/W1C)
> ECCCLR.ecc_corr_err_cnt_clr	(R/W1C)
> ECCCLR.ecc_uncorr_err_cnt_clr	(R/W1C)
> etc
> 
> and the IRQ enable/disable flags (since v3.10a):
> ECCCLR.ecc_corrected_err_intr_en	(R/W)
> ECCCLR.ecc_uncorrected_err_intr_en	(R/W)
> 
> reside in a single register - ECCCLR (Synopsys has renamed it to
> ECCCTL since v3.10a due to adding the IRQ En/Dis flags).
> 
> Thus any concurrent access to that CSR like "Clear IRQ
> status/counters" and "IRQ disable/enable" need to be protected from
> the race condition.

Ok, let's pick this apart one-by-one. I'll return to the rest you're
explaining as needed.

So, can writes to the status/counter bits while writing the *same* bit
to the IRQ enable/disable bit prevent any race conditions?

Meaning, you only change the status and counter bits and you preserve
the same value in the IRQ disable/enable bit?

IOW, I'm thinking of shadowing that ECCCTL in software so that we update
it from the shadowed value.

Because, AFAIU, the spinlock won't help if you grab it, clear the
status/counter bits and disable the interrupt in the process. You want
to only clear the status/counter bits and leave the interrupt enabled.

Right?

IOW, in one single write you do:

ECCCLR.ecc_corrected_err_clr=1
ECCCLR.ecc_uncorrected_err_clr=1
ECCCLR.ecc_corr_err_cnt_clr=1
ECCCLR.ecc_uncorr_err_cnt_clr=1
ECCCLR.ecc_corrected_err_intr_en=1
ECCCLR.ecc_uncorrected_err_intr_en=1

?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2024-04-21 10:08 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-22 18:12 [PATCH v5 00/20] EDAC/mc/synopsys: Various fixes and cleanups Serge Semin
2024-02-22 18:12 ` [PATCH v5 01/20] EDAC/synopsys: Fix ECC status data and IRQ disable race condition Serge Semin
2024-04-15 18:36   ` Borislav Petkov
2024-04-16 10:06     ` Serge Semin
2024-04-21 10:07       ` Borislav Petkov [this message]
2024-04-25 12:52         ` Serge Semin
2024-05-06 10:20           ` Borislav Petkov
2024-05-06 11:27             ` Serge Semin
2024-05-06 12:12               ` Borislav Petkov
2024-02-22 18:12 ` [PATCH v5 02/20] EDAC/synopsys: Fix generic device type detection procedure Serge Semin
2024-02-22 18:12 ` [PATCH v5 03/20] EDAC/synopsys: Fix mci->scrub_cap field setting Serge Semin
2024-02-22 18:12 ` [PATCH v5 04/20] EDAC/synopsys: Drop erroneous ADDRMAP4.addrmap_col_b10 parse Serge Semin
2024-02-22 18:12 ` [PATCH v5 05/20] EDAC/synopsys: Fix reading errors count before ECC status Serge Semin
2024-02-22 18:12 ` [PATCH v5 06/20] EDAC/synopsys: Fix misleading IRQ self-cleared quirk flag Serge Semin
2024-02-22 18:12 ` [PATCH v5 07/20] EDAC/synopsys: Use platform device devm ioremap method Serge Semin
2024-02-22 18:12 ` [PATCH v5 08/20] EDAC/synopsys: Drop internal CE and UE counters Serge Semin
2024-02-22 18:12 ` [PATCH v5 09/20] EDAC/synopsys: Drop local to_mci() macro definition Serge Semin
2024-02-22 18:12 ` [PATCH v5 10/20] EDAC/synopsys: Drop struct ecc_error_info.blknr field Serge Semin
2024-02-22 18:12 ` [PATCH v5 11/20] EDAC/synopsys: Shorten out struct ecc_error_info.bankgrpnr field name Serge Semin
2024-02-22 18:12 ` [PATCH v5 12/20] EDAC/synopsys: Drop redundant info from the error messages Serge Semin
2024-02-22 18:12 ` [PATCH v5 13/20] EDAC/mc: Init DIMM labels in MC registration method Serge Semin
2024-02-22 18:12 ` [PATCH v5 14/20] EDAC/mc: Add generic unique MC index allocation procedure Serge Semin
2024-02-22 18:13 ` [PATCH v5 15/20] EDAC/mc: Re-use " Serge Semin
2024-02-22 18:13 ` [PATCH v5 16/20] EDAC/synopsys: Detach Zynq A05 DDRC support to separate driver Serge Semin
2024-02-22 18:13 ` [PATCH v5 17/20] EDAC/synopsys: Drop unused platform-specific setup API Serge Semin
2024-02-22 18:13 ` [PATCH v5 18/20] EDAC/synopsys: Unify CSRs macro declarations Serge Semin
2024-02-22 18:13 ` [PATCH v5 19/20] EDAC/synopsys: Unify struct/macro/function prefixes Serge Semin
2024-02-22 18:13 ` [PATCH v5 20/20] EDAC/synopsys: Convert to using BIT/GENMASK/FIELD_x macros Serge Semin
2024-03-06  5:27 ` [PATCH v5 00/20] EDAC/mc/synopsys: Various fixes and cleanups Shubhrajyoti Datta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240421100712.GAZiTlUOm1hrLQvaMi@fat_crate.local \
    --to=bp@alien8.de \
    --cc=alexander.stein@ew.tq-group.com \
    --cc=arnd@arndb.de \
    --cc=bp@suse.de \
    --cc=dinguyen@kernel.org \
    --cc=fancer.lancer@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=michal.simek@amd.com \
    --cc=punnaiah.choudary.kalluri@xilinx.com \
    --cc=rric@kernel.org \
    --cc=sherry.sun@nxp.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).