From: "William Roche" <email@example.com>
Cc: Borislav Petkov <firstname.lastname@example.org>, Tony Luck <email@example.com>,
Subject: [PATCH v1] RAS/CEC: Memory Corrected Errors consistent event filtering
Date: Fri, 26 Mar 2021 14:30:29 -0400
Message-ID: <firstname.lastname@example.org>
From: William Roche <email@example.com>
The Corrected Error events collected by cec_add_elem() have to be
consistently filtered out.
Fix the case where the value returned by find_elem(), which locates the
slot of a pfn, was mistakenly used as the return value of cec_add_elem().
With this fix, the MCE notifier chain relying on MCE_HANDLED_CEC only
reports corrected errors that have reached the action threshold.
Signed-off-by: William Roche <firstname.lastname@example.org>
Some machines report Corrected Error (CE) events without any
accompanying information about a PFN soft-offlining or an invalid pfn
(as reported by the EDAC module or the mcelog daemon).
Investigation showed that these reports reflect the first occurrence of
a CE on the system, which should have been filtered by the RAS_CEC
component.
We also noticed that if 2 PFNs are impacted by CEs, the PFN in the
non-zero slot gets its CEs reported every time instead of being
filtered out.
This problem has appeared with the introduction of commit
de0e0624d86ff9fc512dedb297f8978698abf21a where the filtering logic has
Could you please review this small suggested fix?
Thanks in advance for any feedback you may have.
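To make the two symptoms described above easier to follow, here is a minimal user-space sketch of the return-value leak. It is not the kernel code: the table layout, the linear search, the names find_slot() and add_elem(), the NOT_FOUND value and the small ACTION_THRESHOLD are all assumptions made for illustration. The only behaviour taken from the description is that the lookup result (a slot index on a hit, a negative value on a miss) was returned to the caller, where any non-zero value means "not filtered".

```c
#define MAX_ELEMS        8
#define ACTION_THRESHOLD 3     /* hypothetical, kept small for the demo */
#define NOT_FOUND        (-1)  /* stand-in for a negative errno value */

struct elem { unsigned long long pfn; int count; };
struct ce_table { struct elem e[MAX_ELEMS]; int n; };

/*
 * Hypothetical lookup: returns the slot index if pfn is present,
 * NOT_FOUND otherwise. *to is set to the slot to use in both cases.
 */
static int find_slot(const struct ce_table *t, unsigned long long pfn, int *to)
{
	int i;

	for (i = 0; i < t->n; i++) {
		if (t->e[i].pfn == pfn) {
			*to = i;
			return i;
		}
	}
	*to = t->n;
	return NOT_FOUND;
}

/*
 * Returns 0 when the error was absorbed (filtered), non-zero otherwise.
 * With fixed == 0 the lookup result leaks into the return value, which
 * is the behaviour the patch description complains about.
 */
static int add_elem(struct ce_table *t, unsigned long long pfn, int fixed)
{
	int to = 0;
	int ret = find_slot(t, pfn, &to);

	if (ret < 0) {
		if (t->n == MAX_ELEMS)
			return NOT_FOUND;  /* demo only: no eviction here */
		t->e[to].pfn = pfn;
		t->e[to].count = 0;
		t->n++;
	}

	t->e[to].count++;

	if (t->e[to].count >= ACTION_THRESHOLD) {
		/* Threshold reached: drop the element and tell the caller. */
		t->e[to] = t->e[t->n - 1];
		t->n--;
		return 1;
	}

	if (fixed)
		ret = 0;  /* action threshold not reached */

	return ret;
}
```

In this sketch, with fixed == 0, the first add_elem() call for a new pfn returns NOT_FOUND (the "first occurrence is reported" symptom), and a second call for a pfn stored in slot 1 returns 1, which a caller cannot tell apart from "threshold reached" (the "non-zero slot is reported every time" symptom). With fixed == 1, every call below the threshold returns 0.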
drivers/ras/cec.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index ddecf25..fdb9762 100644
@@ -313,7 +313,7 @@ static int cec_add_elem(u64 pfn)
struct ce_array *ca = &ce_arr;
unsigned int to = 0;
- int count, ret = 0;
+ int count, ret;
* We can be called very early on the identify_cpu() path where we are
@@ -372,6 +372,9 @@ static int cec_add_elem(u64 pfn)
+ /* action threshold not reached */
+ ret = 0;
if (ca->decay_count >= CLEAN_ELEMS)
Thread overview: 9+ messages
2021-03-26 18:30 William Roche [this message]
2021-03-26 19:02 ` [PATCH v1] RAS/CEC: Memory Corrected Errors consistent event filtering Borislav Petkov
2021-03-26 22:24 ` William Roche
2021-03-26 22:43 ` Borislav Petkov
2021-03-29 9:44 ` William Roche
2021-04-01 16:12 ` Borislav Petkov
2021-04-02 16:00 ` William Roche
2021-04-02 17:07 ` Borislav Petkov
2021-04-06 15:28 ` [PATCH v2] " William Roche