linux-edac.vger.kernel.org archive mirror
From: "William Roche" <william.roche@oracle.com>
To: linux-kernel@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>, Tony Luck <tony.luck@intel.com>,
	linux-edac@vger.kernel.org, william.roche@oracle.com
Subject: [PATCH v1] RAS/CEC: Memory Corrected Errors consistent event filtering
Date: Fri, 26 Mar 2021 14:30:29 -0400
Message-ID: <1616783429-6793-1-git-send-email-william.roche@oracle.com>

From: William Roche <william.roche@oracle.com>

Corrected Error events collected by cec_add_elem() have to be filtered
out consistently.
Fix the case where the value returned by find_elem(), which is called to
find the slot of a pfn, was mistakenly used as the return value of
cec_add_elem().
With this fix, the MCE notifier chain relying on MCE_HANDLED_CEC only
reports corrected errors that have reached the action threshold.

Signed-off-by: William Roche <william.roche@oracle.com>
---

Notes:
    Some machines report Corrected Error events without any accompanying
    information about a PFN soft-offlining or an invalid pfn (the reports
    come from the EDAC module or the mcelog daemon).
    
    Investigation showed that these reports correspond to the first
    occurrence of a CE on the system, which should have been filtered out
    by the RAS_CEC component.
    We also noticed that if 2 PFNs are impacted by CEs, the PFN stored in
    the non-zero slot gets its CEs reported every time instead of having
    them filtered out.
    
    This problem appeared with commit
    de0e0624d86ff9fc512dedb297f8978698abf21a, which modified the
    filtering logic.
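    
    To illustrate the symptom, here is a minimal user-space sketch. It is
    not the kernel code: model_array, model_find(), model_add(), the
    reset_ret switch and the small ACTION_THRESHOLD are all hypothetical
    names and values chosen for the example. It assumes the lookup helper
    yields the slot index on a hit and a negative value on a miss, and it
    only resets the counter where the real code would act on the page.

```c
#include <assert.h>
#include <string.h>

#define MAX_ELEMS        8
#define ACTION_THRESHOLD 3	/* hypothetical, kept small for the demo */

/* Hypothetical stand-in for the sorted CE array. */
struct model_array {
	unsigned long pfn[MAX_ELEMS];
	int count[MAX_ELEMS];
	int n;
};

/*
 * Assumed lookup contract: return the slot index when pfn is present,
 * a negative value otherwise. *to receives the slot or insertion point.
 */
static int model_find(const struct model_array *ca, unsigned long pfn, int *to)
{
	int i;

	for (i = 0; i < ca->n; i++) {
		if (ca->pfn[i] == pfn) {
			*to = i;
			return i;
		}
		if (ca->pfn[i] > pfn)
			break;
	}
	*to = i;
	return -1;
}

/*
 * Return 0 when the error is filtered, 1 when the threshold is reached.
 * reset_ret == 0 models the reported problem (lookup result leaks out),
 * reset_ret == 1 models the suggested fix.
 */
static int model_add(struct model_array *ca, unsigned long pfn, int reset_ret)
{
	int to = 0, ret;

	ret = model_find(ca, pfn, &to);
	if (ret < 0) {
		/* Sketch only: no eviction when the array is full. */
		assert(ca->n < MAX_ELEMS);
		memmove(&ca->pfn[to + 1], &ca->pfn[to],
			(ca->n - to) * sizeof(ca->pfn[0]));
		memmove(&ca->count[to + 1], &ca->count[to],
			(ca->n - to) * sizeof(ca->count[0]));
		ca->pfn[to] = pfn;
		ca->count[to] = 0;
		ca->n++;
	}

	ca->count[to]++;
	if (ca->count[to] >= ACTION_THRESHOLD) {
		ca->count[to] = 0;	/* simplified: just restart counting */
		return 1;
	}

	if (reset_ret)
		ret = 0;	/* action threshold not reached */

	return ret;
}
```

    With reset_ret == 0, the first call for a new pfn returns a negative
    value, and a later call for a pfn sitting in slot 1 returns 1 before
    its count reaches the threshold, so a caller testing for a zero return
    treats both as unfiltered. With reset_ret == 1, both calls return 0.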
    
    Could you please review this small suggested fix?
    
    Thanks in advance for any feedback.
    William.

 drivers/ras/cec.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index ddecf25..fdb9762 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -313,7 +313,7 @@ static int cec_add_elem(u64 pfn)
 {
 	struct ce_array *ca = &ce_arr;
 	unsigned int to = 0;
-	int count, ret = 0;
+	int count, ret;
 
 	/*
 	 * We can be called very early on the identify_cpu() path where we are
@@ -372,6 +372,9 @@ static int cec_add_elem(u64 pfn)
 		goto unlock;
 	}
 
+	/* action threshold not reached */
+	ret = 0;
+
 	ca->decay_count++;
 
 	if (ca->decay_count >= CLEAN_ELEMS)
-- 
1.8.3.1

