linux-edac.vger.kernel.org archive mirror
From: Borislav Petkov <bp@alien8.de>
To: "William Roche" <william.roche@oracle.com>
Cc: linux-kernel@vger.kernel.org, Tony Luck <tony.luck@intel.com>,
	linux-edac@vger.kernel.org
Subject: Re: [PATCH v1] RAS/CEC: Memory Corrected Errors consistent event filtering
Date: Fri, 26 Mar 2021 20:02:42 +0100
Message-ID: <20210326190242.GI25229@zn.tnic>
In-Reply-To: <1616783429-6793-1-git-send-email-william.roche@oracle.com>

On Fri, Mar 26, 2021 at 02:30:29PM -0400, William Roche wrote:
> From: William Roche <william.roche@oracle.com>
> 
> The Corrected Error events collected by cec_add_elem() have to be
> filtered out consistently.
> Fix the case where the return value of find_elem(), which locates the
> slot of a pfn, was mistakenly used as the return value of cec_add_elem().
> Now the MCE notifier chain relying on MCE_HANDLED_CEC only reports
> filtered corrected errors that have reached the action threshold.
> 
> Signed-off-by: William Roche <william.roche@oracle.com>
> ---
> 
> Notes:
>     Some machines are reporting Corrected Errors events without any
>     information about a PFN Soft-offlining or Invalid pfn (report given by
>     the EDAC module or the mcelog daemon).
>     
>     Investigation showed that this reflects the first occurrence of a CE
>     on the system, which should have been filtered by the RAS_CEC component.
>     We also noticed that if 2 PFNs are impacted by CEs, the PFN
>     on the non-zero slot gets its CEs reported every time instead of
>     being filtered out.
>     
>     This problem has appeared with the introduction of commit
>     de0e0624d86ff9fc512dedb297f8978698abf21a where the filtering logic has
>     been modified.
>     
>     Could you please review this small suggested fix?
>     
>     Thanks in advance for any feedback you could have.
>     William.
> 
>  drivers/ras/cec.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

AFAIU, you want something like the untested hunk below.

It sets ret to 0 when find_elem() cannot find the element and a new one
gets added. The "ret = 1" we can remove because callers don't care about
the offlining threshold: the only caller that looks at the retval wants
to know whether the VA was added successfully, so that it can note that
it handled the error.

Makes sense?

---
diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index ddecf25b5dd4..a29994d726d8 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -341,6 +341,8 @@ static int cec_add_elem(u64 pfn)
 
 		ca->array[to] = pfn << PAGE_SHIFT;
 		ca->n++;
+
+		ret = 0;
 	}
 
 	/* Add/refresh element generation and increment count */
@@ -363,12 +365,6 @@ static int cec_add_elem(u64 pfn)
 
 		del_elem(ca, to);
 
-		/*
-		 * Return a >0 value to callers, to denote that we've reached
-		 * the offlining threshold.
-		 */
-		ret = 1;
-
 		goto unlock;
 	}
---
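
For illustration only, the return-value problem described in the quoted
notes can be reproduced in a small userspace model. This is a sketch and
not the code in drivers/ras/cec.c: all toy_* names are hypothetical, the
lookup is assumed to return the slot index on a hit and a negative value
on a miss, and 0 is assumed to mean "recorded, nothing to report".
Decay, counting and LRU eviction are left out.

```c
#include <string.h>

#define TOY_MAX_ELEMS 8

/* Toy stand-in for the CEC array: a sorted list of pfns. */
struct toy_array {
	unsigned long long pfn[TOY_MAX_ELEMS];
	unsigned int n;
};

/*
 * Hypothetical lookup: returns the slot index when pfn is present,
 * -1 when it is absent. *to receives the slot to use in both cases.
 */
static int toy_find(const struct toy_array *ca, unsigned long long pfn,
		    unsigned int *to)
{
	unsigned int i;

	for (i = 0; i < ca->n; i++) {
		if (ca->pfn[i] == pfn) {
			*to = i;
			return (int)i;
		}
		if (ca->pfn[i] > pfn)
			break;
	}
	*to = i;
	return -1;
}

/* Insert pfn at slot 'to', keeping the array sorted. Caller checks space. */
static void toy_insert(struct toy_array *ca, unsigned int to,
		       unsigned long long pfn)
{
	memmove(&ca->pfn[to + 1], &ca->pfn[to],
		(ca->n - to) * sizeof(ca->pfn[0]));
	ca->pfn[to] = pfn;
	ca->n++;
}

/* Reported behaviour: the lookup result leaks out as the return value. */
static int toy_add_leaky(struct toy_array *ca, unsigned long long pfn)
{
	unsigned int to;
	int ret = toy_find(ca, pfn, &to);

	if (ret < 0 && ca->n < TOY_MAX_ELEMS)
		toy_insert(ca, to, pfn);

	return ret;	/* -1 for a new pfn, slot index for a known one */
}

/* Variant keeping the lookup result apart: 0 whenever pfn is recorded. */
static int toy_add_separate(struct toy_array *ca, unsigned long long pfn)
{
	unsigned int to;
	int err = toy_find(ca, pfn, &to);

	if (err < 0) {
		if (ca->n == TOY_MAX_ELEMS)
			return -1;	/* toy model: no LRU eviction */
		toy_insert(ca, to, pfn);
	}

	return 0;
}
```

In this model, toy_add_leaky() returns non-zero for the first occurrence
of a pfn and on every hit for a pfn that sits in a non-zero slot, which
matches the two symptoms from the notes. toy_add_separate() returns 0 in
both cases.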

Thx.
 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
