linux-edac.vger.kernel.org archive mirror
From: Borislav Petkov <bp@alien8.de>
To: Jean-Frederic <jfgaudreault@gmail.com>
Cc: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Subject: Re: [GIT PULL] EDAC pile for 5.4 -> AMD family 17h, model 70h support
Date: Sat, 19 Oct 2019 10:25:54 +0200
Message-ID: <20191019082554.GB5571@zn.tnic>
In-Reply-To: <f5820b41-c97a-b6be-df97-bbff85a7e5ee@gmail.com>

On Fri, Oct 18, 2019 at 07:08:32PM -0400, Jean-Frederic wrote:
> I don't know if there has been any new information related to these last
> points, I am really looking to understand if ECC error reporting will be
> working in this new Kernel 5.4 for AMD Ryzen 3900x (or are we saying maybe
> this issue could be related to the motherboard?)

Look here on page 6:

https://www.amd.com/system/files/2017-06/AMD-EPYC-Brings-New-RAS-Capability.pdf

It hints at what PFEH (Platform First Error Handling) does. Roughly
speaking, the firmware gets to see the errors first and, because it
knows the platform much better, it can take much more adequate recovery
actions for those errors than the OS. Sometimes.

 [ I believe if the error cannot be handled by the firmware, it gets
   reported to the OS but I'll let Yazen comment on that. ]

In any case, you have RAS protection on your platform - it is just done
by the firmware and not by EDAC. And that is perfectly fine - EDAC is
used when there's no firmware support.

I know, I know, we don't trust the firmware to do it right and so on,
but it is what it is. As with other things, we have to rely on the
firmware to get it right.

> In any case, I think EDAC needs to be able to tell us (like at boot time)
> if the ECC error reporting is working on the system or not, because right
> now (in 5.4) everything appears to load successfully (according to dmesg)
> with all the memory information identified, and the edac-util tool appears
> to be working (and returning zeros).

EDAC loads fine but there are simply no errors to report.
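
 [ For reference, a minimal sketch of how to look at the same counters
   that edac-util reports, assuming the usual EDAC sysfs layout under
   /sys/devices/system/edac/mc/ with per-controller ce_count and ue_count
   files. The function name read_edac_counts is made up for illustration.
   Zeros here only mean that nothing has been reported to EDAC. ]

```python
from pathlib import Path

# Assumed location of the EDAC memory-controller sysfs tree.
EDAC_MC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_MC_ROOT):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    Hypothetical helper: reads the corrected (ce_count) and uncorrected
    (ue_count) error counters of every mcN directory below `root`.
    A counter that is missing or unreadable is reported as None.
    """
    counts = {}
    root = Path(root)
    if not root.is_dir():
        # EDAC driver not loaded, or sysfs not mounted at the assumed path.
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

 [ An empty result means no memory controller was registered with EDAC at
   the assumed path. All-zero counters cannot tell you by themselves
   whether no errors happened or whether they never reached the OS. ]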

> Also, since this was working on the previous generation as mentioned before

See above.

> (i.e. AMD RYZEN 2700X and ASUS PRIME 470 to be more specific), I thought
> it would be natural that it works on the newer gen, given the
> information/hype provided around launch time. Asus also confirmed to me
> through their support that this new motherboard supports ECC. It also has
> an ECC option in the BIOS, as I've mentioned, to enable or disable ECC.

Again, you have RAS protection if your DIMMs are ECC ones. It is just
not done by the kernel but by the firmware. And that can be a better way
to do it *if* the firmware is doing its job right.

Makes more sense now?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Thread overview: 21+ messages
     [not found] <CAEVokG7TeAbmkhaxiTpsxhv1pQzqRpU=mR8gVjixb5kXo3s2Eg@mail.gmail.com>
     [not found] ` <20190924092644.GC19317@zn.tnic>
2019-10-05 16:52   ` [GIT PULL] EDAC pile for 5.4 -> AMD family 17h, model 70h support Jeff God
2019-10-07  7:16     ` Borislav Petkov
2019-10-07 12:58       ` Jeff God
2019-10-08 11:50         ` Borislav Petkov
2019-10-08 19:42           ` Ghannam, Yazen
2019-10-08 23:08             ` Jeff God
2019-10-09 10:30               ` Borislav Petkov
2019-10-09 20:31                 ` Ghannam, Yazen
2019-10-09 23:54                   ` Jeff God
2019-10-10  9:56                     ` Borislav Petkov
2019-10-10 12:48                       ` Jean-Frederic
2019-10-10 13:41                         ` Borislav Petkov
2019-10-10 19:00                           ` Ghannam, Yazen
2019-10-11  1:04                             ` Jean-Frederic
2019-10-18 23:08                               ` Jean-Frederic
2019-10-19  8:25                                 ` Borislav Petkov [this message]
2019-10-19 16:12                                   ` Jean-Frederic
2019-10-21 14:24                                     ` Ghannam, Yazen
2020-01-04 20:03                                     ` Jean-Frederic
2020-01-04 21:47                                       ` Jean-Frederic
2019-10-10  9:54                   ` Borislav Petkov
