From: James Morse <james.morse@arm.com>
To: "Hawa, Hanna" <hhhawa@amazon.com>
Cc: robh+dt@kernel.org, mark.rutland@arm.com, bp@alien8.de,
mchehab@kernel.org, davem@davemloft.net,
gregkh@linuxfoundation.org, nicolas.ferre@microchip.com,
paulmck@linux.ibm.com, dwmw@amazon.co.uk, benh@amazon.com,
ronenk@amazon.com, talel@amazon.com, jonnyc@amazon.com,
hanochu@amazon.com, linux-edac@vger.kernel.org,
devicetree@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC
Date: Wed, 19 Jun 2019 18:22:37 +0100
Message-ID: <44da6863-eb79-a61b-a4bf-9e8c6cacc2b8@arm.com>
In-Reply-To: <bbb9b41d-8ffa-d4c5-c199-2400695cce8d@amazon.com>
Hi Hawa,
On 17/06/2019 14:00, Hawa, Hanna wrote:
>> I don't think it can, on a second reading, it looks to be even more complicated than I
>> thought! That bit is described as disabling forwarding of uncorrected data, but it looks
>> like the uncorrected data never actually reaches the other end. (I'm unsure what 'flush'
>> means in this context.)
>> I was looking for reasons you could 'know' that any reported error was corrected. This was
>> just a bad suggestion!
> Is there an interrupt for uncorrectable errors?
The answer here is somewhere between 'not really' and 'maybe'.
There is a signal you may have wired up as an interrupt, but it's not usable from Linux.
A.8.2 "Asynchronous error signals" of the A57 TRM [0] has:
| nINTERRIRQ output Error indicator for an L2 RAM double-bit ECC error.
("7.6 Asynchronous errors" has more on this).
Errors cause L2ECTLR[30] to get set, and this value is output as a signal, which you may
have wired up as an interrupt.
If you did, beware it's level-sensitive, and can only be cleared by writing to L2ECTLR_EL1.
You shouldn't allow Linux to access this register as it could mess with the L2
configuration, which could also affect your EL3 and any secure-world software.
The arrival of this interrupt doesn't tell you which L2 tripped the error, and you can
only clear it if you write to L2ECTLR_EL1 on a CPU attached to the right L2. So this isn't
actually a shared (peripheral) interrupt.
This stuff is expected to be used by firmware, which can know the affinity constraints of
signals coming in as interrupts.
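To make the bit handling concrete, here is a minimal sketch of how firmware (not Linux) might
test and acknowledge the latched error. It works on a register value that has already been
read, so the actual MRS/MSR access to L2ECTLR_EL1 on a CPU attached to the affected L2 is left
out. The macro and helper names are hypothetical, and the clearing behaviour (writing 0 to
bit 30 while preserving the other bits) is an assumption that should be checked against the
TRM [0].

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 30 of L2ECTLR: set when an L2 error has been latched (per the text above). */
#define L2ECTLR_ASYNC_ERR_BIT (UINT64_C(1) << 30)

/* Returns true if the given L2ECTLR value has the error bit set. */
static inline bool l2ectlr_error_pending(uint64_t l2ectlr)
{
	return (l2ectlr & L2ECTLR_ASYNC_ERR_BIT) != 0;
}

/*
 * Returns the value to write back to acknowledge the error:
 * bit 30 cleared, every other bit left exactly as it was read.
 * Assumption: writing 0 to bit 30 is what deasserts the level-sensitive signal.
 */
static inline uint64_t l2ectlr_ack_value(uint64_t l2ectlr)
{
	return l2ectlr & ~L2ECTLR_ASYNC_ERR_BIT;
}
```

Preserving the other bits matters because the same register carries L2 configuration, which
is the reason given above for keeping it out of Linux's hands.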
> Are 'asynchronous errors' in L2 used to report UEs?
According to "7.2.4 Error correction code", single-bit errors are always corrected.
A.8.2 quoted above gives the behaviour for double-bit errors.
> If there is no interrupt, can we use the die-notifier subsystem to check whether any error
> occurred during system shutdown?
notify_die() would imply a synchronous exception that killed a thread. SErrors are a whole
lot worse. Before v8.2 these are all treated as 'uncontained': unknown memory corruption,
which in your L2 case is exactly what happened. The arch code will panic().
If your driver can print something useful to help debug the panic(), then a panic_notifier
sounds appropriate. But you can't rely on these notifiers being called, as kdump has some
hooks that affect if/when they run.
(KVM will 'contain' SErrors that come from a guest to that guest, as we know a distinct set
of memory was in use. You may see fatal error counters increasing without the system
panic()ing.)
contained/uncontained is part of the terminology from the v8.2 RAS spec [1].
Thanks,
James
[0]
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0488c/DDI0488C_cortex_a57_mpcore_r1p0_trm.pdf
[1]
https://static.docs.arm.com/ddi0587/ca/ARM_DDI_0587C_a_RAS.pdf?_ga=2.148234679.1686960568.1560964184-897392434.1556719556
Thread overview: 50+ messages
2019-05-30 10:15 [PATCH 0/2] Add support for Amazon's Annapurna Labs EDAC for L1/L2 Hanna Hawa
2019-05-30 10:15 ` [PATCH 1/2] dt-bindings: EDAC: add Amazon Annapurna Labs EDAC binding Hanna Hawa
2019-05-30 11:54 ` Greg KH
2019-05-31 0:35 ` Borislav Petkov
2019-05-30 10:15 ` [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC Hanna Hawa
2019-05-30 11:57 ` Greg KH
2019-05-30 12:52 ` hhhawa
2019-05-30 13:04 ` Joe Perches
2019-05-30 18:19 ` Boris Petkov
2019-05-31 1:15 ` Herrenschmidt, Benjamin
2019-05-31 5:14 ` Borislav Petkov
2019-06-05 15:13 ` James Morse
2019-06-06 7:53 ` Hawa, Hanna
2019-06-06 10:03 ` Borislav Petkov
2019-06-06 10:33 ` James Morse
2019-06-06 11:22 ` Borislav Petkov
2019-06-06 11:37 ` Shenhar, Talel
2019-06-07 15:11 ` James Morse
2019-06-08 0:22 ` Benjamin Herrenschmidt
2019-06-08 0:16 ` Benjamin Herrenschmidt
2019-06-08 9:05 ` Borislav Petkov
2019-06-11 5:50 ` Benjamin Herrenschmidt
2019-06-11 7:21 ` Benjamin Herrenschmidt
2019-06-11 11:56 ` Borislav Petkov
2019-06-11 22:25 ` Benjamin Herrenschmidt
2019-06-12 3:48 ` Borislav Petkov
2019-06-12 8:29 ` Benjamin Herrenschmidt
2019-06-12 10:42 ` Borislav Petkov
2019-06-12 23:54 ` Benjamin Herrenschmidt
2019-06-13 7:44 ` Borislav Petkov
2019-06-14 10:53 ` Borislav Petkov
2019-06-12 10:42 ` Mauro Carvalho Chehab
2019-06-12 11:00 ` Borislav Petkov
2019-06-12 11:42 ` Mauro Carvalho Chehab
2019-06-12 11:57 ` Benjamin Herrenschmidt
2019-06-12 12:25 ` Borislav Petkov
2019-06-12 12:35 ` Hawa, Hanna
2019-06-12 15:34 ` Borislav Petkov
2019-06-12 23:57 ` Benjamin Herrenschmidt
2019-06-12 23:56 ` Benjamin Herrenschmidt
2019-06-11 7:29 ` Hawa, Hanna
2019-06-11 11:59 ` Borislav Petkov
2019-06-11 11:47 ` Borislav Petkov
2019-06-03 6:56 ` Hawa, Hanna
2019-06-05 15:16 ` James Morse
2019-06-11 19:56 ` Hawa, Hanna
2019-06-13 17:05 ` James Morse
2019-06-14 10:49 ` James Morse
2019-06-17 13:00 ` Hawa, Hanna
2019-06-19 17:22 ` James Morse [this message]