From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
To: Robin Murphy <robin.murphy@arm.com>,
	Juerg Haefliger <juerg.haefliger@canonical.com>,
	stefan.wahren@i2se.com, Florian Fainelli <f.fainelli@gmail.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Robin Murphy <robin.murphy@arm.con>
Cc: bcm-kernel-feedback-list@broadcom.com,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-pm@vger.kernel.org
Subject: Re: bcm2711_thermal: Kernel panic - not syncing: Asynchronous SError Interrupt
Date: Wed, 10 Feb 2021 17:55:36 +0100
Message-ID: <c6774af169854dc1d4efa272b439e80cea8cd8ff.camel@suse.de>
In-Reply-To: <35e17dc9-c88d-582f-607d-1d90b20868fa@arm.com>

Hi Robin,

On Wed, 2021-02-10 at 16:25 +0000, Robin Murphy wrote:
> On 2021-02-10 13:15, Nicolas Saenz Julienne wrote:
> > [ Add Robin, Catalin and Florian in case they want to chime in ]
> > 
> > Hi Juerg, thanks for the report!
> > 
> > On Wed, 2021-02-10 at 11:48 +0100, Juerg Haefliger wrote:
> > > Trying to dump the BCM2711 registers kills the kernel:
> > > 
> > > # cat /sys/kernel/debug/regmap/dummy-avs-monitor\@fd5d2000/range
> > > 0-efc
> > > # cat /sys/kernel/debug/regmap/dummy-avs-monitor\@fd5d2000/registers
> > > 
> > > [   62.857661] SError Interrupt on CPU1, code 0xbf000002 -- SError
> > 
> > So ESR's IDS (bit 24) is set, which means it's an 'Implementation Defined
> > SError,' hence IIUC the rest of the error code is meaningless to anyone outside
> > of Broadcom/RPi.
> 
> It's imp-def from the architecture's PoV, but the implementation in this 
> case is Cortex-A72, where 0x000002 means an attributable, containable 
> Slave Error:
> 
> https://developer.arm.com/documentation/100095/0003/system-control/aarch64-register-descriptions/exception-syndrome-register--el1-and-el3?lang=en
> 
> In other words, the thing at the other end of an interconnect 
> transaction said "no" :)
> 
> (The fact that Cortex-A72 gets too far ahead of itself to take it as a 
> synchronous external abort is a mild annoyance, but hey...)

Thanks for both your clarifications! Reading Arm documentation is a skill of
its own.
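
For reference, here is a minimal sketch of how the reported code splits up. It
assumes the Armv8-A ESR_ELx layout for the SError class (EC in bits [31:26],
IL in bit 25, IDS in bit 24, the rest in the low 24 bits). The function and
key names are made up for illustration; nothing here comes from the kernel
sources.

```python
def decode_esr(esr):
    """Split a 32-bit ESR_ELx value into the fields relevant to an SError."""
    return {
        "ec": (esr >> 26) & 0x3F,    # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,     # instruction length bit, bit 25
        "ids": (esr >> 24) & 0x1,    # implementation defined syndrome, bit 24
        "iss_low": esr & 0xFFFFFF,   # remaining syndrome bits [23:0]
    }


# The code from Juerg's log line.
fields = decode_esr(0xBF000002)
print({name: hex(value) for name, value in fields.items()})
# {'ec': '0x2f', 'il': '0x1', 'ids': '0x1', 'iss_low': '0x2'}
```

EC 0x2f is the SError class, IDS is set, and the remaining 0x000002 is the
part that, per Robin's pointer to the Cortex-A72 manual, has to be looked up
in the implementation's own documentation.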

> > The regmap is created through the following syscon device:
> > 
> > 	avs_monitor: avs-monitor@7d5d2000 {
> > 		compatible = "brcm,bcm2711-avs-monitor",
> > 			     "syscon", "simple-mfd";
> > 		reg = <0x7d5d2000 0xf00>;
> > 
> > 		thermal: thermal {
> > 			compatible = "brcm,bcm2711-thermal";
> > 			#thermal-sensor-cells = <0>;
> > 		};
> > 	};
> > 
> > I've done some tests with devmem, and the whole <0x7d5d2000 0xf00> range is
> > full of addresses that trigger this same error. Also note that as per Florian's
> > comments[1]: "AVS_RO_REGISTERS_0: 0x7d5d2200 - 0x7d5d22e3." But from what I can
> > tell, at least 0x7d5d22b0 seems to be faulty too.
> > 
> > Any ideas/comments? My guess is that those addresses are marked somehow as
> > secure, and only for VC4 to access (VC4 is RPi4's co-processor). Ultimately,
> > the solution is to narrow the register range exposed by avs-monitor to whatever
> > bcm2711-thermal needs (which is currently a single 32-bit register).
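
The numbers quoted above can be cross-checked with a bit of arithmetic. This
is only a sketch: the constants are copied from this thread, the 4-byte
register stride is an assumption, and the helper names are hypothetical.
Nothing here touches hardware.

```python
AVS_BASE = 0x7D5D2000   # reg base from the avs-monitor node
AVS_SIZE = 0xF00        # reg length from the avs-monitor node
REG_STRIDE = 4          # assumption: 32-bit registers at a 4-byte stride

RO_FIRST = 0x7D5D2200   # AVS_RO_REGISTERS_0 start, as quoted from Florian
RO_LAST = 0x7D5D22E3    # AVS_RO_REGISTERS_0 end (inclusive), as quoted


def last_register_offset(size, stride):
    """Offset of the last register that fits in a region of the given size."""
    return size - stride


def in_window(addr, first, last):
    """True if addr lies inside the inclusive window [first, last]."""
    return first <= addr <= last


# Matches the "0-efc" range that debugfs reports for the regmap.
print(hex(last_register_offset(AVS_SIZE, REG_STRIDE)))   # 0xefc

# 0x7d5d22b0 sits inside the documented read-only window, yet still faults.
print(in_window(0x7D5D22B0, RO_FIRST, RO_LAST))          # True
```

So the faulting address is not simply outside the documented window, which is
why narrowing the exposed range looks like the safer route.
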
> 
> When a peripheral decodes a region of address space, nobody says it has 
> to accept accesses to *every* address in that space; registers may be 
> sparsely populated, and although some devices might be "nice" and make 
> unused areas behave as RAZ/WI, others may throw slave errors if you poke 
> at the wrong places. As you note, in a TrustZone-aware device some 
> registers may only exist in one or other of the Secure/Non-Secure 
> address spaces.
> 
> Even when there is a defined register at a given address, it still 
> doesn't necessarily accept all possible types of access; it wouldn't be 
> particularly friendly, but a device *could* have, say, some registers 
> that support 32-bit accesses and others that only support 16-bit 
> accesses, and thus throw slave errors if you do the wrong thing in the 
> wrong place.
> 
> It really all depends on the device itself.
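
To make the above concrete, here is a toy model of the behaviours Robin
describes: sparsely populated registers, RAZ/WI versus slave errors on unused
offsets, and registers that only accept one access width. Everything in it
(ToyDevice, SlaveError, dump, the register contents) is hypothetical and does
not describe the real AVS monitor block.

```python
class SlaveError(Exception):
    """Stands in for the error response a device sends back on the bus."""


class ToyDevice:
    def __init__(self, size, registers, raz_wi=False):
        self.size = size
        # offset -> (value, set of access widths in bytes the register accepts)
        self.registers = registers
        # "nice" devices read unused offsets as zero instead of erroring
        self.raz_wi = raz_wi

    def read(self, offset, width=4):
        if not 0 <= offset < self.size:
            raise SlaveError(f"offset {offset:#x} is outside the decoded region")
        reg = self.registers.get(offset)
        if reg is None:
            if self.raz_wi:
                return 0
            raise SlaveError(f"no register at {offset:#x}")
        value, widths = reg
        if width not in widths:
            raise SlaveError(f"{8 * width}-bit access not supported at {offset:#x}")
        return value & ((1 << (8 * width)) - 1)


def dump(dev, stride=4):
    """Read every offset in turn, the way a blind register dump would."""
    return {off: dev.read(off, width=stride) for off in range(0, dev.size, stride)}


nice = ToyDevice(0x10, {0x0: (0x12345678, {4})}, raz_wi=True)
strict = ToyDevice(0x10, {0x0: (0x12345678, {4}), 0x8: (0xBEEF, {2})})

print(dump(nice))                 # unused offsets simply read as zero
try:
    dump(strict)                  # fails at the first unpopulated offset
except SlaveError as err:
    print("dump aborted:", err)
```

In the model the error is a catchable exception. On the real hardware it
arrives as an asynchronous SError, so a blind walk over the whole range
panics instead.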

All in all, assuming there is no special device quirk to apply, my feeling is
that we should just let the error be. As you hint, the firmware is not to
blame here, and debugfs is a 'best effort, zero guarantees' interface after all.

Regards,
Nicolas

