linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chris Packham <Chris.Packham@alliedtelesis.co.nz>
To: "Pali Rohár" <pali@kernel.org>
Cc: "jason@lakedaemon.net" <jason@lakedaemon.net>,
	Joshua Scott <Joshua.Scott@alliedtelesis.co.nz>,
	"gregory.clement@bootlin.com" <gregory.clement@bootlin.com>,
	"andrew@lunn.ch" <andrew@lunn.ch>,
	"sebastian.hesselbarth@gmail.com"
	<sebastian.hesselbarth@gmail.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] ARM: mvebu: Enable MBUS error propagation
Date: Thu, 3 Jun 2021 21:03:55 +0000	[thread overview]
Message-ID: <e75b74fa-b393-fede-1a19-fd43dbb7aba4@alliedtelesis.co.nz> (raw)
In-Reply-To: <20210603125525.nkswvixbabkgq5or@pali>


On 4/06/21 12:55 am, Pali Rohár wrote:
> On Wednesday 08 January 2020 19:42:12 Chris Packham wrote:
>> Hi Gregory,
>>
>> On Wed, 2020-01-08 at 11:22 +0100, Gregory CLEMENT wrote:
>>> Hello Chris,
>>>
>>>> U-boot disables MBUS error propagation for Armada-385. The effect of
>>>> this on the kernel is that any access to a mapped but inaccessible
>>>> address causes the system to hang.
>>>>
>>>> By enabling MBUS error propagation the kernel can raise a Bus Error and
>>>> panic to restart the system.
>>> Unless I miss it, it seems that nobody comment this patch: sorry for the
>>> delay.
>>>
>> Thanks for the response.
>>
>>>> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
>>>> ---
>>>>
>>>> Notes:
>>>>      We've encountered an issue where rogue accesses to PCI-e space cause an
>>>>      Armada-385 system to lockup. We've found that enabling MBUS error
>>>>      propagation lets us get a bus error which at least gives us a panic to
>>>>      help identify what was accessed.
>>>>      
>>>>      U-boot clears the IO Err Prop Enable Bit[1] but so far no-one seems to
>>>>      know why.
>>>>      
>>>>      I wasn't sure where to put this code. There is similar code for kirwood
>>>>      in the equivalent dt_init function. On Armada-XP the register is part of
>>>>      the Core Coherency Fabric block (for A385 it's documented as part of the
>>>>      CCF block).
>>> What about adding a new set of register to the mvebu mbus driver?
>>>
>> After more testing we found that some previously "good" boards started
>> throwing up panics with this change. I think that this might require
>> handling some of the PCI-e interrupts (for correctable errors) via the
>> EDAC subsystem.
>>
>> We're still working with Marvell to track down exactly why this is
>> happening on our system.
> Hello Chris! Have you somehow solved this issue? Or do you have some
> contacts in Marvell for A385 PCIe subsystem?

The problem seemed to be a specific CPU Erratum for the A385 which I 
brought into upstream u-boot[1]. But even then we had to update our base 
u-boot version from 2016.11 to 2019.10 so there may have been more going 
on than just that one change.

[1] - 
https://source.denx.de/u-boot/u-boot/-/commit/ad91fdfff0bd6ea471afe838e0f6d58ed898694e

>>> In this case it will be called even earlier allowing to see bus error
>>> earlier.
>>>
>>> In any case, you should separate the device tree change from the code
>>> change and at least provide 2 patches.
>> Agreed. If/when something solid eventuates we'll do it as a proper
>> series.
>>
>>> Gregory
>>>
>>>>      
>>>>      --
>>>>      [1] - https://scanmail.trustwave.com/?c=20988&d=yNG44EMQR-PfAPR99zig1LzptmqzHo11i7uAj4AGuQ&u=https%3a%2f%2fgitlab%2edenx%2ede%2fu-boot%2fu-boot%2fblob%2fmaster%2farch%2farm%2fmach-mvebu%2fcpu%2ec%23L489
^^^ sorry, as a result of a reasonably high profile cyber attack on a 
local institution URLs in our incoming mail are intercepted.
>>>>
>>>>   arch/arm/boot/dts/armada-38x.dtsi |  5 +++++
>>>>   arch/arm/mach-mvebu/board-v7.c    | 27 +++++++++++++++++++++++++++
>>>>   2 files changed, 32 insertions(+)
>>>>
>>>> diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
>>>> index 3f4bb44d85f0..3214c67433eb 100644
>>>> --- a/arch/arm/boot/dts/armada-38x.dtsi
>>>> +++ b/arch/arm/boot/dts/armada-38x.dtsi
>>>> @@ -386,6 +386,11 @@
>>>>   				      <0x20250 0x8>;
>>>>   			};
>>>>   
>>>> +			ioerrc: io-err-control@20200 {
>>>> +				compatible = "marvell,io-err-control";
>>>> +				reg = <0x20200 0x4>;
>>>> +			};
>>>> +
>>>>   			mpic: interrupt-controller@20a00 {
>>>>   				compatible = "marvell,mpic";
>>>>   				reg = <0x20a00 0x2d0>, <0x21070 0x58>;
>>>> diff --git a/arch/arm/mach-mvebu/board-v7.c b/arch/arm/mach-mvebu/board-v7.c
>>>> index d2df5ef9382b..fb7718386ef9 100644
>>>> --- a/arch/arm/mach-mvebu/board-v7.c
>>>> +++ b/arch/arm/mach-mvebu/board-v7.c
>>>> @@ -138,10 +138,36 @@ static void __init i2c_quirk(void)
>>>>   	}
>>>>   }
>>>>   
>>>> +#define MBUS_ERR_PROP_EN BIT(8)
>>>> +
>>>> +/*
>>>> + * U-boot disables MBUS error propagation. Re-enable it so we
>>>> + * can handle them as Bus Errors.
>>>> + */
>>>> +static void __init enable_mbus_error_propagation(void)
>>>> +{
>>>> +	struct device_node *np =
>>>> +		of_find_compatible_node(NULL, NULL, "marvell,io-err-control");
>>>> +
>>>> +	if (np) {
>>>> +		void __iomem *reg;
>>>> +
>>>> +		reg = of_iomap(np, 0);
>>>> +		if (reg) {
>>>> +			u32 val;
>>>> +
>>>> +			val = readl_relaxed(reg);
>>>> +			writel_relaxed(val | MBUS_ERR_PROP_EN, reg);
>>>> +		}
>>>> +		of_node_put(np);
>>>> +	}
>>>> +}
>>>> +
>>>>   static void __init mvebu_dt_init(void)
>>>>   {
>>>>   	if (of_machine_is_compatible("marvell,armadaxp"))
>>>>   		i2c_quirk();
>>>> +	enable_mbus_error_propagation();
>>>>   }
>>>>   
>>>>   static void __init armada_370_xp_dt_fixup(void)
>>>> @@ -191,6 +217,7 @@ DT_MACHINE_START(ARMADA_38X_DT, "Marvell Armada 380/385 (Device Tree)")
>>>>   	.l2c_aux_val	= 0,
>>>>   	.l2c_aux_mask	= ~0,
>>>>   	.init_irq       = mvebu_init_irq,
>>>> +	.init_machine	= mvebu_dt_init,
>>>>   	.restart	= mvebu_restart,
>>>>   	.dt_compat	= armada_38x_dt_compat,
>>>>   MACHINE_END
>>>> -- 
>>>> 2.24.0
>>>>
>>>

  reply	other threads:[~2021-06-03 21:04 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-24  9:35 [PATCH] ARM: mvebu: Enable MBUS error propagation Chris Packham
2020-01-08 10:22 ` Gregory CLEMENT
2020-01-08 19:42   ` Chris Packham
2021-06-03 12:55     ` Pali Rohár
2021-06-03 21:03       ` Chris Packham [this message]
2022-04-22  9:27         ` Pali Rohár
2022-04-26  1:17           ` Chris Packham

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e75b74fa-b393-fede-1a19-fd43dbb7aba4@alliedtelesis.co.nz \
    --to=chris.packham@alliedtelesis.co.nz \
    --cc=Joshua.Scott@alliedtelesis.co.nz \
    --cc=andrew@lunn.ch \
    --cc=gregory.clement@bootlin.com \
    --cc=jason@lakedaemon.net \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pali@kernel.org \
    --cc=sebastian.hesselbarth@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).