From: Suzuki K Poulose <suzuki.poulose@arm.com>
To: stephan@gerhold.net, mathieu.poirier@linaro.org, Sudeep.Holla@arm.com
Cc: david.brown@linaro.org, saiprakash.ranjan@codeaurora.org,
agross@kernel.org, linux-arm-kernel@lists.infradead.org,
linux-arm-msm@vger.kernel.org
Subject: Re: Coresight causes synchronous external abort on msm8916
Date: Fri, 21 Jun 2019 17:16:28 +0100
Message-ID: <14bd9196-538f-f641-59e1-0c04960890aa@arm.com>
In-Reply-To: <20190621160631.GA34922@gerhold.net>

Hi Stephan,

On 21/06/2019 17:06, Stephan Gerhold wrote:
> Hi all,
>
> Thanks for all your replies!
>
> On Wed, Jun 19, 2019 at 02:16:38PM -0600, Mathieu Poirier wrote:
>> On Wed, 19 Jun 2019 at 12:39, Stephan Gerhold <stephan@gerhold.net> wrote:
>>>
>>> Hi,
>>>
>>> On Wed, Jun 19, 2019 at 09:49:03AM +0100, Suzuki K Poulose wrote:
>>>> Hi Stephan,
>>>>
>>>> On 18/06/2019 21:26, Stephan Gerhold wrote:
>>>>> Hi,
>>>>>
>>>>> I'm trying to run mainline Linux on a smartphone with MSM8916 SoC.
>>>>> It works surprisingly well, but the coresight devices seem to cause the
>>>>> following crash shortly after userspace starts:
>>>>>
>>>>> Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP
>>>>
>>>> ...
>>>>
>>>>
>>>>>
>>>>> In this case I'm using a simple device tree similar to apq8016-sbc,
>>>>> but it also happens using something as simple as msm8916-mtp.dts
>>>>> on this particular device.
>>>>> (Attached: dmesg log with msm8916-mtp.dts and arm64 defconfig)
>>>>>
>>>>> I can avoid the crash and boot without any further problems by disabling
>>>>> every coresight device defined in msm8916.dtsi, e.g.:
>>>>>
>>>>> tpiu@820000 { status = "disabled"; };
>>>>
>>>> ...
>>>>
>>>>>
>>>>> I don't have any use for coresight at the moment,
>>>>> but it seems somewhat odd to put this in the device specific dts.
>>>>>
>>>>> Any idea what could be causing this crash?
>>>>
>>>> This is mostly due to the missing power domain support. The CoreSight
>>>> components are usually in a debug power domain, so accessing them fails
>>>> unless that domain is turned on (either by specifying proper power domain
>>>> IDs for the power management protocol supported by the firmware, or via
>>>> other hacks, e.g. connecting a DS-5 to keep the debug power domain turned
>>>> on; this works on Juno).
>>>
>>> Interesting, thanks a lot!
>>>
>>> In this case I'm wondering how it works on the Dragonboard 410c.
>>
>> There can be two problems:
>>
>> 1) CPUidle is enabled on your platform and as I pointed out before,
>> that won't work. There are patches circulating[1] to fix that problem
>> but it still needs a little bit of work.
>
> I tried disabling cpuidle (see [1]), but unfortunately it did not help.
>
> [1]: https://lore.kernel.org/linux-arm-msm/20190619173743.GA937@gerhold.net/
>
>>
>> 2) As Suzuki pointed out the debug power domain may not be enabled by
>> default on your platform, something I would understand if it is a
>> production device. There is nothing I can do on that front.
>
> Indeed, this is a production device.
> The downstream (production) kernel does not seem to have coresight
> enabled, so it is very well possible that the debug power domain is not
> enabled by the firmware.
>
>>
>> [1]. https://www.spinics.net/lists/arm-kernel/msg735707.html
>>
>>> Does it enable these power domains in the firmware?
>>> (Assuming it boots without this error...)
>>
>> The debug power domain is enabled by default on the 410c and the board
>> boots without error.
>
> Good to know, thank you!
>
>>
>>>
>>> If coresight is not working properly on all/most msm8916 devices,
>>> shouldn't coresight be disabled by default in msm8916.dtsi?
>>
>> It is in the defconfig for arm64, as such it shouldn't bother you.
>
> Indeed, I already have CONFIG_CORESIGHT disabled.
> At the moment, I'm using arm64 defconfig as-is, with no modifications.
>
> So the error happens in the AMBA bus code even when CONFIG_CORESIGHT is
> disabled, as Suzuki suspected [2].
>
> [2]: https://lore.kernel.org/linux-arm-msm/6bb74dcc-62e4-5310-5884-9c4b82ce5be9@arm.com/
>
>>
>>> At least until those power domains can be set up by the kernel.
>>>
>>> If this is a device-specific issue, what would be an acceptable solution
>>> for mainline?
>>> Can I turn on these power domains from the kernel?
>>
>> Yes, if you have the SoC's TRM.
>
> I guess "TRM" refers to Technical Reference Manual?
> Unfortunately, I don't have access to any documentation that is not
> publicly available on the Internet.
>
>>
>>> Or is it fine to disable coresight for this device with the snippet above?
>>>
>>> I'm not actually trying to use coresight, I just want the device to boot :)
>>> And since I am considering submitting my device tree for inclusion in
>>> mainline, I want to ask in advance how I should tackle this problem.
>>
>> Simply don't enable coresight in the kernel config if the code isn't
>> mature enough to properly handle the relevant power domains using the
>> PM runtime API.
>
> The error occurs without CONFIG_CORESIGHT, and I believe there is no
> way to disable CONFIG_AMBA (it is selected by CONFIG_ARM64 and included
> in arm64 defconfig).
>
> So, assuming it is the debug power domain, I believe I can make the
> device boot successfully by either:
>
> (a) Turning on the debug power domain:
> It seems like the kernel cannot do this on msm8916 at the moment(?)
> (msm8916.dtsi does not declare any power domain in the coresight
> device tree nodes)
>
> I cannot modify the firmware of this device,
> so I'm afraid I have absolutely no idea how to turn it on. :/
>
> (b) Preventing the crash:
> Is there some way to:
>
> (1) Add a check in the AMBA bus code to verify if the power
> domain is actually turned on?

No, there isn't, unless the DT tells you that the device is disabled, just like
your patch does.

> or
> (2) Recover from the "synchronous external abort" and continue
> booting after printing an error/warning?
> (At the moment, userspace seems to continue for a while,
> but stops working at some point after the error...)

Unfortunately, no. There is no way to do that from the kernel.

>
> Otherwise, there is still the option to prevent the AMBA bus code
> from running by disabling the affected device tree nodes.
> That's what the debug@850000 { status = "disabled"; }; ... snippet
> from my first mail [3] does, and it is the only way to make the
> kernel boot successfully at the moment.

For your board, I would say this is the best and most reasonable solution.

>
> It wouldn't affect any other device if placed in the DTS for my
> device (i.e. *not* in the shared msm8916.dtsi).

Ultimately, the upstream device tree assumes that you are running firmware
that supports the power domain, so it is fine for upstream as it is. If
someone is using firmware that doesn't support this, it is better to disable
the nodes, just like you did.

Personally, I would leave the upstream DTS as it is and expect users to
fix up their DTS for their firmware.
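
For reference, a minimal sketch of what such a board-level override could
look like. The file name is hypothetical, only the two nodes named in this
thread are shown (any other coresight nodes defined in msm8916.dtsi would
need the same treatment), and it assumes those nodes sit under a top-level
"soc" node in msm8916.dtsi:

```dts
/* Hypothetical board file, e.g. msm8916-myphone.dts (illustrative name only) */
/dts-v1/;

#include "msm8916.dtsi"

/ {
	soc {
		/* Assumption: the coresight nodes live under /soc in msm8916.dtsi */
		tpiu@820000 { status = "disabled"; };
		debug@850000 { status = "disabled"; };
		/* ... repeat for every other coresight node the SoC dtsi defines */
	};
};
```

Because the override lives in the board DTS rather than in msm8916.dtsi,
other msm8916 boards keep the nodes enabled. With a node disabled, the AMBA
bus code does not run for it, so its registers are never accessed.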

Kind regards,
Suzuki