linux-arm-msm.vger.kernel.org archive mirror
From: Suzuki K Poulose <suzuki.poulose@arm.com>
To: stephan@gerhold.net, mathieu.poirier@linaro.org, Sudeep.Holla@arm.com
Cc: david.brown@linaro.org, saiprakash.ranjan@codeaurora.org,
	agross@kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-arm-msm@vger.kernel.org
Subject: Re: Coresight causes synchronous external abort on msm8916
Date: Fri, 21 Jun 2019 17:16:28 +0100
Message-ID: <14bd9196-538f-f641-59e1-0c04960890aa@arm.com>
In-Reply-To: <20190621160631.GA34922@gerhold.net>

Hi Stephan

On 21/06/2019 17:06, Stephan Gerhold wrote:
> Hi all,
> 
> Thanks for all your replies!
> 
> On Wed, Jun 19, 2019 at 02:16:38PM -0600, Mathieu Poirier wrote:
>> On Wed, 19 Jun 2019 at 12:39, Stephan Gerhold <stephan@gerhold.net> wrote:
>>>
>>> Hi,
>>>
>>> On Wed, Jun 19, 2019 at 09:49:03AM +0100, Suzuki K Poulose wrote:
>>>> Hi Stephan,
>>>>
>>>> On 18/06/2019 21:26, Stephan Gerhold wrote:
>>>>> Hi,
>>>>>
>>>>> I'm trying to run mainline Linux on a smartphone with MSM8916 SoC.
>>>>> It works surprisingly well, but the coresight devices seem to cause the
>>>>> following crash shortly after userspace starts:
>>>>>
>>>>>       Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP
>>>>
>>>> ...
>>>>
>>>>
>>>>>
>>>>> In this case I'm using a simple device tree similar to apq8016-sbc,
>>>>> but it also happens using something as simple as msm8916-mtp.dts
>>>>> on this particular device.
>>>>>     (Attached: dmesg log with msm8916-mtp.dts and arm64 defconfig)
>>>>>
>>>>> I can avoid the crash and boot without any further problems by disabling
>>>>> every coresight device defined in msm8916.dtsi, e.g.:
>>>>>
>>>>>      tpiu@820000 { status = "disabled"; };
>>>>
>>>> ...
>>>>
>>>>>
>>>>> I don't have any use for coresight at the moment,
>>>>> but it seems somewhat odd to put this in the device specific dts.
>>>>>
>>>>> Any idea what could be causing this crash?
>>>>
>>>> This is most likely due to missing power domain support. The CoreSight
>>>> components are usually in a debug power domain. Unless that domain is turned
>>>> on (either by specifying proper power domain IDs for the power management
>>>> protocol supported by the firmware, or via other hacks, e.g. connecting a
>>>> DS-5 to keep the debug power domain on, which works on Juno), accessing the
>>>> components will fault.
>>>
>>> Interesting, thanks a lot!
>>>
>>> In this case I'm wondering how it works on the Dragonboard 410c.
>>
>> There can be two problems:
>>
>> 1) CPUidle is enabled on your platform and as I pointed out before,
>> that won't work.  There are patches circulating[1] to fix that problem
>> but it still needs a little bit of work.
> 
> I tried disabling cpuidle (see [1]), but unfortunately it did not help.
> 
> [1]: https://lore.kernel.org/linux-arm-msm/20190619173743.GA937@gerhold.net/
> 
>>
>> 2) As Suzuki pointed out the debug power domain may not be enabled by
>> default on your platform, something I would understand if it is a
>> production device.  There is nothing I can do on that front.
> 
> Indeed, this is a production device.
> The downstream (production) kernel does not seem to have coresight
> enabled, so it is very well possible that the debug power domain is not
> enabled by the firmware.
> 
>>
>> [1]. https://www.spinics.net/lists/arm-kernel/msg735707.html
>>
>>> Does it enable these power domains in the firmware?
>>>    (Assuming it boots without this error...)
>>
>> The debug power domain is enabled by default on the 410c and the board
>> boots without error.
> 
> Good to know, thank you!
> 
>>
>>>
>>> If coresight is not working properly on all/most msm8916 devices,
>>> shouldn't coresight be disabled by default in msm8916.dtsi?
>>
>> It is in the defconfig for arm64, as such it shouldn't bother you.
> 
> Indeed, I already have CONFIG_CORESIGHT disabled.
> At the moment, I'm using arm64 defconfig as-is, with no modifications.
> 
> So the error happens in the AMBA bus code even when CONFIG_CORESIGHT is
> disabled, as Suzuki suspected [2].
> 
> [2]: https://lore.kernel.org/linux-arm-msm/6bb74dcc-62e4-5310-5884-9c4b82ce5be9@arm.com/
> 
>>
>>> At least until those power domains can be set up by the kernel.
>>>
>>> If this is a device-specific issue, what would be an acceptable solution
>>> for mainline?
>>> Can I turn on these power domains from the kernel?
>>
>> Yes, if you have the SoC's TRM.
> 
> I guess "TRM" refers to Technical Reference Manual?
> Unfortunately, I don't have access to any documentation that is not
> publicly available on the Internet.
> 
>>
>>> Or is it fine to disable coresight for this device with the snippet above?
>>>
>>> I'm not actually trying to use coresight, I just want the device to boot :)
>>> And since I am considering submitting my device tree for inclusion in
>>> mainline, I want to ask in advance how I should tackle this problem.
>>
>> Simply don't enable coresight in the kernel config if the code isn't
>> mature enough to properly handle the relevant power domains using the
>> PM runtime API.
> 
> The error occurs without CONFIG_CORESIGHT, and I believe there is no
> way to disable CONFIG_AMBA (it is selected by CONFIG_ARM64 and included
> in arm64 defconfig).
> 
> So, assuming it is the debug power domain, I believe I can make the
> device boot successfully by either:
> 
>   (a) Turning on the debug power domain:
>       It seems like the kernel cannot do this on msm8916 at the moment(?)
>       (msm8916.dtsi does not declare any power domain in the coresight
>        device tree nodes)
> 
>       I cannot modify the firmware of this device,
>       so I'm afraid I have absolutely no idea how to turn it on. :/
> 
>   (b) Preventing the crash:
>       Is there some way to:
> 
>        (1) Add a check in the AMBA bus code to verify if the power
>            domain is actually turned on?

No, there isn't, unless the DT tells you that the device is disabled, just like
your patch does.

>       or
>        (2) Recover from the "synchronous external abort" and continue
>            booting after printing an error/warning?
>            (At the moment, userspace seems to continue for a while,
>             but stops working at some point after the error...)

Unfortunately, no. There is no way to do that from the kernel.

> 
>       Otherwise, there is still the option to prevent the AMBA bus code
>       from running by disabling the affected device tree nodes.
>       That's what the debug@850000 { status = "disabled"; }; ... snippet
>       from my first mail [3] does, and it is the only way to make the
>       kernel boot successfully at the moment.

For your board, I would say this is the best and most reasonable option.

> 
>       It wouldn't affect any other device if placed in the DTS for my
>       device (i.e. *not* in the shared msm8916.dtsi).

Ultimately, the upstream device tree assumes that you are running a firmware
that supports the power domain, and on that basis it is fine as it is. If
someone is using a firmware that doesn't support this, it is better to disable
the nodes, just like you did.

Personally, I would leave the upstream DTS as it is and expect users to fix up
their DTS for their firmware.
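
A minimal sketch of what such a device-specific override could look like,
reusing the two node names quoted in this thread (tpiu@820000 and
debug@850000). The board name, the model/compatible strings, and the
assumption that the CoreSight nodes sit under a top-level "soc" node in
msm8916.dtsi are illustrative only and should be checked against the actual
dtsi. The remaining CoreSight nodes are not listed here and would need the
same treatment:

```dts
/dts-v1/;

#include "msm8916.dtsi"

/ {
	/* Hypothetical board: these strings are placeholders, not a real device. */
	model = "Example MSM8916 phone";
	compatible = "example,msm8916-phone", "qcom,msm8916";

	/* Assumption: the CoreSight nodes live under /soc in msm8916.dtsi. */
	soc {
		/*
		 * The firmware on this device may not enable the debug power
		 * domain, so keep the AMBA bus code from probing these nodes.
		 */
		tpiu@820000 { status = "disabled"; };
		debug@850000 { status = "disabled"; };

		/* Repeat for every other CoreSight node defined in msm8916.dtsi. */
	};
};
```

Because the override lives in the board's own DTS, the shared msm8916.dtsi and
boards whose firmware does enable the debug power domain are left unchanged.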

Kind regards
Suzuki

