linux-arm-msm.vger.kernel.org archive mirror
From: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
To: Amit Pundir <amit.pundir@linaro.org>
Cc: Linux regressions mailing list <regressions@lists.linux.dev>,
	Mark Brown <broonie@kernel.org>,
	Doug Anderson <dianders@chromium.org>,
	Bjorn Andersson <andersson@kernel.org>,
	Andy Gross <agross@kernel.org>, Rob Herring <robh+dt@kernel.org>,
	Konrad Dybcio <konrad.dybcio@linaro.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>,
	Caleb Connolly <caleb.connolly@linaro.org>,
	Conor Dooley <conor+dt@kernel.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	dt <devicetree@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] arm64: dts: qcom: sdm845-db845c: Move LVS regulator nodes up
Date: Fri, 16 Jun 2023 10:27:11 +0200
Message-ID: <12d6b687-5e5a-bd7c-ff5c-007a74753edb@linaro.org>
In-Reply-To: <CAMi1Hd33_Ccxkf9C5_QBO3tvOZcGnYh+_CKcACUtoY2qAuOzRA@mail.gmail.com>

On 15/06/2023 18:09, Amit Pundir wrote:
> On Thu, 15 Jun 2023 at 20:33, Krzysztof Kozlowski
> <krzysztof.kozlowski@linaro.org> wrote:
>>
>> On 15/06/2023 15:47, Amit Pundir wrote:
>>> On Thu, 15 Jun 2023 at 00:38, Amit Pundir <amit.pundir@linaro.org> wrote:
>>>>
>>>> On Thu, 15 Jun 2023 at 00:17, Krzysztof Kozlowski
>>>> <krzysztof.kozlowski@linaro.org> wrote:
>>>>>
>>>>> On 14/06/2023 20:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>>> On 02.06.23 18:12, Amit Pundir wrote:
>>>>>>> Move lvs1 and lvs2 regulator nodes up in the rpmh-regulators
>>>>>>> list to workaround a boot regression uncovered by the upstream
>>>>>>> commit ad44ac082fdf ("regulator: qcom-rpmh: Revert "regulator:
>>>>>>> qcom-rpmh: Use PROBE_FORCE_SYNCHRONOUS"").
>>>>>>>
>>>>>>> Without this fix DB845c fails to boot at times because one of the
>>>>>>> lvs1 or lvs2 regulators fails to turn ON in time.
>>>>>>
>>>>>> /me waves friendly
>>>>>>
>>>>>> FWIW, as it's not obvious: this...
>>>>>>
>>>>>>> Link: https://lore.kernel.org/all/CAMi1Hd1avQDcDQf137m2auz2znov4XL8YGrLZsw5edb-NtRJRw@mail.gmail.com/
>>>>>>
>>>>>> ...is a report about a regression. One that we could still solve before
>>>>>> 6.4 is out. One I'll likely point Linus to, unless a fix comes into
>>>>>> sight.
>>>>>>
>>>>>> When I noticed the reluctant replies to this patch I earlier today asked
>>>>>> in the thread with the report what the plan forward was:
>>>>>> https://lore.kernel.org/all/CAD%3DFV%3DV-h4EUKHCM9UivsFHRsJPY5sAiwXV3a1hUX9DUMkkxdg@mail.gmail.com/
>>>>>>
>>>>>> Doug replied there:
>>>>>>
>>>>>> ```
>>>>>> Of the two proposals made (the revert vs. the reordering of the dts),
>>>>>> the reordering of the dts seems better. It only affects the one buggy
>>>>>> board (rather than preventing us to move to async probe for everyone)
>>>>>> and it also has a chance of actually fixing something (changing the
>>>>>> order that regulators probe in rpmh-regulator might legitimately work
>>>>>> around the problem). That being said, just like the revert the dts
>>>>>> reordering is still just papering over the problem and is fragile /
>>>>>> not guaranteed to work forever.
>>>>>> ```
>>>>>>
>>>>>> Papering over is obviously not good, but does anyone have a better idea
>>>>>> to fix this? Or is "not fixing" for some reason a viable option here?
>>>>>>
>>>>>
>>>>> I understand there is a regression, although the kernel is not mainline
>>>>> (hash df7443a96851 is unknown) and the only solutions so far paper over
>>>>> the problem. Reverting the commit is a temporary workaround. Moving nodes
>>>>> in DTS is not acceptable because it hides the actual problem and only
>>>>> solves this one particular observed symptom, while the actual issue is
>>>>> still there. It would be nice to be able to reproduce it on real mainline
>>>>> with a normal operating system (not AOSP), with or without a ramdisk,
>>>>> whatever. So far no one did it, right?
>>>>
>>>> No, I did not try non-AOSP system yet. I'll try it tomorrow, if that
>>>> helps. With mainline hash.
>>>
>>> Hi, here is the crash report on db845c running vanilla v6.4-rc6 with a
>>> debian build https://bugs.linaro.org/attachment.cgi?id=1142
>>>
>>> And fwiw here is the db845c crash log with AOSP running vanilla
>>> v6.4-rc6 https://bugs.linaro.org/attachment.cgi?id=1141
>>>
>>> Regards,
>>> Amit Pundir
>>>
>>> PS: rootfs in this bug report doesn't matter much because I'm loading
>>> all the kernel modules from a ramdisk and in the case of a crash the
>>> UFS doesn't probe anyway.
>>
>> I just tried current next with defconfig (I could not find your config,
>> neither here, nor in your previous mail thread nor in bugzilla). Also
>> with REGULATOR_QCOM_RPMH as module.
>>
>> I tried also v6.4-rc6 - also defconfig with default and module
>> REGULATOR_QCOM_RPMH.
>>
>> All the cases work on my RB3 - no warnings reported.
>>
>> If you do not use defconfig, then in all reports please mention the
>> differences (the best) or at least attach it.
> 
> Argh.. Sorry about that. Big mistake on my side. I did want to
> upload my defconfig but forgot. The defconfig plays a key role because,
> as I mentioned in one of my previous emails, it is a timing/race bug and
> if I make almost any change to my defconfig (e.g. enable ftrace, or
> as little as adding a printk in the qcom_rpmh_regulator code) then I
> can't reproduce this bug. So, needless to say, I can't reproduce
> this bug with the default arm64 defconfig.
> 
> Please find my custom (but upstream) defconfig here
> https://bugs.linaro.org/attachment.cgi?id=1143 and prebuilt binaries
> here https://people.linaro.org/~amit.pundir/db845c-userdebug/rpmh_bug/.
> "fastboot flash boot ./boot.img-6.4-rc6 reboot" and/or a few (<5)
> reboots should be enough to trigger the crash.
> 
> I have downloaded the initrd from here
> https://snapshots.linaro.org/96boards/dragonboard845c/linaro/debian/569/initrd.img-5.15.0-qcomlt-arm64
> but edited ramdisk/init to run "load_module" function early in the
> boot and ramdisk/conf/initramfs.conf has "MODULES=list" instead of
> "MODULES=most", where all the kernel modules are listed at
> /etc/initramfs-tools/modules.

So you have interconnect as a module, which is not a supported setup. It
might work if all the modules are loaded very early, or it might not.
Pinctrl is another driver which should be built-in.
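
A minimal sketch of how a config could be checked for this, assuming the
usual .config syntax ("CONFIG_X=y", "CONFIG_X=m", "# CONFIG_X is not set").
The symbol list is a hypothetical example and not an authoritative set of
what must be built-in on this board; the function names are invented for
illustration.

```python
import re

# Hypothetical list: symbols this sketch assumes should be built-in (=y).
# Adjust it for the board and kernel version actually in use.
EXPECTED_BUILTIN = [
    "CONFIG_INTERCONNECT_QCOM_SDM845",
    "CONFIG_PINCTRL_SDM845",
]

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol in .config-style text to its value.

    Symbols marked "is not set" are recorded as "n".
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            values[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            values[match.group(1)] = "n"
    return values


def not_builtin(text, expected=EXPECTED_BUILTIN):
    """Return {symbol: value} for expected symbols that are not "=y".

    A symbol missing from the text entirely is reported as "n".
    """
    values = parse_config(text)
    return {
        sym: values.get(sym, "n")
        for sym in expected
        if values.get(sym, "n") != "y"
    }
```

Calling not_builtin(open(".config").read()) on a config that builds the
interconnect driver as a module would report that symbol with the value "m".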

With your defconfig I see the usual issue: the console and system die,
most likely because of missing interconnects. I don't see your WARNs,
I just see the usual hang.

See:
https://lore.kernel.org/all/20221021032702.1340963-1-krzysztof.kozlowski@linaro.org/

If you want them to really be modules, then you need to fix all the
dependencies (SOFTDEP?) and probe ordering glitches. It's not a problem of
DTS. Just because something can be built as a module does not mean it
will work. We don't test it, we don't work with them as modules.

It's kind of the same as here:
https://lore.kernel.org/all/ac328b6a-a8e2-873d-4015-814cb4f5588e@canonical.com/

I understand that we might have a regression here, if these were working
as modules, but I don't think we ever really committed to it. We can as
well make it non-module to solve the regression.

Best regards,
Krzysztof


