linux-arm-kernel.lists.infradead.org archive mirror
* Re: next/master boot: 254 boots: 16 failed, 231 passed with 4 offline, 1 untried/unknown, 2 conflicts (next-20190726)
       [not found] <5d3aef79.1c69fb81.111b9.a701@mx.google.com>
@ 2019-07-26 13:48 ` Mark Brown
  2019-07-29  0:56   ` Bjorn Andersson
  2019-08-19  4:35   ` Bjorn Andersson
  0 siblings, 2 replies; 3+ messages in thread
From: Mark Brown @ 2019-07-26 13:48 UTC (permalink / raw)
  To: Andy Gross, Bjorn Andersson
  Cc: linux-arm-msm, linux-kernel, linux-arm-kernel, kernel-build-reports


On Fri, Jul 26, 2019 at 05:18:01AM -0700, kernelci.org bot wrote:

The past few versions of -next failed to boot on apq8096-db820c:

>     defconfig:
>         gcc-8:
>             apq8096-db820c: 1 failed lab

with an RCU stall towards the end of boot:

00:03:40.521336  [   18.487538] qcom_q6v5_pas adsp-pil: adsp-pil supply px not found, using dummy regulator
00:04:01.523104  [   39.499613] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
00:04:01.533371  [   39.499657] rcu: 	2-...!: (0 ticks this GP) idle=9ca/1/0x4000000000000000 softirq=1450/1450 fqs=50
00:04:01.537544  [   39.504689] 	(detected by 0, t=5252 jiffies, g=2425, q=619)
00:04:01.541727  [   39.513539] Task dump for CPU 2:
00:04:01.547929  [   39.519096] seq             R  running task        0   199    198 0x00000000
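As a rough cross-check of the log above, the sketch below converts the reported stall duration (t=5252 jiffies) to seconds and compares it with the gap between the last message before the stall and the stall report. The tick rate ASSUMED_HZ = 250 is an assumption and is not stated anywhere in the report; the helper names are made up for this illustration.

```python
import re

# Two lines taken from the log above, reduced to the kernel timestamp and message.
LOG = """\
[   18.487538] qcom_q6v5_pas adsp-pil: adsp-pil supply px not found, using dummy regulator
[   39.504689] (detected by 0, t=5252 jiffies, g=2425, q=619)
"""

# Assumption: the kernel tick rate (CONFIG_HZ) is not given in the report.
ASSUMED_HZ = 250


def kernel_timestamps(log):
    """Return the bracketed kernel timestamps (in seconds) found at line starts."""
    return [float(t) for t in re.findall(r"^\[\s*(\d+\.\d+)\]", log, flags=re.M)]


def stall_jiffies(log):
    """Return the 't=<N> jiffies' value from the stall report, or None."""
    match = re.search(r"t=(\d+) jiffies", log)
    return int(match.group(1)) if match else None


stamps = kernel_timestamps(LOG)
silent_gap = stamps[-1] - stamps[0]                # seconds between the two messages
stall_seconds = stall_jiffies(LOG) / ASSUMED_HZ    # stall duration under the assumed HZ

print(f"gap between messages: {silent_gap:.3f} s")
print(f"stall duration at HZ={ASSUMED_HZ}: {stall_seconds:.3f} s")
```

If the assumed tick rate is right, the two figures come out close to each other, which would suggest the system went quiet right after the adsp-pil message.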

Full details and logs at:

	https://kernelci.org/boot/id/5d3aa7ea59b5142ba868890f/

The last version that worked was from the 15th and there seem to be
similar issues in mainline since -rc1.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* Re: next/master boot: 254 boots: 16 failed, 231 passed with 4 offline, 1 untried/unknown, 2 conflicts (next-20190726)
  2019-07-26 13:48 ` next/master boot: 254 boots: 16 failed, 231 passed with 4 offline, 1 untried/unknown, 2 conflicts (next-20190726) Mark Brown
@ 2019-07-29  0:56   ` Bjorn Andersson
  2019-08-19  4:35   ` Bjorn Andersson
  1 sibling, 0 replies; 3+ messages in thread
From: Bjorn Andersson @ 2019-07-29  0:56 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-arm-msm, Andy Gross, linux-kernel, linux-arm-kernel,
	kernel-build-reports

On Fri 26 Jul 06:48 PDT 2019, Mark Brown wrote:

> On Fri, Jul 26, 2019 at 05:18:01AM -0700, kernelci.org bot wrote:
> 
> The past few versions of -next failed to boot on apq8096-db820c:
> 
> >     defconfig:
> >         gcc-8:
> >             apq8096-db820c: 1 failed lab
> 
> with an RCU stall towards the end of boot:
> 
> 00:03:40.521336  [   18.487538] qcom_q6v5_pas adsp-pil: adsp-pil supply px not found, using dummy regulator
> 00:04:01.523104  [   39.499613] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> 00:04:01.533371  [   39.499657] rcu: 	2-...!: (0 ticks this GP) idle=9ca/1/0x4000000000000000 softirq=1450/1450 fqs=50
> 00:04:01.537544  [   39.504689] 	(detected by 0, t=5252 jiffies, g=2425, q=619)
> 00:04:01.541727  [   39.513539] Task dump for CPU 2:
> 00:04:01.547929  [   39.519096] seq             R  running task        0   199    198 0x00000000
> 
> Full details and logs at:
> 
> 	https://kernelci.org/boot/id/5d3aa7ea59b5142ba868890f/
> 
> The last version that worked was from the 15th and there seem to be
> similar issues in mainline since -rc1.

Thanks for the report, Mark. AFAICT the problem showed up in v5.3-rc1 as
well.

I think the problem is that the regulator supplying the GPU power
domain(s) isn't enabled, and I think there's a lack of agreement on how
this should be controlled.

But we have a partial fix for this floating around; I will give it a
spin.

Regards,
Bjorn

* Re: next/master boot: 254 boots: 16 failed, 231 passed with 4 offline, 1 untried/unknown, 2 conflicts (next-20190726)
  2019-07-26 13:48 ` next/master boot: 254 boots: 16 failed, 231 passed with 4 offline, 1 untried/unknown, 2 conflicts (next-20190726) Mark Brown
  2019-07-29  0:56   ` Bjorn Andersson
@ 2019-08-19  4:35   ` Bjorn Andersson
  1 sibling, 0 replies; 3+ messages in thread
From: Bjorn Andersson @ 2019-08-19  4:35 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-arm-msm, Andy Gross, linux-kernel, linux-arm-kernel,
	kernel-build-reports

On Fri 26 Jul 06:48 PDT 2019, Mark Brown wrote:

> On Fri, Jul 26, 2019 at 05:18:01AM -0700, kernelci.org bot wrote:
> 
> The past few versions of -next failed to boot on apq8096-db820c:
> 
> >     defconfig:
> >         gcc-8:
> >             apq8096-db820c: 1 failed lab
> 
> with an RCU stall towards the end of boot:
> 
> 00:03:40.521336  [   18.487538] qcom_q6v5_pas adsp-pil: adsp-pil supply px not found, using dummy regulator
> 00:04:01.523104  [   39.499613] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> 00:04:01.533371  [   39.499657] rcu: 	2-...!: (0 ticks this GP) idle=9ca/1/0x4000000000000000 softirq=1450/1450 fqs=50
> 00:04:01.537544  [   39.504689] 	(detected by 0, t=5252 jiffies, g=2425, q=619)
> 00:04:01.541727  [   39.513539] Task dump for CPU 2:
> 00:04:01.547929  [   39.519096] seq             R  running task        0   199    198 0x00000000
> 
> Full details and logs at:
> 
> 	https://kernelci.org/boot/id/5d3aa7ea59b5142ba868890f/
> 
> The last version that worked was from the 15th and there seem to be
> similar issues in mainline since -rc1.

As you might have seen, this problem has come and gone on the
apq8096-db820c, and I've finally managed to narrow it down a little bit.

The problem first appears on next-20190701, with the introduction of
CONFIG_RANDOMIZE_BASE in the defconfig, but after further efforts I've
concluded that disabling kpti removes or hides the problem.

With kpti=no on the command line I've now successfully booted the db820c
100+ times without problems (a clear improvement from the 75% failure
rate with kpti=yes).
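
To put the two figures above side by side, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that each boot fails independently with the same probability; the thread does not claim this, and "100+" is treated as exactly 100.

```python
# Figures quoted in the message above.
failure_rate = 0.75   # observed failure rate with kpti=yes
boots = 100           # lower bound for the "100+" clean boots with kpti=no

# Assumption: boots are independent and identically distributed.
# Probability that `boots` consecutive boots all pass if the failure
# rate were still 75%.
p_all_pass = (1 - failure_rate) ** boots

print(f"chance of {boots} clean boots at a 75% failure rate: {p_all_pass:.1e}")
```

Under that assumption the result is vanishingly small, so a run of 100 clean boots is very unlikely to be luck, although it cannot say whether kpti=no removes the problem or only hides it.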


Unfortunately, I'm not yet certain why this is causing issues, and I'm
also seeing the same RCU stall on SDA845 under certain (erroneous?)
conditions (where I don't expect them).

Regards,
Bjorn
