linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
@ 2019-01-24  5:36 John Stultz
  2019-01-24  8:10 ` Ard Biesheuvel
  2019-01-28 16:54 ` Valentin Schneider
  0 siblings, 2 replies; 6+ messages in thread
From: John Stultz @ 2019-01-24  5:36 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Catalin Marinas, Will Deacon, Valentin Schneider

Trying to boot hikey960 with v5.0-rc3 (and Valentin's 2 recent lockdep
fixes), with lock debugging enabled, I'm seeing the BUG below.

Wanted to see if anyone else had seen this or had ideas at what was going on.

thanks
-john

[    0.183376] Detected VIPT I-cache on CPU4
[    0.183388] CPU features: SANITY CHECK: Unexpected variation in
SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4:
0x00000000101122
[    0.183399] CPU features: Unsupported CPU feature variation detected.
[    0.183417] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:908
[    0.183421] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/4
[    0.183451] no locks held by swapper/4/0.
[    0.183454] irq event stamp: 0
[    0.183459] hardirqs last  enabled at (0): [<0000000000000000>]
      (null)
[    0.183474] hardirqs last disabled at (0): [<ffffff80100f2af0>]
copy_process.isra.3.part.4+0x450/0x17b0
[    0.183479] softirqs last  enabled at (0): [<ffffff80100f2af0>]
copy_process.isra.3.part.4+0x450/0x17b0
[    0.183483] softirqs last disabled at (0): [<0000000000000000>]
      (null)
[    0.183485] Preemption disabled at:
[    0.183493] [<ffffff801009b680>] secondary_start_kernel+0xc0/0x1f8
[    0.183501] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S
  5.0.0-rc3-00002-g4daf5b1 #372
[    0.183504] Hardware name: HiKey960 (DT)
[    0.183508] Call trace:
[    0.183515]  dump_backtrace+0x0/0x150
[    0.183518]  show_stack+0x24/0x30
[    0.183526]  dump_stack+0xa0/0xd4
[    0.183532]  ___might_sleep+0x1e4/0x240
[    0.183535]  __might_sleep+0x58/0x90
[    0.183540]  __mutex_lock+0x64/0x890
[    0.183543]  mutex_lock_nested+0x3c/0x50
[    0.183549]  static_key_enable_cpuslocked+0x64/0xd8
[    0.183556]  arch_timer_check_ool_workaround+0x208/0x260
[    0.183559]  arch_timer_starting_cpu+0xfc/0x2f8
[    0.183563]  cpuhp_invoke_callback+0xd8/0xce0
[    0.183567]  notify_cpu_starting+0x84/0xa0
[    0.183570]  secondary_start_kernel+0xec/0x1f8
[    0.183601] arch_timer: Enabling local workaround for ARM erratum 858921
[    0.183618] arch_timer: CPU4: Trapping CNTVCT access
[    0.183625] CPU4: Booted secondary processor 0x0000000100 [0x410fd091]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
  2019-01-24  5:36 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 John Stultz
@ 2019-01-24  8:10 ` Ard Biesheuvel
  2019-01-24 10:05   ` Marc Zyngier
  2019-01-28 16:54 ` Valentin Schneider
  1 sibling, 1 reply; 6+ messages in thread
From: Ard Biesheuvel @ 2019-01-24  8:10 UTC (permalink / raw)
  To: John Stultz, Suzuki K. Poulose, Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, Valentin Schneider, linux-arm-kernel

On Thu, 24 Jan 2019 at 06:36, John Stultz <john.stultz@linaro.org> wrote:
>
> Trying to boot hikey960 with v5.0-rc3 (and Valentin's 2 recent lockdep
> fixes), with lock debugging enabled, I'm seeing the BUG below.
>
> Wanted to see if anyone else had seen this or had ideas at what was going on.
>
> thanks
> -john
>
> [    0.183376] Detected VIPT I-cache on CPU4
> [    0.183388] CPU features: SANITY CHECK: Unexpected variation in
> SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4:
> 0x00000000101122
> [    0.183399] CPU features: Unsupported CPU feature variation detected.

This means that some CPUs support 16k pages and some don't. This is
hardly relevant if you are running with 4k pages, but as Suzuki
pointed out (IIRC), it could be relevant for KVM, since you cannot
migrate a VM using 16k pages from one CPU to the other.

Suzuki, has there been any progress since this last came up?

The output below is from a separate issue, where it seems that we are
enabling a static key from atomic context, which apparently isn't
supported.

Marc might know more about this.


> [    0.183417] BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:908
> [    0.183421] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/4
> [    0.183451] no locks held by swapper/4/0.
> [    0.183454] irq event stamp: 0
> [    0.183459] hardirqs last  enabled at (0): [<0000000000000000>]
>       (null)
> [    0.183474] hardirqs last disabled at (0): [<ffffff80100f2af0>]
> copy_process.isra.3.part.4+0x450/0x17b0
> [    0.183479] softirqs last  enabled at (0): [<ffffff80100f2af0>]
> copy_process.isra.3.part.4+0x450/0x17b0
> [    0.183483] softirqs last disabled at (0): [<0000000000000000>]
>       (null)
> [    0.183485] Preemption disabled at:
> [    0.183493] [<ffffff801009b680>] secondary_start_kernel+0xc0/0x1f8
> [    0.183501] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S
>   5.0.0-rc3-00002-g4daf5b1 #372
> [    0.183504] Hardware name: HiKey960 (DT)
> [    0.183508] Call trace:
> [    0.183515]  dump_backtrace+0x0/0x150
> [    0.183518]  show_stack+0x24/0x30
> [    0.183526]  dump_stack+0xa0/0xd4
> [    0.183532]  ___might_sleep+0x1e4/0x240
> [    0.183535]  __might_sleep+0x58/0x90
> [    0.183540]  __mutex_lock+0x64/0x890
> [    0.183543]  mutex_lock_nested+0x3c/0x50
> [    0.183549]  static_key_enable_cpuslocked+0x64/0xd8
> [    0.183556]  arch_timer_check_ool_workaround+0x208/0x260
> [    0.183559]  arch_timer_starting_cpu+0xfc/0x2f8
> [    0.183563]  cpuhp_invoke_callback+0xd8/0xce0
> [    0.183567]  notify_cpu_starting+0x84/0xa0
> [    0.183570]  secondary_start_kernel+0xec/0x1f8
> [    0.183601] arch_timer: Enabling local workaround for ARM erratum 858921
> [    0.183618] arch_timer: CPU4: Trapping CNTVCT access
> [    0.183625] CPU4: Booted secondary processor 0x0000000100 [0x410fd091]
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
  2019-01-24  8:10 ` Ard Biesheuvel
@ 2019-01-24 10:05   ` Marc Zyngier
  2019-01-25 17:23     ` John Stultz
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2019-01-24 10:05 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Suzuki K. Poulose, Catalin Marinas, Will Deacon, John Stultz,
	Valentin Schneider, linux-arm-kernel

On Thu, 24 Jan 2019 08:10:53 +0000,
Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> 
> On Thu, 24 Jan 2019 at 06:36, John Stultz <john.stultz@linaro.org> wrote:
> >
> > Trying to boot hikey960 with v5.0-rc3 (and Valentin's 2 recent lockdep
> > fixes), with lock debugging enabled, I'm seeing the BUG below.
> >
> > Wanted to see if anyone else had seen this or had ideas at what was going on.
> >
> > thanks
> > -john
> >
> > [    0.183376] Detected VIPT I-cache on CPU4
> > [    0.183388] CPU features: SANITY CHECK: Unexpected variation in
> > SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4:
> > 0x00000000101122
> > [    0.183399] CPU features: Unsupported CPU feature variation detected.
> 
> This means that some CPUs support 16k pages and some don't. This is
> hardly relevant if you are running with 4k pages, but as Suzuki
> pointed out (IIRC), it could be relevant for KVM, since you cannot
> migrate a VM using 16k pages from one CPU to the other.

I think we now have some code to present sanitised ID registers to the
guest. It won't prevent it from running though. I contend that life is
too short for caring about these systems.

> 
> Suzuki, has there been any progress since this last came up?
> 
> The output below is from a separate issue, where it seems that we are
> enabling a static key from atomic context, which apparently isn't
> supported.
> 
> Marc might know more about this.

I thought that what Valentin's fix was addressing. Or was I mistaken?

Thanks,

	M.
	
> 
> 
> > [    0.183417] BUG: sleeping function called from invalid context at
> > kernel/locking/mutex.c:908
> > [    0.183421] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/4
> > [    0.183451] no locks held by swapper/4/0.
> > [    0.183454] irq event stamp: 0
> > [    0.183459] hardirqs last  enabled at (0): [<0000000000000000>]
> >       (null)
> > [    0.183474] hardirqs last disabled at (0): [<ffffff80100f2af0>]
> > copy_process.isra.3.part.4+0x450/0x17b0
> > [    0.183479] softirqs last  enabled at (0): [<ffffff80100f2af0>]
> > copy_process.isra.3.part.4+0x450/0x17b0
> > [    0.183483] softirqs last disabled at (0): [<0000000000000000>]
> >       (null)
> > [    0.183485] Preemption disabled at:
> > [    0.183493] [<ffffff801009b680>] secondary_start_kernel+0xc0/0x1f8
> > [    0.183501] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S
> >   5.0.0-rc3-00002-g4daf5b1 #372
> > [    0.183504] Hardware name: HiKey960 (DT)
> > [    0.183508] Call trace:
> > [    0.183515]  dump_backtrace+0x0/0x150
> > [    0.183518]  show_stack+0x24/0x30
> > [    0.183526]  dump_stack+0xa0/0xd4
> > [    0.183532]  ___might_sleep+0x1e4/0x240
> > [    0.183535]  __might_sleep+0x58/0x90
> > [    0.183540]  __mutex_lock+0x64/0x890
> > [    0.183543]  mutex_lock_nested+0x3c/0x50
> > [    0.183549]  static_key_enable_cpuslocked+0x64/0xd8
> > [    0.183556]  arch_timer_check_ool_workaround+0x208/0x260
> > [    0.183559]  arch_timer_starting_cpu+0xfc/0x2f8
> > [    0.183563]  cpuhp_invoke_callback+0xd8/0xce0
> > [    0.183567]  notify_cpu_starting+0x84/0xa0
> > [    0.183570]  secondary_start_kernel+0xec/0x1f8
> > [    0.183601] arch_timer: Enabling local workaround for ARM erratum 858921
> > [    0.183618] arch_timer: CPU4: Trapping CNTVCT access
> > [    0.183625] CPU4: Booted secondary processor 0x0000000100 [0x410fd091]
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

-- 
Jazz is not dead, it just smell funny.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
  2019-01-24 10:05   ` Marc Zyngier
@ 2019-01-25 17:23     ` John Stultz
  0 siblings, 0 replies; 6+ messages in thread
From: John Stultz @ 2019-01-25 17:23 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Suzuki K. Poulose, Catalin Marinas, Ard Biesheuvel, Will Deacon,
	Valentin Schneider, linux-arm-kernel

On Thu, Jan 24, 2019 at 2:05 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> On Thu, 24 Jan 2019 08:10:53 +0000,
> Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >
> > On Thu, 24 Jan 2019 at 06:36, John Stultz <john.stultz@linaro.org> wrote:
> > >
> > > Trying to boot hikey960 with v5.0-rc3 (and Valentin's 2 recent lockdep
> > > fixes), with lock debugging enabled, I'm seeing the BUG below.
> > >
> > > Wanted to see if anyone else had seen this or had ideas at what was going on.
> > >
> > > thanks
> > > -john
> > >
> > > [    0.183376] Detected VIPT I-cache on CPU4
> > > [    0.183388] CPU features: SANITY CHECK: Unexpected variation in
> > > SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4:
> > > 0x00000000101122
> > > [    0.183399] CPU features: Unsupported CPU feature variation detected.
> >
> > This means that some CPUs support 16k pages and some don't. This is
> > hardly relevant if you are running with 4k pages, but as Suzuki
> > pointed out (IIRC), it could be relevant for KVM, since you cannot
> > migrate a VM using 16k pages from one CPU to the other.
>
> I think we now have some code to present sanitised ID registers to the
> guest. It won't prevent it from running though. I contend that life is
> too short for caring about these systems.
>
> >
> > Suzuki, has there been any progress since this last came up?
> >
> > The output below is from a separate issue, where it seems that we are
> > enabling a static key from atomic context, which apparently isn't
> > supported.
> >
> > Marc might know more about this.
>
> I thought that what Valentin's fix was addressing. Or was I mistaken?

This is something additional. I've already added Valentin's patches,
which  addresses the separate:
  WARNING: CPU: 4 PID: 0 at kernel/cpu.c:316 lockdep_assert_cpus_held+0x3c/0x48
splat.

thanks
-john

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
  2019-01-24  5:36 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 John Stultz
  2019-01-24  8:10 ` Ard Biesheuvel
@ 2019-01-28 16:54 ` Valentin Schneider
  2019-01-29 19:48   ` Valentin Schneider
  1 sibling, 1 reply; 6+ messages in thread
From: Valentin Schneider @ 2019-01-28 16:54 UTC (permalink / raw)
  To: John Stultz, linux-arm-kernel; +Cc: Catalin Marinas, Will Deacon

Hi John,

Sorry I had missed that thread, Marc brought it up to me

On 24/01/2019 05:36, John Stultz wrote:
> Trying to boot hikey960 with v5.0-rc3 (and Valentin's 2 recent lockdep
> fixes), with lock debugging enabled, I'm seeing the BUG below.
> 
> Wanted to see if anyone else had seen this or had ideas at what was going on.
> 
> thanks
> -john
> 
> [    0.183376] Detected VIPT I-cache on CPU4
> [    0.183388] CPU features: SANITY CHECK: Unexpected variation in
> SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4:
> 0x00000000101122
> [    0.183399] CPU features: Unsupported CPU feature variation detected.
> [    0.183417] BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:908
> [    0.183421] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/4
> [    0.183451] no locks held by swapper/4/0.
> [    0.183454] irq event stamp: 0
> [    0.183459] hardirqs last  enabled at (0): [<0000000000000000>]
>       (null)
> [    0.183474] hardirqs last disabled at (0): [<ffffff80100f2af0>]
> copy_process.isra.3.part.4+0x450/0x17b0
> [    0.183479] softirqs last  enabled at (0): [<ffffff80100f2af0>]
> copy_process.isra.3.part.4+0x450/0x17b0
> [    0.183483] softirqs last disabled at (0): [<0000000000000000>]
>       (null)
> [    0.183485] Preemption disabled at:
> [    0.183493] [<ffffff801009b680>] secondary_start_kernel+0xc0/0x1f8
> [    0.183501] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S
>   5.0.0-rc3-00002-g4daf5b1 #372
> [    0.183504] Hardware name: HiKey960 (DT)
> [    0.183508] Call trace:
> [    0.183515]  dump_backtrace+0x0/0x150
> [    0.183518]  show_stack+0x24/0x30
> [    0.183526]  dump_stack+0xa0/0xd4
> [    0.183532]  ___might_sleep+0x1e4/0x240
> [    0.183535]  __might_sleep+0x58/0x90
> [    0.183540]  __mutex_lock+0x64/0x890
> [    0.183543]  mutex_lock_nested+0x3c/0x50
> [    0.183549]  static_key_enable_cpuslocked+0x64/0xd8
> [    0.183556]  arch_timer_check_ool_workaround+0x208/0x260
> [    0.183559]  arch_timer_starting_cpu+0xfc/0x2f8
> [    0.183563]  cpuhp_invoke_callback+0xd8/0xce0
> [    0.183567]  notify_cpu_starting+0x84/0xa0
> [    0.183570]  secondary_start_kernel+0xec/0x1f8
> [    0.183601] arch_timer: Enabling local workaround for ARM erratum 858921
> [    0.183618] arch_timer: CPU4: Trapping CNTVCT access
> [    0.183625] CPU4: Booted secondary processor 0x0000000100 [0x410fd091]
> 

Urgh, I was going to say I don't get that warning with the lockdep patches,
but I just realized you actually get a different warning because you have
DEBUG_ATOMIC_SLEEP. I'll see if I can make sense of this one, it reminds me
of [1] although we already have the _cpuslocked variant.

FWIW I don't have the heart to delete the rest, so if you want to compare
John's log with mine you can see it's a warning in the same area but not
the same actual warning.

[1]: https://lkml.org/lkml/2018/4/29/106

-----8<-----

Just gave it a spin on v5.0-rc4 - I still have the CPU features warning,
but none of the timer workarounds.

This is what I get with just an extra defconfig patch to get to load my
userspace [2]:

[    0.780221] Detected VIPT I-cache on CPU4
[    0.780232] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4: 0x00000000101122
[    0.780241] CPU features: Unsupported CPU feature variation detected.
[    0.780275] WARNING: CPU: 4 PID: 0 at kernel/cpu.c:316 lockdep_assert_cpus_held+0x30/0x40
[    0.780279] Modules linked in:
[    0.780289] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S                5.0.0-rc4-00001-g99274c821251 #29
[    0.780292] Hardware name: HiKey960 (DT)
[    0.780296] pstate: 600001c5 (nZCv dAIF -PAN -UAO)
[    0.780300] pc : lockdep_assert_cpus_held+0x30/0x40
[    0.780304] lr : lockdep_assert_cpus_held+0x2c/0x40
[    0.780306] sp : ffff0000128d3e50
[    0.780308] x29: ffff0000128d3e50 x28: 0000000000000000 
[    0.780313] x27: 0000000000000001 x26: ffff8000bf970fd0 
[    0.780317] x25: ffff000010acf8e8 x24: 0000000000000001 
[    0.780321] x23: ffff000010ef6000 x22: ffff000010acfe10 
[    0.780324] x21: 0000000000000001 x20: 0000000000000003 
[    0.780328] x19: ffff00001225f9e8 x18: ffffffffffffffff 
[    0.780332] x17: 0000000000000000 x16: 0000000000000000 
[    0.780335] x15: ffff000011484000 x14: 00000000fffffff0 
[    0.780340] x13: 00000000ca811012 x12: ffff8000bcb2e1e8 
[    0.780343] x11: ffff00001220d000 x10: ffff000011463808 
[    0.780347] x9 : 0000000000000000 x8 : ffff000011463808 
[    0.780351] x7 : ffff000010177950 x6 : ffff000011478c78 
[    0.780355] x5 : ffff000011463830 x4 : 0000000000000001 
[    0.780358] x3 : 0000000000000000 x2 : 0000000000000001 
[    0.780362] x1 : 00000000ffffffff x0 : 0000000000000000 
[    0.780367] Call trace:
[    0.780371]  lockdep_assert_cpus_held+0x30/0x40
[    0.780378]  static_key_enable_cpuslocked+0x20/0xc8
[    0.780385]  arch_timer_check_ool_workaround+0xe0/0x220
[    0.780391]  arch_timer_starting_cpu+0xdc/0x2d0
[    0.780394]  cpuhp_invoke_callback+0xd8/0xcc8
[    0.780398]  notify_cpu_starting+0x70/0xa8
[    0.780404]  secondary_start_kernel+0x120/0x1d8
[    0.780407] irq event stamp: 0
[    0.780410] hardirqs last  enabled at (0): [<0000000000000000>]           (null)
[    0.780415] hardirqs last disabled at (0): [<ffff0000100e4fc0>] copy_process.isra.3.part.4+0x540/0x1850
[    0.780419] softirqs last  enabled at (0): [<ffff0000100e4fc0>] copy_process.isra.3.part.4+0x540/0x1850
[    0.780421] softirqs last disabled at (0): [<0000000000000000>]           (null)
[    0.780428] ---[ end trace 651209311577d561 ]---

(and same bizz for CPUs [5-7])

This is what I get when the 2 lockdep patches are applied:

[    0.780276] Detected VIPT I-cache on CPU4
[    0.780286] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4: 0x00000000101122
[    0.780296] CPU features: Unsupported CPU feature variation detected.
[    0.780333] arch_timer: Enabling local workaround for ARM erratum 858921
[    0.780349] arch_timer: CPU4: Trapping CNTVCT access
[    0.780356] CPU4: Booted secondary processor 0x0000000100 [0x410fd091]
[    0.810727] Detected VIPT I-cache on CPU5
[    0.810734] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU5: 0x00000000101122
[    0.810745] arch_timer: Enabling local workaround for ARM erratum 858921
[    0.810754] arch_timer: CPU5: Trapping CNTVCT access
[    0.810759] CPU5: Booted secondary processor 0x0000000101 [0x410fd091]
[    0.842867] Detected VIPT I-cache on CPU6
[    0.842873] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU6: 0x00000000101122
[    0.842884] arch_timer: Enabling local workaround for ARM erratum 858921
[    0.842893] arch_timer: CPU6: Trapping CNTVCT access
[    0.842899] CPU6: Booted secondary processor 0x0000000102 [0x410fd091]
[    0.874936] Detected VIPT I-cache on CPU7
[    0.874943] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU7: 0x00000000101122
[    0.874954] arch_timer: Enabling local workaround for ARM erratum 858921
[    0.874963] arch_timer: CPU7: Trapping CNTVCT access
[    0.874968] CPU7: Booted secondary processor 0x0000000103 [0x410fd091]
[    0.875135] smp: Brought up 1 node, 8 CPUs
[    1.063435] SMP: Total of 8 processors activated.
[    1.068232] CPU features: detected: 32-bit EL0 Support
[    1.073501] CPU features: detected: CRC32 instructions
[    1.079747] CPU: All CPU(s) started at EL2

I did double-check and I have lockdep in both cases:

zcat /proc/config.gz | grep LOCKDEP
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKDEP=y
# CONFIG_DEBUG_LOCKDEP is not set

[2]: 

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index c8432e24207e..fb84abfd4caa 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -219,10 +219,10 @@ CONFIG_BLK_DEV_SD=y
 CONFIG_SCSI_SAS_ATA=y
 CONFIG_SCSI_HISI_SAS=y
 CONFIG_SCSI_HISI_SAS_PCI=y
-CONFIG_SCSI_UFSHCD=m
-CONFIG_SCSI_UFSHCD_PLATFORM=m
+CONFIG_SCSI_UFSHCD=y
+CONFIG_SCSI_UFSHCD_PLATFORM=y
+CONFIG_SCSI_UFS_HISI=y
 CONFIG_SCSI_UFS_QCOM=m
-CONFIG_SCSI_UFS_HISI=m
 CONFIG_ATA=y
 CONFIG_SATA_AHCI=y
 CONFIG_SATA_AHCI_PLATFORM=y

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
  2019-01-28 16:54 ` Valentin Schneider
@ 2019-01-29 19:48   ` Valentin Schneider
  0 siblings, 0 replies; 6+ messages in thread
From: Valentin Schneider @ 2019-01-29 19:48 UTC (permalink / raw)
  To: John Stultz, linux-arm-kernel
  Cc: Marc Zyngier, Suzuki.Poulose, Will Deacon, ard.biesheuvel

On 28/01/2019 16:54, Valentin Schneider wrote:
> Hi John,
> 
> Sorry I had missed that thread, Marc brought it up to me
> 
> On 24/01/2019 05:36, John Stultz wrote:
>> Trying to boot hikey960 with v5.0-rc3 (and Valentin's 2 recent lockdep
>> fixes), with lock debugging enabled, I'm seeing the BUG below.
>>
>> Wanted to see if anyone else had seen this or had ideas at what was going on.
>>
>> thanks
>> -john
>>
>> [    0.183376] Detected VIPT I-cache on CPU4
>> [    0.183388] CPU features: SANITY CHECK: Unexpected variation in
>> SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4:
>> 0x00000000101122
>> [    0.183399] CPU features: Unsupported CPU feature variation detected.
>> [    0.183417] BUG: sleeping function called from invalid context at
>> kernel/locking/mutex.c:908
>> [    0.183421] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/4
>> [    0.183451] no locks held by swapper/4/0.
>> [    0.183454] irq event stamp: 0
>> [    0.183459] hardirqs last  enabled at (0): [<0000000000000000>]
>>       (null)
>> [    0.183474] hardirqs last disabled at (0): [<ffffff80100f2af0>]
>> copy_process.isra.3.part.4+0x450/0x17b0
>> [    0.183479] softirqs last  enabled at (0): [<ffffff80100f2af0>]
>> copy_process.isra.3.part.4+0x450/0x17b0
>> [    0.183483] softirqs last disabled at (0): [<0000000000000000>]
>>       (null)
>> [    0.183485] Preemption disabled at:
>> [    0.183493] [<ffffff801009b680>] secondary_start_kernel+0xc0/0x1f8
>> [    0.183501] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S
>>   5.0.0-rc3-00002-g4daf5b1 #372
>> [    0.183504] Hardware name: HiKey960 (DT)
>> [    0.183508] Call trace:
>> [    0.183515]  dump_backtrace+0x0/0x150
>> [    0.183518]  show_stack+0x24/0x30
>> [    0.183526]  dump_stack+0xa0/0xd4
>> [    0.183532]  ___might_sleep+0x1e4/0x240
>> [    0.183535]  __might_sleep+0x58/0x90
>> [    0.183540]  __mutex_lock+0x64/0x890
>> [    0.183543]  mutex_lock_nested+0x3c/0x50
>> [    0.183549]  static_key_enable_cpuslocked+0x64/0xd8
>> [    0.183556]  arch_timer_check_ool_workaround+0x208/0x260
>> [    0.183559]  arch_timer_starting_cpu+0xfc/0x2f8
>> [    0.183563]  cpuhp_invoke_callback+0xd8/0xce0
>> [    0.183567]  notify_cpu_starting+0x84/0xa0
>> [    0.183570]  secondary_start_kernel+0xec/0x1f8
>> [    0.183601] arch_timer: Enabling local workaround for ARM erratum 858921
>> [    0.183618] arch_timer: CPU4: Trapping CNTVCT access
>> [    0.183625] CPU4: Booted secondary processor 0x0000000100 [0x410fd091]
>>
> 
> Urgh, I was going to say I don't get that warning with the lockdep patches,
> but I just realized you actually get a different warning because you have
> DEBUG_ATOMIC_SLEEP. I'll see if I can make sense of this one, it reminds me
> of [1] although we already have the _cpuslocked variant.
>

So that one is different. DEBUG_ATOMIC_SLEEP doesn't like that we try
to enable the 'arch_timer_read_ool_enabled' static branch in whatever
context arch_timer_enable_workaround() is.

It only happens once when bringing up the first A73, because after that
the static key is already enabled. It goes something like this:

- secondary_start_kernel()
  - preempt_disable()
  - notify_cpu_starting()
    - [...] // cpuhp machinery leads us to the arch timer
    - arch_timer_starting_cpu()
      - __arch_timer_setup()
        - arch_timer_check_ool_workaround()
          - arch_timer_enable_workaround()
            - static_branch_enable_cpuslocked()

It complains because we've had that preempt_disable() further up and
interrupts are being masked.

Thing is, we bring up secondary CPUs one at a time, so AFAICT we shouldn't
have any contention on jump_label_lock() and will probably never sleep.
I **think** we could move the preempt_disable() down to right before
unmasking the interrupts, but might_sleep() would still complain about
interrupts being masked.

I'm fairly sure we can't change that so early, but then I don't know what
to do with the warning (can we document/suppress it in some way?).

Thoughts?

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-01-29 19:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-24  5:36 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 John Stultz
2019-01-24  8:10 ` Ard Biesheuvel
2019-01-24 10:05   ` Marc Zyngier
2019-01-25 17:23     ` John Stultz
2019-01-28 16:54 ` Valentin Schneider
2019-01-29 19:48   ` Valentin Schneider

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).