* stop_machine() soft lockup
@ 2018-09-04 19:03 Niklas Cassel
  2018-09-05  8:42 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Niklas Cassel @ 2018-09-04 19:03 UTC (permalink / raw)
  To: peterz; +Cc: linux-kernel, bjorn.andersson

Hello Peter,

I'm seeing some lockups when booting linux-next on a db820c arm64 board.
I've tried to analyze, but I'm currently stuck.


[   30.974764] rcu: INFO: rcu_preempt self-detected stall on CPU
[   31.010121] rcu:     2-...0: (1 GPs behind) idle=3f6/1/0x4000000000000002 softirq=53/53 fqs=3099
[   31.035352] rcu:      (t=6518 jiffies g=-1119 q=1279)
[   31.054464] Task dump for CPU 2:
[   31.065349] migration/2     R  running task        0    22      2 0x0000002a
[   31.091937] Call trace:
[   31.103357]  dump_backtrace+0x0/0x200
[   31.114780]  show_stack+0x24/0x30
[   31.126183]  sched_show_task+0x20c/0x2d8
[   31.137604]  dump_cpu_task+0x48/0x58
[   31.149111]  rcu_dump_cpu_stacks+0xa0/0xe0
[   31.160620]  rcu_check_callbacks+0x85c/0xb60
[   31.172151]  update_process_times+0x34/0x60
[   31.183575]  tick_periodic+0x58/0x110
[   31.195048]  tick_handle_periodic+0x94/0xc8
[   31.206573]  arch_timer_handler_virt+0x38/0x58
[   31.218058]  handle_percpu_devid_irq+0xe4/0x458
[   31.229544]  generic_handle_irq+0x34/0x50
[   31.240952]  __handle_domain_irq+0x8c/0xf8
[   31.252375]  gic_handle_irq+0x84/0x180
[   31.263776]  el1_irq+0xec/0x198
[   31.275117]  multi_cpu_stop+0x198/0x220
[   31.286574]  cpu_stopper_thread+0xfc/0x188
[   31.298050]  smpboot_thread_fn+0x1bc/0x2c0
[   31.309562]  kthread+0x134/0x138
[   31.320888]  ret_from_fork+0x10/0x1c
[   31.332075] Task dump for CPU 3:
[   31.342925] migration/3     R  running task        0    27      2 0x0000002a
[   31.369347] Call trace:
[   31.380732]  __switch_to+0xa8/0xf0
[   31.391599]            (null)
[   32.813449] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   32.853356] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   32.996597] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [migration/3:27]
[   33.008710] Modules linked in:
[   33.026913] irq event stamp: 702
[   33.038360] hardirqs last  enabled at (701): [<ffff0000081bb60c>] multi_cpu_stop+0x194/0x220
[   33.050062] hardirqs last disabled at (702): [<ffff000008083e2c>] el1_irq+0xac/0x198
[   33.061600] softirqs last  enabled at (396): [<ffff000008082164>] __do_softirq+0x47c/0x574
[   33.073284] softirqs last disabled at (389): [<ffff0000080ca41c>] irq_exit+0x134/0x148
[   33.084852] CPU: 3 PID: 27 Comm: migration/3 Tainted: G        W         4.19.0-rc2-next-20180904-00001-g32563c06c8db-dirty #42
[   33.095929] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[   33.107086] pstate: 60400005 (nZCv daif +PAN -UAO)
[   33.118542] pc : multi_cpu_stop+0x198/0x220
[   33.129799] lr : multi_cpu_stop+0x194/0x220
[   33.140522] sp : ffff00000ae5bd50
[   33.151104] x29: ffff00000ae5bd50 x28: 0000000000000000
[   33.176154] x27: ffff00000803bc80 x26: 0000000000000000
[   33.201057] x25: 0000000000000000 x24: 0000000000000000
[   33.226014] x23: ffff0000090b4000 x22: 0000000000000003
[   33.250973] x21: ffff00000803bca4 x20: ffff00000803bc80
[   33.275806] x19: 0000000000000004 x18: ffffffffffffffff
[   33.300799] x17: 0000000000000727 x16: ffff000009fcfd00
[   33.325856] x15: ffff000009cbe1c8 x14: 31203a64656c6261
[   33.350852] x13: 000000006f1527a4 x12: ffff8000d958e3b0
[   33.375706] x11: 0000000000000000 x10: 0000000000000428
[   33.400629] x9 : ffff8000d9591000 x8 : 000000000000001c
[   33.425566] x7 : ffff8000d9c08400 x6 : 0000000000000444
[   33.450410] x5 : 0000000000007045 x4 : 0000000000000000
[   33.475399] x3 : 0000000000000000 x2 : ffff00000aabb000
[   33.500394] x1 : 0000000000000001 x0 : ffff8000d958db00
[   33.525460] Call trace:
[   33.536707]  multi_cpu_stop+0x198/0x220
[   33.548035]  cpu_stopper_thread+0xfc/0x188
[   33.559444]  smpboot_thread_fn+0x1bc/0x2c0
[   33.570610]  kthread+0x134/0x138
[   33.581944]  ret_from_fork+0x10/0x1c
[   36.813418] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   36.853359] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   40.813415] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   40.853352] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   44.813415] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   44.853342] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   48.813418] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   48.853344] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   52.813407] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   52.853341] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   56.813414] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   56.853345] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   56.921016] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [migration/2:22]
[   56.933363] Modules linked in:
[   56.951546] irq event stamp: 604
[   56.963173] hardirqs last  enabled at (603): [<ffff0000081bb60c>] multi_cpu_stop+0x194/0x220
[   56.974744] hardirqs last disabled at (604): [<ffff000008083e2c>] el1_irq+0xac/0x198
[   56.986463] softirqs last  enabled at (450): [<ffff000008082164>] __do_softirq+0x47c/0x574
[   56.998266] softirqs last disabled at (443): [<ffff0000080ca41c>] irq_exit+0x134/0x148
[   57.009651] CPU: 2 PID: 22 Comm: migration/2 Tainted: G        W    L    4.19.0-rc2-next-20180904-00001-g32563c06c8db-dirty #42
[   57.021070] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[   57.032279] pstate: 60400005 (nZCv daif +PAN -UAO)
[   57.043733] pc : multi_cpu_stop+0x198/0x220
[   57.055152] lr : multi_cpu_stop+0x194/0x220
[   57.065929] sp : ffff00000ae33d50
[   57.076598] x29: ffff00000ae33d50 x28: 0000000000000000
[   57.101700] x27: ffff00000803bc80 x26: 0000000000000000
[   57.126947] x25: 0000000000000000 x24: 0000000000000000
[   57.152160] x23: ffff0000090b4000 x22: 0000000000000002
[   57.177351] x21: ffff00000803bca4 x20: ffff00000803bc80
[   57.202378] x19: 0000000000000004 x18: ffffffffffffffff
[   57.227500] x17: 0000000000000708 x16: ffff000009fcfbe0
[   57.252639] x15: ffff000009cbe1c8 x14: 31203a64656c6261
[   57.277712] x13: 000000006f1527a4 x12: ffff8000d9fc63b0
[   57.302892] x11: 0000000000000000 x10: 0000000000000268
[   57.328157] x9 : ffff8000d9590400 x8 : 000000000000001c
[   57.353240] x7 : ffff8000d9c08400 x6 : 0000000000000284
[   57.378364] x5 : 0000000000004f5d x4 : 0000000000000000
[   57.403477] x3 : 0000000000000000 x2 : ffff00000aabb000
[   57.428601] x1 : 0000000000000001 x0 : ffff8000d9fc5b00
[   57.453865] Call trace:
[   57.465264]  multi_cpu_stop+0x198/0x220
[   57.476690]  cpu_stopper_thread+0xfc/0x188
[   57.487921]  smpboot_thread_fn+0x1bc/0x2c0
[   57.499416]  kthread+0x134/0x138
[   57.510752]  ret_from_fork+0x10/0x1c
[   60.813418] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   60.853344] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   63.137327] BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 60s!
[   63.138392] Showing busy workqueues and worker pools:
[   63.145567] workqueue mm_percpu_wq: flags=0x8
[   63.150564]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
[   63.154568]     pending: vmstat_update
[   64.813418] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 0
[   64.853348] cpu_stopper_thread: calling softlockup_fn+0x0/0x78 for cpu: 1
[   67.233334] INFO: task swapper/0:1 blocked for more than 30 seconds.
[   67.233610]       Tainted: G        W    L    4.19.0-rc2-next-20180904-00001-g32563c06c8db-dirty #42
[   67.239301] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   67.248525] swapper/0       D    0     1      0 0x00000028
[   67.256236] Call trace:
[   67.261631]  __switch_to+0xa8/0xf0
[   67.263546]  __schedule+0x34c/0xc90
[   67.267279]  schedule+0x34/0x98
[   67.270662]  schedule_timeout+0x250/0x540
[   67.273785]  wait_for_common+0xe0/0x1a0
[   67.277955]  wait_for_completion+0x28/0x38
[   67.281602]  __stop_cpus+0xd4/0xf8
[   67.285766]  stop_cpus+0x70/0xa8
[   67.289145]  stop_machine_cpuslocked+0x124/0x130
[   67.292276]  stop_machine+0x54/0x70
[   67.297299]  timekeeping_notify+0x44/0x70
[   67.300088]  __clocksource_select+0xa8/0x1d8
[   67.304526]  clocksource_done_booting+0x4c/0x64
[   67.309124]  do_one_initcall+0x94/0x3f8
[   67.312847]  kernel_init_freeable+0x47c/0x528
[   67.316671]  kernel_init+0x18/0x110
[   67.321603]  ret_from_fork+0x10/0x1c
[   67.324490]
[   67.324490] Showing all locks held in the system:
[   67.328566] 3 locks held by swapper/0/1:
[   67.334709]  #0: (____ptrval____) (clocksource_mutex){+.+.}, at: clocksource_done_booting+0x2c/0x64
[   67.338641]  #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: stop_machine+0x28/0x70
[   67.347569]  #2: (____ptrval____) (stop_cpus_mutex){+.+.}, at: stop_cpus+0x60/0xa8
[   67.356260] 1 lock held by khungtaskd/543:
[   67.363757]  #0: (____ptrval____) (rcu_read_lock){....}, at: debug_show_all_locks+0x14/0x1b0
[   67.367628]
[   67.376368] =============================================

These lines:
[   33.038360] hardirqs last  enabled at (701): [<ffff0000081bb60c>] multi_cpu_stop+0x194/0x220
[   33.050062] hardirqs last disabled at (702): [<ffff000008083e2c>] el1_irq+0xac/0x198
seem to indicate that an irq comes directly after local_irq_restore() in multi_cpu_stop().


I've added some debug prints. In the first stop_machine() call, everything
completes as it should. In the second call, it looks like an irq comes
directly after local_irq_restore() in multi_cpu_stop(), and after that cpu2
and cpu3 never signal completion, so wait_for_completion() never returns.


[    0.418197] stop_machine: calling stop_machine_cpuslocked with fn: __enable_cpu_capability+0x0/0x38
[    0.420051] stop_machine_cpuslocked: calling stop_cpus
[    0.429017] >>>>>>>>>>>>>>>>>>>>>>>> stop_cpus:450 taking stop_cpus_mutex
[    0.434205] __stop_cpus: queueing work
[    0.441027] queueing multi_cpu_stop+0x0/0x220 for cpu: 0
[    0.444641] queueing multi_cpu_stop+0x0/0x220 for cpu: 1
[    0.450145] queueing multi_cpu_stop+0x0/0x220 for cpu: 2
[    0.450188] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 1
[    0.455438] queueing multi_cpu_stop+0x0/0x220 for cpu: 3
[    0.458486] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 2
[    0.459101] multi_cpu_stop: cpu: 2 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.459660] multi_cpu_stop: cpu: 2 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.460713] multi_cpu_stop: cpu: 1 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.467456] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 0
[    0.469026] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 3
[    0.469635] multi_cpu_stop: cpu: 3 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.470190] multi_cpu_stop: cpu: 3 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.472945] multi_cpu_stop: cpu: 1 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.479629] multi_cpu_stop: cpu: 0 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.542478] multi_cpu_stop: cpu: 0 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    0.555518] multi_cpu_stop: cpu: 1 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    0.555523] multi_cpu_stop: cpu: 0 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    0.555830] multi_cpu_stop: cpu: 2 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    0.556074] multi_cpu_stop: cpu: 3 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    0.556366] multi_cpu_stop: cpu: 2 after local_irq_restore, in atomic: 1 irqs_disabled: 0
[    0.556624] multi_cpu_stop: cpu: 3 after local_irq_restore, in atomic: 1 irqs_disabled: 0
[    0.557133] cpu_stopper_thread: call to multi_cpu_stop+0x0/0x220 complete, setting done for cpu: 2
[    0.558722] multi_cpu_stop: cpu: 1 after local_irq_restore, in atomic: 1 irqs_disabled: 0
[    0.559213] cpu_stopper_thread: call to multi_cpu_stop+0x0/0x220 complete, setting done for cpu: 3
[    0.567138] multi_cpu_stop: cpu: 0 after local_irq_restore, in atomic: 1 irqs_disabled: 0
[    0.575551] cpu_stopper_thread: call to multi_cpu_stop+0x0/0x220 complete, setting done for cpu: 1
[    0.583972] cpu_stopper_thread: call to multi_cpu_stop+0x0/0x220 complete, setting done for cpu: 0
[    0.651730] <<<<<<<<<<<<<<<<<<<<<< __stop_cpus:411 now waiting for completion
[    0.660642] <<<<<<<<<<<<<<<<<<<<<< __stop_cpus:413 completion done
[    0.667836] >>>>>>>>>>>>>>>>>>>>>>>> stop_cpus:454 returning stop_cpus_mutex


[    5.016096] stop_machine: calling stop_machine_cpuslocked with fn: change_clocksource+0x0/0x118
[    5.016188] stop_machine_cpuslocked: calling stop_cpus
[    5.016271] >>>>>>>>>>>>>>>>>>>>>>>> stop_cpus:450 taking stop_cpus_mutex
[    5.016365] __stop_cpus: queueing work
[    5.016471] queueing multi_cpu_stop+0x0/0x220 for cpu: 0
[    5.016699] queueing multi_cpu_stop+0x0/0x220 for cpu: 1
[    5.016858] queueing multi_cpu_stop+0x0/0x220 for cpu: 2
[    5.016901] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 0
[    5.016984] multi_cpu_stop: cpu: 0 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.017004] queueing multi_cpu_stop+0x0/0x220 for cpu: 3
[    5.017063] multi_cpu_stop: cpu: 0 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.017270] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 1
[    5.017351] multi_cpu_stop: cpu: 1 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.017429] multi_cpu_stop: cpu: 1 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.046491] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 2
[    5.079044] multi_cpu_stop: cpu: 2 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.097884] multi_cpu_stop: cpu: 2 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.518726] cpu_stopper_thread: calling multi_cpu_stop+0x0/0x220 for cpu: 3
[    5.571321] multi_cpu_stop: cpu: 3 before local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.624144] multi_cpu_stop: cpu: 3 after local_save_flags, in atomic: 1 irqs_disabled: 0
[    5.680879] multi_cpu_stop: cpu: 1 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    5.680891] multi_cpu_stop: cpu: 0 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    5.681227] multi_cpu_stop: cpu: 0 after local_irq_restore, in atomic: 1 irqs_disabled: 0
[    5.682256] multi_cpu_stop: cpu: 3 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    5.682986] multi_cpu_stop: cpu: 2 before local_irq_restore, in atomic: 1 irqs_disabled: 128
[    5.688761] multi_cpu_stop: cpu: 1 after local_irq_restore, in atomic: 1 irqs_disabled: 0
[    5.697559] cpu_stopper_thread: call to multi_cpu_stop+0x0/0x220 complete, setting done for cpu: 0
[    5.705384] cpu_stopper_thread: call to multi_cpu_stop+0x0/0x220 complete, setting done for cpu: 1
[    5.705896] <<<<<<<<<<<<<<<<<<<<<< __stop_cpus:411 now waiting for completion



[   33.038360] hardirqs last  enabled at (701): [<ffff0000081bb60c>] multi_cpu_stop+0x194/0x220
translates to:
arch_local_irq_restore at /home/nks/linux-next/./arch/arm64/include/asm/irqflags.h:88 (discriminator 2)
 (inlined by) multi_cpu_stop at /home/nks/linux-next/kernel/stop_machine.c:238 (discriminator 2)

[   33.118542] pc : multi_cpu_stop+0x198/0x220
translates to:
get_current at /home/nks/linux-next/./arch/arm64/include/asm/current.h:19
 (inlined by) preempt_count at /home/nks/linux-next/./include/asm-generic/preempt.h:11
 (inlined by) multi_cpu_stop at /home/nks/linux-next/kernel/stop_machine.c:239
which equals:
 asm ("mrs %0, sp_el0" : "=r" (sp_el0));


If we got an irq directly after enabling irqs, I would expect execution to
continue once the irq handler was done. Any thoughts?
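For context, the code around kernel/stop_machine.c:238-239 (the local_irq_restore() and the return that the pc points at) looks roughly like this in v4.19 -- abridged and paraphrased from memory, not a verbatim excerpt:

```c
static int multi_cpu_stop(void *data)
{
	struct multi_stop_data *msdata = data;
	enum multi_stop_state curstate = MULTI_STOP_NONE;
	int err = 0;
	unsigned long flags;

	/* irqs may already be disabled; save the state and restore on exit */
	local_save_flags(flags);
	/* ... */

	/* Simple state machine, stepped in lockstep across all CPUs */
	do {
		cpu_relax_yield();
		if (msdata->state != curstate) {
			curstate = msdata->state;
			switch (curstate) {
			case MULTI_STOP_DISABLE_IRQ:
				local_irq_disable();
				hard_irq_disable();
				break;
			case MULTI_STOP_RUN:
				if (is_active)
					err = msdata->fn(msdata->data);
				break;
			default:
				break;
			}
			ack_state(msdata);
		}
	} while (curstate != MULTI_STOP_EXIT);

	local_irq_restore(flags);	/* line 238: "hardirqs last enabled" */
	return err;			/* line 239: the reported pc */
}
```

So the CPU appears to stall right after the final local_irq_restore(), before the stopper thread can mark its work done.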


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: stop_machine() soft lockup
  2018-09-04 19:03 stop_machine() soft lockup Niklas Cassel
@ 2018-09-05  8:42 ` Peter Zijlstra
  2018-09-05 11:47   ` Niklas Cassel
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2018-09-05  8:42 UTC (permalink / raw)
  To: Niklas Cassel; +Cc: linux-kernel, bjorn.andersson

On Tue, Sep 04, 2018 at 09:03:22PM +0200, Niklas Cassel wrote:
> Hello Peter,
> 
> I'm seeing some lockups when booting linux-next on a db820c arm64 board.
> I've tried to analyze, but I'm currently stuck.

Please see (should be in your Inbox too):

  https://lkml.kernel.org/r/20180905084158.GR24124@hirez.programming.kicks-ass.net


* Re: stop_machine() soft lockup
  2018-09-05  8:42 ` Peter Zijlstra
@ 2018-09-05 11:47   ` Niklas Cassel
  2018-09-05 13:14     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Niklas Cassel @ 2018-09-05 11:47 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, bjorn.andersson

On Wed, Sep 05, 2018 at 10:42:41AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 04, 2018 at 09:03:22PM +0200, Niklas Cassel wrote:
> > Hello Peter,
> >
> > I'm seeing some lockups when booting linux-next on a db820c arm64 board.
> > I've tried to analyze, but I'm currently stuck.
>
> Please see (should be in your Inbox too):
>
>   https://lkml.kernel.org/r/20180905084158.GR24124@hirez.programming.kicks-ass.net

I'm sorry if I misled you by replying to your other mail thread;
both of them have timekeeping_notify() in the call trace,
but my problem has this call trace:

[  128.747853]  wait_for_common+0xe0/0x1a0
[  128.752023]  wait_for_completion+0x28/0x38
[  128.755677]  __stop_cpus+0xd4/0xf8
[  128.759837]  stop_cpus+0x70/0xa8
[  128.762958]  stop_machine_cpuslocked+0x124/0x130
[  128.766345]  stop_machine+0x54/0x70
[  128.771373]  timekeeping_notify+0x44/0x70
[  128.774158]  __clocksource_select+0xa8/0x1d8
[  128.778605]  clocksource_done_booting+0x4c/0x64
[  128.782931]  do_one_initcall+0x94/0x3f8
[  128.786921]  kernel_init_freeable+0x47c/0x528
[  128.790742]  kernel_init+0x18/0x110
[  128.795673]  ret_from_fork+0x10/0x1c


while your other mail thread has this call trace:

* stop_machine()
* timekeeping_notify()
* __clocksource_select()
* clocksource_select()
* clocksource_watchdog_work()


So my problem is not related to the watchdog. I tried your revert anyway,
but unfortunately the problem persists.


In my case, what appears to happen is that at the end of multi_cpu_stop(),
local_irq_restore() is called, an irq comes directly after that, and then
after 22 seconds the soft lockup detector kicks in (so I appear to still
get timer ticks). The PC that the soft lockup detector prints is the line
after local_irq_restore(). Without any debug prints, that is simply the
end brace of the function.

I'm booting with kernel command line: "earlycon ftrace=irqsoff"
I can't seem to reproduce the problem without ftrace=irqsoff;
I'm not sure if that is because of timing or because ftrace is involved.

I managed to get another call trace with ftrace included;
unfortunately, both CPUs appear to be printing at the same time,
so their output is interleaved.

[   32.703910] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [migration/2:22]
[   32.728589] Modules linked in:
[   32.747176] irq event stamp: 154
[   32.758851] hardirqs last  enabled at (153): [<ffff0000081bb59c>] multi_cpu_stop+0xfc/0x168
[   32.770736] hardirqs last disabled at (154): [<ffff000008083e2c>] el1_irq+0xac/0x198
[   32.774911] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:0H:30]
[   32.779946] Modules linked in:
[   32.786425] softirqs last  enabled at (114): [<ffff000008082164>] __do_softirq+0x47c/0x574
[   32.795450] irq event stamp: 62
[   32.805549] softirqs last disabled at (95): [<ffff0000080ca41c>] irq_exit+0x134/0x148
[   32.810450] hardirqs last  enabled at (61): [<ffff00000908efe8>] _raw_spin_unlock_irq+0x38/0x78
[   32.816075] hardirqs last disabled at (62): [<ffff000008083e2c>] el1_irq+0xac/0x198
[   32.821812] softirqs last  enabled at (0): [<ffff0000080bdb68>] copy_process.isra.5.part.6+0x338/0x1ac$
[   32.829394] CPU: 2 PID: 22 Comm: migration/2 Tainted: G        W         4.19.0-rc2-next-20180904-0000$
-gd2e5ea917660 #47
[   32.836772] softirqs last disabled at (0): [<0000000000000000>]           (null)
[   32.846193] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[   32.857546] CPU: 3 PID: 30 Comm: kworker/3:0H Tainted: G        W         4.19.0-rc2-next-20180904-000$
2-gd2e5ea917660 #47
[   32.864532] pstate: 60400005 (nZCv daif +PAN -UAO)
[   32.870789] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[   32.881909] pc : multi_cpu_stop+0x100/0x168
[   32.887583] lr : multi_cpu_stop+0xfc/0x168
[   32.892723] sp : ffff00000ae33d70
[   32.897717] x29: ffff00000ae33d70 x28: 0000000000000000
[   32.906981] pstate: 60400005 (nZCv daif +PAN -UAO)
[   32.916202] x27: 0000000000000002 x26: ffff00000803bc80
[   32.923256] pc : _raw_spin_unlock_irq+0x3c/0x78
[   32.933817] x25: ffff0000081bb4a0
[   32.937340] lr : _raw_spin_unlock_irq+0x38/0x78
[   32.942465] sp : ffff00000ae73c60
[   32.947455] x29: ffff00000ae73c60 x28: ffff00000978b000
[   32.953060] x24: 0000000000000000
[   32.967484] x23: 0000000000000000
[   32.971265] x27: ffff8000d958db00
[   32.976343] x22: 0000000000000000
[   32.982508] x26: 0000000000000001
[   32.996862] x21: ffff00000803bca4
[   33.000630] x25: ffff8000d9e8a328
[   33.005754] x20: ffff00000803bc80
[   33.011888] x24: ffff000009087f88
[   33.026057] x19: 0000000000000004
[   33.029982] x23: ffff8000d971ad80
[   33.035127] x18: 00000000000008bd
[   33.041263] x22: ffff000009cbf000
[   33.055312] x17: 00000000000008bc
[   33.059310] x21: ffff8000d958db00
[   33.064303] x16: ffff000009fd1360
[   33.070605] x20: ffff0000080fc20c
[   33.084645] x15: ffff000009f2a300
[   33.088605] x19: ffff8000da7f4d40
[   33.093572] x14: 00000000000017fc
[   33.099943] x18: ffffffffffffffff
[   33.114000] x13: 00000000c04846ac
[   33.117948] x17: 0000000000000693
[   33.122855] x12: ffff8000d9fc63b0
[   33.129195] x16: ffff000009fcf7e0
[   33.143370] x11: 0000000000000000
[   33.147294] x15: ffff000009cbe1c8
[   33.152152] x10: 0000000000000348
[   33.158535] x14: 0000000000002fff
[   33.172812] x9 : ffff8000d9590400
[   33.176620] x13: 0000000016163b60
[   33.181506] x8 : 000000000000001c
[   33.187866] x12: ffff8000d971b630
[   33.202251] x7 : ffff8000d9c08400
[   33.205880] x11: 0000000000000000
[   33.210939] x6 : 0000000000000364
[   33.217236] x10: 0000000000000a48
[   33.231721] x5 : 0000000000005356
[   33.235281] x9 : ffff8000d9590c00
[   33.240397] x4 : 0000000000000000
[   33.246592] x8 : 000000000000001c
[   33.261065] x3 : 0000000000000000
[   33.264749] x7 : ffff8000d9c08400
[   33.269874] x2 : ffff00000aabb000
[   33.276019] x6 : 0000000000000a64
[   33.290389] x1 : 0000000000000001
[   33.294192] x5 : 000000000000cf1a
[   33.299235] x0 : ffff8000d9fc5b00
[   33.305452] x4 : 0000000000000000
[   33.319989] Call trace:
[   33.324715] x3 : 0000000000000000 x2 : ffff00000aabb000
[   33.331717]  multi_cpu_stop+0x100/0x168
[   33.341973] x1 : 0000000000000001
[   33.345891]  cpu_stopper_thread+0xa8/0x118
[   33.351777]  smpboot_thread_fn+0x1bc/0x2c0
[   33.357845] x0 : ffff8000d971ad80
[   33.362176]  kthread+0x134/0x138
[   33.371281] Call trace:
[   33.377420]  ret_from_fork+0x10/0x1c
[   33.382212]  _raw_spin_unlock_irq+0x3c/0x78
[   33.387685]  finish_task_switch+0xa4/0x200
[   33.396875]  __schedule+0x350/0xc90
[   33.408272]  preempt_schedule_notrace+0x5c/0x130
[   33.419484]  ftrace_ops_no_ops+0xf4/0x180
[   33.430795]  ftrace_graph_call+0x0/0xc
[   33.442095]  preempt_count_add+0x1c/0x130
[   33.453259]  schedule+0x2c/0x98
[   33.464586]  worker_thread+0xdc/0x478
[   33.475886]  kthread+0x134/0x138
[   33.487099]  ret_from_fork+0x10/0x1c
[   34.464807] rcu: INFO: rcu_preempt self-detected stall on CPU
[   34.487238] rcu:     2-...0: (2757 ticks this GP) idle=20e/1/0x4000000000000002 softirq=120/120 fqs=3165
[   34.509364] rcu:      (t=6514 jiffies g=-1095 q=1250)
[   34.528505] Task dump for CPU 2:
[   34.539317] migration/2     R  running task        0    22      2 0x0000002a
[   34.565716] Call trace:
[   34.577387]  dump_backtrace+0x0/0x200
[   34.588782]  show_stack+0x24/0x30
[   34.600098]  sched_show_task+0x20c/0x2d8
[   34.611726]  dump_cpu_task+0x48/0x58
[   34.623109]  rcu_dump_cpu_stacks+0xa0/0xe0
[   34.634637]  rcu_check_callbacks+0x85c/0xb60
[   34.646324]  update_process_times+0x34/0x60
[   34.657569]  tick_periodic+0x58/0x110
[   34.669169]  tick_handle_periodic+0x94/0xc8
[   34.680719]  arch_timer_handler_virt+0x38/0x58
[   34.692090]  handle_percpu_devid_irq+0xe4/0x458
[   34.703783]  generic_handle_irq+0x34/0x50
[   34.715121]  __handle_domain_irq+0x8c/0xf8
[   34.726529]  gic_handle_irq+0x84/0x180
[   34.738131]  el1_irq+0xec/0x198
[   34.749285]  multi_cpu_stop+0x100/0x168
[   34.760840]  cpu_stopper_thread+0xa8/0x118
[   34.772384]  smpboot_thread_fn+0x1bc/0x2c0
[   34.783587]  kthread+0x134/0x138
[   34.795188]  ret_from_fork+0x10/0x1c



Could perhaps
[   32.821812] softirqs last  enabled at (0): [<ffff0000080bdb68>] copy_process.isra.5.part.6+0x338/0x1ac
be involved in why execution never appears to continue after the irq?


Kind regards,
Niklas


* Re: stop_machine() soft lockup
  2018-09-05 11:47   ` Niklas Cassel
@ 2018-09-05 13:14     ` Peter Zijlstra
  2018-09-05 13:45       ` Niklas Cassel
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2018-09-05 13:14 UTC (permalink / raw)
  To: Niklas Cassel; +Cc: linux-kernel, bjorn.andersson

On Wed, Sep 05, 2018 at 01:47:49PM +0200, Niklas Cassel wrote:
> On Wed, Sep 05, 2018 at 10:42:41AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 04, 2018 at 09:03:22PM +0200, Niklas Cassel wrote:
> > > Hello Peter,
> > >
> > > I'm seeing some lockups when booting linux-next on a db820c arm64 board.
> > > I've tried to analyze, but I'm currently stuck.
> >
> > Please see (should be in your Inbox too):
> >
> >   https://lkml.kernel.org/r/20180905084158.GR24124@hirez.programming.kicks-ass.net
> 
> I'm sorry if I misled you by replying to your other mail thread;
> both of them have timekeeping_notify() in the call trace,
> but my problem has this call trace:
> 
> [  128.747853]  wait_for_common+0xe0/0x1a0
> [  128.752023]  wait_for_completion+0x28/0x38
> [  128.755677]  __stop_cpus+0xd4/0xf8
> [  128.759837]  stop_cpus+0x70/0xa8
> [  128.762958]  stop_machine_cpuslocked+0x124/0x130
> [  128.766345]  stop_machine+0x54/0x70
> [  128.771373]  timekeeping_notify+0x44/0x70
> [  128.774158]  __clocksource_select+0xa8/0x1d8
> [  128.778605]  clocksource_done_booting+0x4c/0x64
> [  128.782931]  do_one_initcall+0x94/0x3f8
> [  128.786921]  kernel_init_freeable+0x47c/0x528
> [  128.790742]  kernel_init+0x18/0x110
> [  128.795673]  ret_from_fork+0x10/0x1c
> 
> 
> while your other mail thread has this call trace:
> 
> * stop_machine()
> * timekeeping_notify()
> * __clocksource_select()
> * clocksource_select()
> * clocksource_watchdog_work()
> 
> 
> So my problem is not related to the watchdog. I tried your revert anyway,
> but unfortunately the problem persists.

Oh, right, missed that distinction. And this is new?

I'll try and have a look. Lockdep doesn't suggest anything?


* Re: stop_machine() soft lockup
  2018-09-05 13:14     ` Peter Zijlstra
@ 2018-09-05 13:45       ` Niklas Cassel
  2018-09-06 12:56         ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Niklas Cassel @ 2018-09-05 13:45 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, bjorn.andersson

On Wed, Sep 05, 2018 at 03:14:10PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 05, 2018 at 01:47:49PM +0200, Niklas Cassel wrote:
> > On Wed, Sep 05, 2018 at 10:42:41AM +0200, Peter Zijlstra wrote:
> > > On Tue, Sep 04, 2018 at 09:03:22PM +0200, Niklas Cassel wrote:
> > > > Hello Peter,
> > > >
> > > > I'm seeing some lockups when booting linux-next on a db820c arm64 board.
> > > > I've tried to analyze, but I'm currently stuck.
> > >
> > > Please see (should be in your Inbox too):
> > >
> > >   https://lkml.kernel.org/r/20180905084158.GR24124@hirez.programming.kicks-ass.net
> >
> > I'm sorry if I misled you by replying to your other mail thread;
> > both of them have timekeeping_notify() in the call trace,
> > but my problem has this call trace:
> >
> > [  128.747853]  wait_for_common+0xe0/0x1a0
> > [  128.752023]  wait_for_completion+0x28/0x38
> > [  128.755677]  __stop_cpus+0xd4/0xf8
> > [  128.759837]  stop_cpus+0x70/0xa8
> > [  128.762958]  stop_machine_cpuslocked+0x124/0x130
> > [  128.766345]  stop_machine+0x54/0x70
> > [  128.771373]  timekeeping_notify+0x44/0x70
> > [  128.774158]  __clocksource_select+0xa8/0x1d8
> > [  128.778605]  clocksource_done_booting+0x4c/0x64
> > [  128.782931]  do_one_initcall+0x94/0x3f8
> > [  128.786921]  kernel_init_freeable+0x47c/0x528
> > [  128.790742]  kernel_init+0x18/0x110
> > [  128.795673]  ret_from_fork+0x10/0x1c
> >
> >
> > while your other mail thread has this call trace:
> >
> > * stop_machine()
> > * timekeeping_notify()
> > * __clocksource_select()
> > * clocksource_select()
> > * clocksource_watchdog_work()
> >
> >
> > So my problem is not related to the watchdog. I tried your revert anyway,
> > but unfortunately the problem persists.
>
> Oh, right, missed that distinction. And this is new?

I can reproduce it on 4.14.15, so it's probably not that new.

>
> I'll try and have a look. Lockdep doesn't suggest anything?

Thanks.

Usually, when seeing these soft lockups, no.

However, I just managed to get this lockdep splat when booting without
earlycon and ftrace=irqsoff.
I'm not sure that it is related, but I'm planning on looking into it anyway:

[   70.644248] Bluetooth: hci0: QCA Failed to request file: qca/rampatch_00440302.bin (-11)
[   70.644299] Bluetooth: hci0: QCA Failed to download patch (-11)
[   70.644475] cfg80211: failed to load regulatory.db
[   70.658037] remoteproc remoteproc0: powering up adsp-pil
[   70.663080] remoteproc remoteproc0: Direct firmware load for adsp.mdt failed with error -2
[   70.667438]
[   70.667445] ======================================================
[   70.667450] WARNING: possible circular locking dependency detected
[   70.667459] 4.19.0-rc2-next-20180904-00002-gd2e5ea917660-dirty #44 Tainted: G        W
[   70.667464] ------------------------------------------------------
[   70.667471] kworker/0:1/14 is trying to acquire lock:
[   70.667478] (____ptrval____) (&(&pool->lock)->rlock){-.-.}, at: __queue_work+0x3d4/0x6a0
[   70.667512]
[   70.667512] but task is already holding lock:
[   70.667517] (____ptrval____) (&port_lock_key){-.-.}, at: msm_uart_irq+0x38/0x710
[   70.667545]
[   70.667545] which lock already depends on the new lock.
[   70.667545]
[   70.667550]
[   70.667550] the existing dependency chain (in reverse order) is:
[   70.667555]
[   70.667555] -> #2 (&port_lock_key){-.-.}:
[   70.667585]        _raw_spin_lock+0x44/0x58
[   70.667591]        __msm_console_write+0x84/0x1e8
[   70.667598]        msm_console_write+0x64/0x78
[   70.667608]        console_unlock+0x400/0x610
[   70.667615]        register_console+0x290/0x3b8
[   70.667622]        uart_add_one_port+0x4cc/0x4d8
[   70.667628]        msm_serial_probe+0x158/0x1d8
[   70.667639]        platform_drv_probe+0x58/0xa8
[   70.667645]        really_probe+0x280/0x3d8
[   70.667651]        driver_probe_device+0x60/0x148
[   70.667658]        __driver_attach+0x144/0x148
[   70.667665]        bus_for_each_dev+0x84/0xd8
[   70.667671]        driver_attach+0x30/0x40
[   70.667677]        bus_add_driver+0x234/0x2a8
[   70.667684]        driver_register+0x64/0x110
[   70.667691]        __platform_driver_register+0x54/0x60
[   70.667700]        msm_serial_init+0x40/0x70
[   70.667709]        do_one_initcall+0x94/0x3f8
[   70.667717]        kernel_init_freeable+0x47c/0x528
[   70.667726]        kernel_init+0x18/0x110
[   70.667732]        ret_from_fork+0x10/0x1c
[   70.667737]
[   70.667737]
[   70.667737] -> #1 (console_owner){-.-.}:
[   70.667762]        console_unlock+0x298/0x610
[   70.667769]        vprintk_emit+0x110/0x298
[   70.667776]        vprintk_default+0x48/0x58
[   70.667782]        vprintk_func+0x100/0x200
[   70.667789]        printk+0x74/0x94
[   70.667797]        __warn_printk+0x48/0xa0
[   70.667803]        check_flush_dependency+0xf8/0x150
[   70.667809]        __flush_work+0xc0/0x2e0
[   70.667815]        __cancel_work_timer+0x154/0x1c8
[   70.667821]        cancel_delayed_work_sync+0x24/0x30
[   70.667832]        ufshcd_ungate_work+0x30/0x100
[   70.667838]        process_one_work+0x2a0/0x710
[   70.667844]        worker_thread+0x48/0x478
[   70.667852]        kthread+0x134/0x138
[   70.667858]        ret_from_fork+0x10/0x1c
[   70.667863]
[   70.667863] -> #0 (&(&pool->lock)->rlock){-.-.}:
[   70.667890]        lock_acquire+0xc0/0x230
[   70.667897]        _raw_spin_lock+0x44/0x58
[   70.667903]        __queue_work+0x3d4/0x6a0
[   70.667909]        queue_work_on+0xc8/0xd0
[   70.667920]        hci_uart_tx_wakeup+0x188/0x228
[   70.667926]        hci_uart_write_wakeup+0x70/0x78
[   70.667933]        ttyport_write_wakeup+0xdc/0xe8
[   70.667940]        tty_port_tty_wakeup+0x28/0x38
[   70.667946]        uart_write_wakeup+0x24/0x38
[   70.667952]        msm_handle_tx_pio+0x94/0x1d0
[   70.667958]        msm_handle_tx+0x10c/0x3c0
[   70.667964]        msm_uart_irq+0x310/0x710
[   70.667972]        __handle_irq_event_percpu+0xb8/0x440
[   70.667978]        handle_irq_event_percpu+0x40/0x98
[   70.667983]        handle_irq_event+0x50/0x80
[   70.667990]        handle_fasteoi_irq+0xc8/0x1a0
[   70.667998]        generic_handle_irq+0x34/0x50
[   70.668004]        __handle_domain_irq+0x8c/0xf8
[   70.668010]        gic_handle_irq+0x84/0x180
[   70.668016]        el1_irq+0xec/0x198
[   70.668022]        console_unlock+0x4e0/0x610
[   70.668029]        vprintk_emit+0x110/0x298
[   70.668036]        dev_vprintk_emit+0x150/0x248
[   70.668042]        dev_printk_emit+0x84/0xa8
[   70.668048]        __dev_printk+0x5c/0xa0
[   70.668054]        _dev_warn+0x74/0x98
[   70.668062]        _request_firmware+0x428/0x5d8
[   70.668068]        request_firmware+0x40/0x50
[   70.668076]        rproc_boot+0xfc/0x480
[   70.668081]        rproc_auto_boot_callback+0x24/0x38
[   70.668087]        request_firmware_work_func+0x50/0x88
[   70.668094]        process_one_work+0x2a0/0x710
[   70.668100]        worker_thread+0x48/0x478
[   70.668106]        kthread+0x134/0x138
[   70.668113]        ret_from_fork+0x10/0x1c
[   70.668118]
[   70.668124] Chain exists of:
[   70.668124]   &(&pool->lock)->rlock --> console_owner --> &port_lock_key
[   70.668124]
[   70.668155]  Possible unsafe locking scenario:
[   70.668155]
[   70.668160]        CPU0                    CPU1
[   70.668164]        ----                    ----
[   70.668169]   lock(&port_lock_key);
[   70.668182]                                lock(console_owner);
[   70.668195]                                lock(&port_lock_key);
[   70.668209]   lock(&(&pool->lock)->rlock);
[   70.668222]
[   70.668222]  *** DEADLOCK ***
[   70.668222]
[   70.668229] 6 locks held by kworker/0:1/14:
[   70.668234]  #0: (____ptrval____) ((wq_completion)"events"){+.+.}, at: process_one_work+0x1f8/0x710
[   70.668263]  #1: (____ptrval____) ((work_completion)(&fw_work->work)){+.+.}, at: process_one_work+0x1f8/0x710
[   70.668293]  #2: (____ptrval____) (&rproc->lock){+.+.}, at: rproc_boot+0x4c/0x480
[   70.668322]  #3: (____ptrval____) (console_lock){+.+.}, at: vprintk_emit+0x108/0x298
[   70.668351]  #4: (____ptrval____) (&port_lock_key){-.-.}, at: msm_uart_irq+0x38/0x710
[   70.668379]  #5: (____ptrval____) (&hu->proto_lock){.+.+}, at: hci_uart_write_wakeup+0x70/0x78
[   70.668409]
[   70.668409] stack backtrace:
[   70.668418] CPU: 0 PID: 14 Comm: kworker/0:1 Tainted: G        W         4.19.0-rc2-next-20180904-00002-gd2e5ea917660-dirty #44
[   70.668423] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[   70.668435] Workqueue: events request_firmware_work_func
[   70.668446] Call trace:
[   70.668454]  dump_backtrace+0x0/0x200
[   70.668460]  show_stack+0x24/0x30
[   70.668469]  dump_stack+0xac/0xe4
[   70.668476]  print_circular_bug.isra.19+0x1d4/0x2e8
[   70.668482]  __lock_acquire+0x1814/0x1878
[   70.668489]  lock_acquire+0xc0/0x230
[   70.668495]  _raw_spin_lock+0x44/0x58
[   70.668501]  __queue_work+0x3d4/0x6a0
[   70.668507]  queue_work_on+0xc8/0xd0
[   70.668514]  hci_uart_tx_wakeup+0x188/0x228
[   70.668520]  hci_uart_write_wakeup+0x70/0x78
[   70.668527]  ttyport_write_wakeup+0xdc/0xe8
[   70.668533]  tty_port_tty_wakeup+0x28/0x38
[   70.668539]  uart_write_wakeup+0x24/0x38
[   70.668544]  msm_handle_tx_pio+0x94/0x1d0
[   70.668550]  msm_handle_tx+0x10c/0x3c0
[   70.668557]  msm_uart_irq+0x310/0x710
[   70.668563]  __handle_irq_event_percpu+0xb8/0x440
[   70.668568]  handle_irq_event_percpu+0x40/0x98
[   70.668574]  handle_irq_event+0x50/0x80
[   70.668581]  handle_fasteoi_irq+0xc8/0x1a0
[   70.668588]  generic_handle_irq+0x34/0x50
[   70.668594]  __handle_domain_irq+0x8c/0xf8
[   70.668599]  gic_handle_irq+0x84/0x180
[   70.668606]  el1_irq+0xec/0x198
[   70.668612]  console_unlock+0x4e0/0x610
[   70.668619]  vprintk_emit+0x110/0x298
[   70.668625]  dev_vprintk_emit+0x150/0x248
[   70.668631]  dev_printk_emit+0x84/0xa8
[   70.668637]  __dev_printk+0x5c/0xa0
[   70.668643]  _dev_warn+0x74/0x98
[   70.668649]  _request_firmware+0x428/0x5d8
[   70.668656]  request_firmware+0x40/0x50
[   70.668662]  rproc_boot+0xfc/0x480
[   70.668668]  rproc_auto_boot_callback+0x24/0x38
[   70.668674]  request_firmware_work_func+0x50/0x88
[   70.668680]  process_one_work+0x2a0/0x710
[   70.668686]  worker_thread+0x48/0x478
[   70.668693]  kthread+0x134/0x138
[   70.668699]  ret_from_fork+0x10/0x1c


Kind regards,
Niklas


* Re: stop_machine() soft lockup
  2018-09-05 13:45       ` Niklas Cassel
@ 2018-09-06 12:56         ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2018-09-06 12:56 UTC (permalink / raw)
  To: Niklas Cassel; +Cc: linux-kernel, bjorn.andersson

On Wed, Sep 05, 2018 at 03:45:53PM +0200, Niklas Cassel wrote:

> However, I just managed to get this lockdep splat when booting without
> earlycon and ftrace=irqsoff
> I'm not sure that it is related, but I'm planning on looking into it anyway:

That looks like your generic "printk is crap" splat.

I tend to use this patch-set:

  https://lkml.kernel.org/r/20170928121823.430053219@infradead.org

and have (on x86):

  earlyprintk=serial,ttyS0,115200 force_early_printk debug ignore_loglevel

That results in a printk that does pure UART bit-banging and works from
any context without issue.



Thread overview: 6+ messages
2018-09-04 19:03 stop_machine() soft lockup Niklas Cassel
2018-09-05  8:42 ` Peter Zijlstra
2018-09-05 11:47   ` Niklas Cassel
2018-09-05 13:14     ` Peter Zijlstra
2018-09-05 13:45       ` Niklas Cassel
2018-09-06 12:56         ` Peter Zijlstra
