All of lore.kernel.org
 help / color / mirror / Atom feed
* console: lockup on boot
@ 2014-05-30 13:11 Sasha Levin
  2014-05-30 13:58 ` Peter Hurley
  0 siblings, 1 reply; 17+ messages in thread
From: Sasha Levin @ 2014-05-30 13:11 UTC (permalink / raw)
  To: Jan Kara, pmladek, Andrew Morton; +Cc: LKML

Hi all,

I sometime see lockups when booting my KVM guest with the latest -next kernel,
it basically hangs right when it should start 'init', and after a while I get
the following spew:

[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[   30.770667] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[   30.770667] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.541585] kworker/dying (172) used greatest stack depth: 3656 bytes left
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.010589] INFO: task swapper/0:1 blocked for more than 200 seconds.
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] general protection fault: 0000 [#1]
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[ 2950.759251] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 3109.224481] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 3251.771402] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 3401.522129] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 3568.564422] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 3754.638392] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 3941.597288] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 4114.243312] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 4287.428963] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
[ 4458.566115] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
[   30.790833] PANIC: double fault, error_code: 0x0
[  320.551104]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
[  408.020558]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: 33
[  321.761422]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
[  320.551104] CPU: 33 PID: 172 Comm: kworker/dying Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
[  320.551104]  ffffffff8e087060 ffff88012a800b38 ffffffff8c4e92e3 0000000000000001
[   30.790833]  ffff88007bdd3000
[   30.790833] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
[   30.790833]  ffff88012a800b58
[   30.790833] task: ffff880a2b3f0000 ti: ffff88007c1d0000 task.ti: ffff88007c1d0000
[   30.790833]  ffffffff8c4dc9fe
[  320.551104] RIP:  (??:?)
[  320.551104]  ffffffff8e087060
[   30.790833]  0000000086c41770
[   30.790833] string.isra.5 (lib/vsprintf.c:506)
[   30.790833]  ffff88012a800b88
[   30.790833] RSP: 0000:ffff88012ba00000  EFLAGS: 00010092
[   30.790833]  ffffffff891d08b4
[   30.790833] RAX: ffffffff8d6ee711 RBX: ffffffff906949f0 RCX: 0000000000000000
[   30.790833]  ffffffff8e087078
[   30.790833] RDX: ffffffff8d6eb246 RSI: ffffffffffffffff RDI: ffffffff906949f0
[  320.551104]
[   30.790833] RBP: ffff88012ba00018 R08: 000000000000ffff R09: 000000000000ffff
[  320.551104] Call Trace:
[   30.790833] R10: 73203a4755423001 R11: 206b636f6c6e6970 R12: ffffffff90694dc0
[   30.790833] R13: ffffffff90694dc0 R14: 000000000000ffff R15: 0000000000000000
[   30.790833]  <UNK>
[   30.790833] FS:  0000000000000000(0000) GS:ffff88012ba00000(0000) knlGS:0000000000000000
[  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
[   30.790833] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  320.551104] sending NMI to all CPUs:
[   30.790833] CR2: ffff88012b9ffff8 CR3: 000000000e02c000 CR4: 00000000000006a0
[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
[  321.761422]  lock: logbuf_lock+0x0/0x48, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
[  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
[  321.761422] Workqueue: events_power_efficient fb_flashcursor
[  321.761422]  ffffffff8e087000 ffff88007da06c78 ffffffff8c4e92e3 0000000000000000
[  321.761422]  ffff88007bdd3000 ffff88007da06c98 ffffffff8c4dc9fe ffffffff8e087000
[  321.761422]  0000000086c41770 ffff88007da06cc8 ffffffff891d08b4 ffffffff8e087018
[  321.761422] Call Trace:
[  321.761422] <NMI> dump_stack (lib/dump_stack.c:52)
[  321.761422] spin_dump (kernel/locking/spinlock_debug.c:68 (discriminator 6))
[  321.761422] do_raw_spin_lock (include/linux/nmi.h:35 kernel/locking/spinlock_debug.c:119 kernel/locking/spinlock_debug.c:137)
[  321.761422] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[  321.761422] ? vprintk_emit (kernel/printk/printk.c:1620)
[  321.761422] vprintk_emit (kernel/printk/printk.c:1620)
[  321.761422] printk (kernel/printk/printk.c:1806)
[  321.761422] arch_trigger_all_cpu_backtrace_handler (arch/x86/kernel/apic/hw_nmi.c:85)
[  321.761422] nmi_handle (arch/x86/kernel/nmi.c:133)
[  321.761422] ? nmi_handle (arch/x86/kernel/nmi.c:114)
[  321.761422] default_do_nmi (arch/x86/kernel/nmi.c:336)
[  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
[  321.761422] do_nmi (./arch/x86/include/asm/current.h:14 arch/x86/kernel/nmi.c:540)
[  321.761422] end_repeat_nmi (arch/x86/kernel/entry_64.S:1701)
[  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
[  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
[  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
[  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
[  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
[  321.761422]  <<EOE>>  <UNK>
[  321.761422] NMI backtrace for cpu 0
[  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
[  321.761422] Workqueue: events_power_efficient fb_flashcursor
[  321.761422] task: ffff88007bdd3000 ti: ffff88007bde8000 task.ti: ffff88007bde8000
[  321.761422] RIP: get_symbol_pos (kernel/kallsyms.c:230)
[  321.761422] RSP: 0000:ffff88007bde5c78  EFLAGS: 00000006
[  321.761422] RAX: 0000000000008c7c RBX: ffff88007bde5cd0 RCX: 0000000000009831
[  321.761422] RDX: ffff88007bde5cd8 RSI: ffff88007bde5ce0 RDI: ffffffff89bba9c0
[  321.761422] RBP: ffff88007bde5c78 R08: 000000000000a3e6 R09: 000000000000ffff
[  321.761422] R10: 000000000005da8d R11: 3a6d6d6f43203438 R12: ffff88007bde5d41
[  321.761422] R13: ffffffff89bba9c0 R14: 0000000000000000 R15: ffffffff8d6e8c47
[  321.761422] FS:  0000000000000000(0000) GS:ffff88007da00000(0000) knlGS:0000000000000000
[  321.761422] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  321.761422] CR2: 00000000ffffffff CR3: 000000000e02c000 CR4: 00000000000006b0
[  321.761422] Stack:
[  321.761422]  ffff88007bde5cb8 ffffffff8920eb4e 70202c786c383025 ffff88007bde5e31
[  321.761422]  ffff8800fbde5e2b ffff88007bde5d41 ffffffff89bba9c0 0000000000000000
[  321.761422]  ffff88007bde5d08 ffffffff8920ec42 522829676e6fffff 296e656c3e2d4345
[  321.761422] Call Trace:
[  321.761422]  <UNK>
[ 321.761422] Code: 30 0e f7 8c 55 48 85 c0 48 89 e5 75 02 0f 0b 4c 8b 15 ca fc 04 04 31 c0 4d 89 d0 eb 16 48 d1 e9 48 01 c1 48 39 3c cd 30 0e f7 8c <4c> 0f 47 c1 48 0f 46 c1 4c 89 c1 48 29 c1 48 83 f9 01 77 de 48
All code
========
   0:	30 0e                	xor    %cl,(%rsi)
   2:	f7                   	(bad)
   3:	8c 55 48             	mov    %ss,0x48(%rbp)
   6:	85 c0                	test   %eax,%eax
   8:	48 89 e5             	mov    %rsp,%rbp
   b:	75 02                	jne    0xf
   d:	0f 0b                	ud2
   f:	4c 8b 15 ca fc 04 04 	mov    0x404fcca(%rip),%r10        # 0x404fce0
  16:	31 c0                	xor    %eax,%eax
  18:	4d 89 d0             	mov    %r10,%r8
  1b:	eb 16                	jmp    0x33
  1d:	48 d1 e9             	shr    %rcx
  20:	48 01 c1             	add    %rax,%rcx
  23:	48 39 3c cd 30 0e f7 	cmp    %rdi,-0x7308f1d0(,%rcx,8)
  2a:	8c
  2b:*	4c 0f 47 c1          	cmova  %rcx,%r8		<-- trapping instruction
  2f:	48 0f 46 c1          	cmovbe %rcx,%rax
  33:	4c 89 c1             	mov    %r8,%rcx
  36:	48 29 c1             	sub    %rax,%rcx
  39:	48 83 f9 01          	cmp    $0x1,%rcx
  3d:	77 de                	ja     0x1d
  3f:	48                   	rex.W
	...

Code starting with the faulting instruction
===========================================
   0:	4c 0f 47 c1          	cmova  %rcx,%r8
   4:	48 0f 46 c1          	cmovbe %rcx,%rax
   8:	4c 89 c1             	mov    %r8,%rcx
   b:	48 29 c1             	sub    %rax,%rcx
   e:	48 83 f9 01          	cmp    $0x1,%rcx
  12:	77 de                	ja     0xfffffffffffffff2
  14:	48                   	rex.W
	...
[   30.790833] NMI backtrace for cpu 1
[hang]


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-05-30 13:11 console: lockup on boot Sasha Levin
@ 2014-05-30 13:58 ` Peter Hurley
  2014-05-30 14:07   ` Jan Kara
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Hurley @ 2014-05-30 13:58 UTC (permalink / raw)
  To: Sasha Levin, Jan Kara, pmladek, Andrew Morton, Jet Chen; +Cc: LKML

On 05/30/2014 09:11 AM, Sasha Levin wrote:
> Hi all,
>
> I sometime see lockups when booting my KVM guest with the latest -next kernel,
> it basically hangs right when it should start 'init', and after a while I get
> the following spew:
>
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0

Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
from Jet Chen which was bisected to

commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
Author:     Jan Kara <jack@suse.cz>
AuthorDate: Thu May 22 10:43:35 2014 +1000
Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
CommitDate: Thu May 22 10:43:35 2014 +1000

     printk: enable interrupts before calling console_trylock_for_printk()
         We need interrupts disabled when calling console_trylock_for_printk() only
     so that cpu id we pass to can_use_console() remains valid (for other
     things console_sem provides all the exclusion we need and deadlocks on
     console_sem due to interrupts are impossible because we use
     down_trylock()).  However if we are rescheduled, we are guaranteed to run
     on an online cpu so we can easily just get the cpu id in
     can_use_console().
         We can lose a bit of performance when we enable interrupts in
     vprintk_emit() and then disable them again in console_unlock() but OTOH it
     can somewhat reduce interrupt latency caused by console_unlock()
     especially since later in the patch series we will want to spin on
     console_sem in console_trylock_for_printk().
         Signed-off-by: Jan Kara <jack@suse.cz>
     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

?

> [   30.770667] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [   30.770667] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.541585] kworker/dying (172) used greatest stack depth: 3656 bytes left
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.010589] INFO: task swapper/0:1 blocked for more than 200 seconds.
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] general protection fault: 0000 [#1]
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [ 2950.759251] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 3109.224481] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 3251.771402] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 3401.522129] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 3568.564422] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 3754.638392] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 3941.597288] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 4114.243312] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 4287.428963] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> [ 4458.566115] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> [  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> [   30.790833] PANIC: double fault, error_code: 0x0
> [  320.551104]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
> [  408.020558]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: 33
> [  321.761422]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
> [  320.551104] CPU: 33 PID: 172 Comm: kworker/dying Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> [  320.551104]  ffffffff8e087060 ffff88012a800b38 ffffffff8c4e92e3 0000000000000001
> [   30.790833]  ffff88007bdd3000
> [   30.790833] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> [   30.790833]  ffff88012a800b58
> [   30.790833] task: ffff880a2b3f0000 ti: ffff88007c1d0000 task.ti: ffff88007c1d0000
> [   30.790833]  ffffffff8c4dc9fe
> [  320.551104] RIP:  (??:?)
> [  320.551104]  ffffffff8e087060
> [   30.790833]  0000000086c41770
> [   30.790833] string.isra.5 (lib/vsprintf.c:506)
> [   30.790833]  ffff88012a800b88
> [   30.790833] RSP: 0000:ffff88012ba00000  EFLAGS: 00010092
> [   30.790833]  ffffffff891d08b4
> [   30.790833] RAX: ffffffff8d6ee711 RBX: ffffffff906949f0 RCX: 0000000000000000
> [   30.790833]  ffffffff8e087078
> [   30.790833] RDX: ffffffff8d6eb246 RSI: ffffffffffffffff RDI: ffffffff906949f0
> [  320.551104]
> [   30.790833] RBP: ffff88012ba00018 R08: 000000000000ffff R09: 000000000000ffff
> [  320.551104] Call Trace:
> [   30.790833] R10: 73203a4755423001 R11: 206b636f6c6e6970 R12: ffffffff90694dc0
> [   30.790833] R13: ffffffff90694dc0 R14: 000000000000ffff R15: 0000000000000000
> [   30.790833]  <UNK>
> [   30.790833] FS:  0000000000000000(0000) GS:ffff88012ba00000(0000) knlGS:0000000000000000
> [  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> [   30.790833] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  320.551104] sending NMI to all CPUs:
> [   30.790833] CR2: ffff88012b9ffff8 CR3: 000000000e02c000 CR4: 00000000000006a0
> [  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> [  321.761422]  lock: logbuf_lock+0x0/0x48, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
> [  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> [  321.761422] Workqueue: events_power_efficient fb_flashcursor
> [  321.761422]  ffffffff8e087000 ffff88007da06c78 ffffffff8c4e92e3 0000000000000000
> [  321.761422]  ffff88007bdd3000 ffff88007da06c98 ffffffff8c4dc9fe ffffffff8e087000
> [  321.761422]  0000000086c41770 ffff88007da06cc8 ffffffff891d08b4 ffffffff8e087018
> [  321.761422] Call Trace:
> [  321.761422] <NMI> dump_stack (lib/dump_stack.c:52)
> [  321.761422] spin_dump (kernel/locking/spinlock_debug.c:68 (discriminator 6))
> [  321.761422] do_raw_spin_lock (include/linux/nmi.h:35 kernel/locking/spinlock_debug.c:119 kernel/locking/spinlock_debug.c:137)
> [  321.761422] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
> [  321.761422] ? vprintk_emit (kernel/printk/printk.c:1620)
> [  321.761422] vprintk_emit (kernel/printk/printk.c:1620)
> [  321.761422] printk (kernel/printk/printk.c:1806)
> [  321.761422] arch_trigger_all_cpu_backtrace_handler (arch/x86/kernel/apic/hw_nmi.c:85)
> [  321.761422] nmi_handle (arch/x86/kernel/nmi.c:133)
> [  321.761422] ? nmi_handle (arch/x86/kernel/nmi.c:114)
> [  321.761422] default_do_nmi (arch/x86/kernel/nmi.c:336)
> [  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
> [  321.761422] do_nmi (./arch/x86/include/asm/current.h:14 arch/x86/kernel/nmi.c:540)
> [  321.761422] end_repeat_nmi (arch/x86/kernel/entry_64.S:1701)
> [  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
> [  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
> [  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
> [  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
> [  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
> [  321.761422]  <<EOE>>  <UNK>
> [  321.761422] NMI backtrace for cpu 0
> [  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> [  321.761422] Workqueue: events_power_efficient fb_flashcursor
> [  321.761422] task: ffff88007bdd3000 ti: ffff88007bde8000 task.ti: ffff88007bde8000
> [  321.761422] RIP: get_symbol_pos (kernel/kallsyms.c:230)
> [  321.761422] RSP: 0000:ffff88007bde5c78  EFLAGS: 00000006
> [  321.761422] RAX: 0000000000008c7c RBX: ffff88007bde5cd0 RCX: 0000000000009831
> [  321.761422] RDX: ffff88007bde5cd8 RSI: ffff88007bde5ce0 RDI: ffffffff89bba9c0
> [  321.761422] RBP: ffff88007bde5c78 R08: 000000000000a3e6 R09: 000000000000ffff
> [  321.761422] R10: 000000000005da8d R11: 3a6d6d6f43203438 R12: ffff88007bde5d41
> [  321.761422] R13: ffffffff89bba9c0 R14: 0000000000000000 R15: ffffffff8d6e8c47
> [  321.761422] FS:  0000000000000000(0000) GS:ffff88007da00000(0000) knlGS:0000000000000000
> [  321.761422] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  321.761422] CR2: 00000000ffffffff CR3: 000000000e02c000 CR4: 00000000000006b0
> [  321.761422] Stack:
> [  321.761422]  ffff88007bde5cb8 ffffffff8920eb4e 70202c786c383025 ffff88007bde5e31
> [  321.761422]  ffff8800fbde5e2b ffff88007bde5d41 ffffffff89bba9c0 0000000000000000
> [  321.761422]  ffff88007bde5d08 ffffffff8920ec42 522829676e6fffff 296e656c3e2d4345
> [  321.761422] Call Trace:
> [  321.761422]  <UNK>
> [ 321.761422] Code: 30 0e f7 8c 55 48 85 c0 48 89 e5 75 02 0f 0b 4c 8b 15 ca fc 04 04 31 c0 4d 89 d0 eb 16 48 d1 e9 48 01 c1 48 39 3c cd 30 0e f7 8c <4c> 0f 47 c1 48 0f 46 c1 4c 89 c1 48 29 c1 48 83 f9 01 77 de 48
> All code
> ========
>     0:	30 0e                	xor    %cl,(%rsi)
>     2:	f7                   	(bad)
>     3:	8c 55 48             	mov    %ss,0x48(%rbp)
>     6:	85 c0                	test   %eax,%eax
>     8:	48 89 e5             	mov    %rsp,%rbp
>     b:	75 02                	jne    0xf
>     d:	0f 0b                	ud2
>     f:	4c 8b 15 ca fc 04 04 	mov    0x404fcca(%rip),%r10        # 0x404fce0
>    16:	31 c0                	xor    %eax,%eax
>    18:	4d 89 d0             	mov    %r10,%r8
>    1b:	eb 16                	jmp    0x33
>    1d:	48 d1 e9             	shr    %rcx
>    20:	48 01 c1             	add    %rax,%rcx
>    23:	48 39 3c cd 30 0e f7 	cmp    %rdi,-0x7308f1d0(,%rcx,8)
>    2a:	8c
>    2b:*	4c 0f 47 c1          	cmova  %rcx,%r8		<-- trapping instruction
>    2f:	48 0f 46 c1          	cmovbe %rcx,%rax
>    33:	4c 89 c1             	mov    %r8,%rcx
>    36:	48 29 c1             	sub    %rax,%rcx
>    39:	48 83 f9 01          	cmp    $0x1,%rcx
>    3d:	77 de                	ja     0x1d
>    3f:	48                   	rex.W
> 	...
>
> Code starting with the faulting instruction
> ===========================================
>     0:	4c 0f 47 c1          	cmova  %rcx,%r8
>     4:	48 0f 46 c1          	cmovbe %rcx,%rax
>     8:	4c 89 c1             	mov    %r8,%rcx
>     b:	48 29 c1             	sub    %rax,%rcx
>     e:	48 83 f9 01          	cmp    $0x1,%rcx
>    12:	77 de                	ja     0xfffffffffffffff2
>    14:	48                   	rex.W
> 	...
> [   30.790833] NMI backtrace for cpu 1
> [hang]
>
>
> Thanks,
> Sasha
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-05-30 13:58 ` Peter Hurley
@ 2014-05-30 14:07   ` Jan Kara
  2014-06-06 19:05     ` Sasha Levin
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2014-05-30 14:07 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Sasha Levin, Jan Kara, pmladek, Andrew Morton, Jet Chen, LKML

On Fri 30-05-14 09:58:14, Peter Hurley wrote:
> On 05/30/2014 09:11 AM, Sasha Levin wrote:
> >Hi all,
> >
> >I sometime see lockups when booting my KVM guest with the latest -next kernel,
> >it basically hangs right when it should start 'init', and after a while I get
> >the following spew:
> >
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> 
> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
> from Jet Chen which was bisected to
> 
> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
> Author:     Jan Kara <jack@suse.cz>
> AuthorDate: Thu May 22 10:43:35 2014 +1000
> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> CommitDate: Thu May 22 10:43:35 2014 +1000
> 
>     printk: enable interrupts before calling console_trylock_for_printk()
>         We need interrupts disabled when calling console_trylock_for_printk() only
>     so that cpu id we pass to can_use_console() remains valid (for other
>     things console_sem provides all the exclusion we need and deadlocks on
>     console_sem due to interrupts are impossible because we use
>     down_trylock()).  However if we are rescheduled, we are guaranteed to run
>     on an online cpu so we can easily just get the cpu id in
>     can_use_console().
>         We can lose a bit of performance when we enable interrupts in
>     vprintk_emit() and then disable them again in console_unlock() but OTOH it
>     can somewhat reduce interrupt latency caused by console_unlock()
>     especially since later in the patch series we will want to spin on
>     console_sem in console_trylock_for_printk().
>         Signed-off-by: Jan Kara <jack@suse.cz>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> ?
  Yeah, very likely. I think I see the problem, I'll send the fix shortly.

								Honza
 
> >[   30.770667] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[   30.770667] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.541585] kworker/dying (172) used greatest stack depth: 3656 bytes left
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.010589] INFO: task swapper/0:1 blocked for more than 200 seconds.
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] general protection fault: 0000 [#1]
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[ 2950.759251] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 3109.224481] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 3251.771402] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 3401.522129] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 3568.564422] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 3754.638392] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 3941.597288] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 4114.243312] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 4287.428963] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  320.551104] BUG: spinlock lockup suspected on CPU#33, kworker/dying/172
> >[ 4458.566115] BUG: spinlock lockup suspected on CPU#32, swapper/32/0
> >[  408.020558] BUG: spinlock lockup suspected on CPU#43, khungtaskd/3760
> >[   30.790833] PANIC: double fault, error_code: 0x0
> >[  320.551104]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
> >[  408.020558]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: 33
> >[  321.761422]  lock: console_sem+0x0/0x60, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
> >[  320.551104] CPU: 33 PID: 172 Comm: kworker/dying Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> >[  320.551104]  ffffffff8e087060 ffff88012a800b38 ffffffff8c4e92e3 0000000000000001
> >[   30.790833]  ffff88007bdd3000
> >[   30.790833] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> >[   30.790833]  ffff88012a800b58
> >[   30.790833] task: ffff880a2b3f0000 ti: ffff88007c1d0000 task.ti: ffff88007c1d0000
> >[   30.790833]  ffffffff8c4dc9fe
> >[  320.551104] RIP:  (??:?)
> >[  320.551104]  ffffffff8e087060
> >[   30.790833]  0000000086c41770
> >[   30.790833] string.isra.5 (lib/vsprintf.c:506)
> >[   30.790833]  ffff88012a800b88
> >[   30.790833] RSP: 0000:ffff88012ba00000  EFLAGS: 00010092
> >[   30.790833]  ffffffff891d08b4
> >[   30.790833] RAX: ffffffff8d6ee711 RBX: ffffffff906949f0 RCX: 0000000000000000
> >[   30.790833]  ffffffff8e087078
> >[   30.790833] RDX: ffffffff8d6eb246 RSI: ffffffffffffffff RDI: ffffffff906949f0
> >[  320.551104]
> >[   30.790833] RBP: ffff88012ba00018 R08: 000000000000ffff R09: 000000000000ffff
> >[  320.551104] Call Trace:
> >[   30.790833] R10: 73203a4755423001 R11: 206b636f6c6e6970 R12: ffffffff90694dc0
> >[   30.790833] R13: ffffffff90694dc0 R14: 000000000000ffff R15: 0000000000000000
> >[   30.790833]  <UNK>
> >[   30.790833] FS:  0000000000000000(0000) GS:ffff88012ba00000(0000) knlGS:0000000000000000
> >[  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> >[   30.790833] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >[  320.551104] sending NMI to all CPUs:
> >[   30.790833] CR2: ffff88012b9ffff8 CR3: 000000000e02c000 CR4: 00000000000006a0
> >[  321.761422] BUG: spinlock lockup suspected on CPU#0, kworker/u101:1/484
> >[  321.761422]  lock: logbuf_lock+0x0/0x48, .magic: dead4ead, .owner: kworker/u101:1/484, .owner_cpu: -1
> >[  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> >[  321.761422] Workqueue: events_power_efficient fb_flashcursor
> >[  321.761422]  ffffffff8e087000 ffff88007da06c78 ffffffff8c4e92e3 0000000000000000
> >[  321.761422]  ffff88007bdd3000 ffff88007da06c98 ffffffff8c4dc9fe ffffffff8e087000
> >[  321.761422]  0000000086c41770 ffff88007da06cc8 ffffffff891d08b4 ffffffff8e087018
> >[  321.761422] Call Trace:
> >[  321.761422] <NMI> dump_stack (lib/dump_stack.c:52)
> >[  321.761422] spin_dump (kernel/locking/spinlock_debug.c:68 (discriminator 6))
> >[  321.761422] do_raw_spin_lock (include/linux/nmi.h:35 kernel/locking/spinlock_debug.c:119 kernel/locking/spinlock_debug.c:137)
> >[  321.761422] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
> >[  321.761422] ? vprintk_emit (kernel/printk/printk.c:1620)
> >[  321.761422] vprintk_emit (kernel/printk/printk.c:1620)
> >[  321.761422] printk (kernel/printk/printk.c:1806)
> >[  321.761422] arch_trigger_all_cpu_backtrace_handler (arch/x86/kernel/apic/hw_nmi.c:85)
> >[  321.761422] nmi_handle (arch/x86/kernel/nmi.c:133)
> >[  321.761422] ? nmi_handle (arch/x86/kernel/nmi.c:114)
> >[  321.761422] default_do_nmi (arch/x86/kernel/nmi.c:336)
> >[  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
> >[  321.761422] do_nmi (./arch/x86/include/asm/current.h:14 arch/x86/kernel/nmi.c:540)
> >[  321.761422] end_repeat_nmi (arch/x86/kernel/entry_64.S:1701)
> >[  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
> >[  321.761422] ? get_color.isra.12 (drivers/video/console/fbcon.c:369)
> >[  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
> >[  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
> >[  321.761422] ? get_symbol_pos (kernel/kallsyms.c:230)
> >[  321.761422]  <<EOE>>  <UNK>
> >[  321.761422] NMI backtrace for cpu 0
> >[  321.761422] CPU: 0 PID: 484 Comm: kworker/u101:1 Tainted: G        W     3.15.0-rc7-next-20140529-sasha-00017-gd239c90-dirty #550
> >[  321.761422] Workqueue: events_power_efficient fb_flashcursor
> >[  321.761422] task: ffff88007bdd3000 ti: ffff88007bde8000 task.ti: ffff88007bde8000
> >[  321.761422] RIP: get_symbol_pos (kernel/kallsyms.c:230)
> >[  321.761422] RSP: 0000:ffff88007bde5c78  EFLAGS: 00000006
> >[  321.761422] RAX: 0000000000008c7c RBX: ffff88007bde5cd0 RCX: 0000000000009831
> >[  321.761422] RDX: ffff88007bde5cd8 RSI: ffff88007bde5ce0 RDI: ffffffff89bba9c0
> >[  321.761422] RBP: ffff88007bde5c78 R08: 000000000000a3e6 R09: 000000000000ffff
> >[  321.761422] R10: 000000000005da8d R11: 3a6d6d6f43203438 R12: ffff88007bde5d41
> >[  321.761422] R13: ffffffff89bba9c0 R14: 0000000000000000 R15: ffffffff8d6e8c47
> >[  321.761422] FS:  0000000000000000(0000) GS:ffff88007da00000(0000) knlGS:0000000000000000
> >[  321.761422] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >[  321.761422] CR2: 00000000ffffffff CR3: 000000000e02c000 CR4: 00000000000006b0
> >[  321.761422] Stack:
> >[  321.761422]  ffff88007bde5cb8 ffffffff8920eb4e 70202c786c383025 ffff88007bde5e31
> >[  321.761422]  ffff8800fbde5e2b ffff88007bde5d41 ffffffff89bba9c0 0000000000000000
> >[  321.761422]  ffff88007bde5d08 ffffffff8920ec42 522829676e6fffff 296e656c3e2d4345
> >[  321.761422] Call Trace:
> >[  321.761422]  <UNK>
> >[ 321.761422] Code: 30 0e f7 8c 55 48 85 c0 48 89 e5 75 02 0f 0b 4c 8b 15 ca fc 04 04 31 c0 4d 89 d0 eb 16 48 d1 e9 48 01 c1 48 39 3c cd 30 0e f7 8c <4c> 0f 47 c1 48 0f 46 c1 4c 89 c1 48 29 c1 48 83 f9 01 77 de 48
> >All code
> >========
> >    0:	30 0e                	xor    %cl,(%rsi)
> >    2:	f7                   	(bad)
> >    3:	8c 55 48             	mov    %ss,0x48(%rbp)
> >    6:	85 c0                	test   %eax,%eax
> >    8:	48 89 e5             	mov    %rsp,%rbp
> >    b:	75 02                	jne    0xf
> >    d:	0f 0b                	ud2
> >    f:	4c 8b 15 ca fc 04 04 	mov    0x404fcca(%rip),%r10        # 0x404fce0
> >   16:	31 c0                	xor    %eax,%eax
> >   18:	4d 89 d0             	mov    %r10,%r8
> >   1b:	eb 16                	jmp    0x33
> >   1d:	48 d1 e9             	shr    %rcx
> >   20:	48 01 c1             	add    %rax,%rcx
> >   23:	48 39 3c cd 30 0e f7 	cmp    %rdi,-0x7308f1d0(,%rcx,8)
> >   2a:	8c
> >   2b:*	4c 0f 47 c1          	cmova  %rcx,%r8		<-- trapping instruction
> >   2f:	48 0f 46 c1          	cmovbe %rcx,%rax
> >   33:	4c 89 c1             	mov    %r8,%rcx
> >   36:	48 29 c1             	sub    %rax,%rcx
> >   39:	48 83 f9 01          	cmp    $0x1,%rcx
> >   3d:	77 de                	ja     0x1d
> >   3f:	48                   	rex.W
> >	...
> >
> >Code starting with the faulting instruction
> >===========================================
> >    0:	4c 0f 47 c1          	cmova  %rcx,%r8
> >    4:	48 0f 46 c1          	cmovbe %rcx,%rax
> >    8:	4c 89 c1             	mov    %r8,%rcx
> >    b:	48 29 c1             	sub    %rax,%rcx
> >    e:	48 83 f9 01          	cmp    $0x1,%rcx
> >   12:	77 de                	ja     0xfffffffffffffff2
> >   14:	48                   	rex.W
> >	...
> >[   30.790833] NMI backtrace for cpu 1
> >[hang]
> >
> >
> >Thanks,
> >Sasha
> >--
> >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >the body of a message to majordomo@vger.kernel.org
> >More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >Please read the FAQ at  http://www.tux.org/lkml/
> >
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-05-30 14:07   ` Jan Kara
@ 2014-06-06 19:05     ` Sasha Levin
  2014-06-10 15:59       ` Peter Hurley
  0 siblings, 1 reply; 17+ messages in thread
From: Sasha Levin @ 2014-06-06 19:05 UTC (permalink / raw)
  To: Jan Kara, Peter Hurley; +Cc: pmladek, Andrew Morton, Jet Chen, LKML

On 05/30/2014 10:07 AM, Jan Kara wrote:
> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
>> > On 05/30/2014 09:11 AM, Sasha Levin wrote:
>>> > >Hi all,
>>> > >
>>> > >I sometime see lockups when booting my KVM guest with the latest -next kernel,
>>> > >it basically hangs right when it should start 'init', and after a while I get
>>> > >the following spew:
>>> > >
>>> > >[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
>> > 
>> > Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
>> > from Jet Chen which was bisected to
>> > 
>> > commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
>> > Author:     Jan Kara <jack@suse.cz>
>> > AuthorDate: Thu May 22 10:43:35 2014 +1000
>> > Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
>> > CommitDate: Thu May 22 10:43:35 2014 +1000
>> > 
>> >     printk: enable interrupts before calling console_trylock_for_printk()
>> >         We need interrupts disabled when calling console_trylock_for_printk() only
>> >     so that cpu id we pass to can_use_console() remains valid (for other
>> >     things console_sem provides all the exclusion we need and deadlocks on
>> >     console_sem due to interrupts are impossible because we use
>> >     down_trylock()).  However if we are rescheduled, we are guaranteed to run
>> >     on an online cpu so we can easily just get the cpu id in
>> >     can_use_console().
>> >         We can lose a bit of performance when we enable interrupts in
>> >     vprintk_emit() and then disable them again in console_unlock() but OTOH it
>> >     can somewhat reduce interrupt latency caused by console_unlock()
>> >     especially since later in the patch series we will want to spin on
>> >     console_sem in console_trylock_for_printk().
>> >         Signed-off-by: Jan Kara <jack@suse.cz>
>> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>> > 
>> > ?
>   Yeah, very likely. I think I see the problem, I'll send the fix shortly.

Hi Jan,

It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
lockup suspected on CPU#0, swapper/1".

Is there anything else I could try here? The issue is very common during testing.


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-06 19:05     ` Sasha Levin
@ 2014-06-10 15:59       ` Peter Hurley
  2014-06-11 14:55         ` Sasha Levin
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Hurley @ 2014-06-10 15:59 UTC (permalink / raw)
  To: Sasha Levin, Jan Kara; +Cc: pmladek, Andrew Morton, Jet Chen, LKML

On 06/06/2014 03:05 PM, Sasha Levin wrote:
> On 05/30/2014 10:07 AM, Jan Kara wrote:
>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
>>>>>> it basically hangs right when it should start 'init', and after a while I get
>>>>>> the following spew:
>>>>>>
>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
>>>>
>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
>>>> from Jet Chen which was bisected to
>>>>
>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
>>>> Author:     Jan Kara <jack@suse.cz>
>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
>>>>
>>>>      printk: enable interrupts before calling console_trylock_for_printk()
>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
>>>>      so that cpu id we pass to can_use_console() remains valid (for other
>>>>      things console_sem provides all the exclusion we need and deadlocks on
>>>>      console_sem due to interrupts are impossible because we use
>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
>>>>      on an online cpu so we can easily just get the cpu id in
>>>>      can_use_console().
>>>>          We can lose a bit of performance when we enable interrupts in
>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
>>>>      can somewhat reduce interrupt latency caused by console_unlock()
>>>>      especially since later in the patch series we will want to spin on
>>>>      console_sem in console_trylock_for_printk().
>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>
>>>> ?
>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
>
> Hi Jan,
>
> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
> lockup suspected on CPU#0, swapper/1".
>
> Is there anything else I could try here? The issue is very common during testing.

Sasha,

Is this bisectable? Maybe that's the best way forward here.

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-10 15:59       ` Peter Hurley
@ 2014-06-11 14:55         ` Sasha Levin
  2014-06-11 15:34           ` Peter Hurley
                             ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Sasha Levin @ 2014-06-11 14:55 UTC (permalink / raw)
  To: Peter Hurley, Jan Kara
  Cc: pmladek, Andrew Morton, Jet Chen, LKML, Linus Torvalds

On 06/10/2014 11:59 AM, Peter Hurley wrote:
> On 06/06/2014 03:05 PM, Sasha Levin wrote:
>> On 05/30/2014 10:07 AM, Jan Kara wrote:
>>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
>>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
>>>>>>> it basically hangs right when it should start 'init', and after a while I get
>>>>>>> the following spew:
>>>>>>>
>>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
>>>>>
>>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
>>>>> from Jet Chen which was bisected to
>>>>>
>>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
>>>>> Author:     Jan Kara <jack@suse.cz>
>>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
>>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
>>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
>>>>>
>>>>>      printk: enable interrupts before calling console_trylock_for_printk()
>>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
>>>>>      so that cpu id we pass to can_use_console() remains valid (for other
>>>>>      things console_sem provides all the exclusion we need and deadlocks on
>>>>>      console_sem due to interrupts are impossible because we use
>>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
>>>>>      on an online cpu so we can easily just get the cpu id in
>>>>>      can_use_console().
>>>>>          We can lose a bit of performance when we enable interrupts in
>>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
>>>>>      can somewhat reduce interrupt latency caused by console_unlock()
>>>>>      especially since later in the patch series we will want to spin on
>>>>>      console_sem in console_trylock_for_printk().
>>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
>>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>>
>>>>> ?
>>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
>>
>> Hi Jan,
>>
>> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
>> lockup suspected on CPU#0, swapper/1".
>>
>> Is there anything else I could try here? The issue is very common during testing.
> 
> Sasha,
> 
> Is this bisectable? Maybe that's the best way forward here.

I've ran a bisection again and ended up at the same commit as Jet Chen (the commit
unfortunately already made it to Linus's tree).

Note that I did try Jan's proposed fix and that didn't solve the issue for me, I
believe we're seeing different issues caused by the same commit.


939f04bec1a4ef6ba4370b0f34b01decc844b1b1 is the first bad commit
commit 939f04bec1a4ef6ba4370b0f34b01decc844b1b1
Author: Jan Kara <jack@suse.cz>
Date:   Wed Jun 4 16:11:37 2014 -0700

    printk: enable interrupts before calling console_trylock_for_printk()

    We need interrupts disabled when calling console_trylock_for_printk()
    only so that cpu id we pass to can_use_console() remains valid (for
    other things console_sem provides all the exclusion we need and
    deadlocks on console_sem due to interrupts are impossible because we use
    down_trylock()).  However if we are rescheduled, we are guaranteed to
    run on an online cpu so we can easily just get the cpu id in
    can_use_console().

    We can lose a bit of performance when we enable interrupts in
    vprintk_emit() and then disable them again in console_unlock() but OTOH
    it can somewhat reduce interrupt latency caused by console_unlock()
    especially since later in the patch series we will want to spin on
    console_sem in console_trylock_for_printk().

    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 14:55         ` Sasha Levin
@ 2014-06-11 15:34           ` Peter Hurley
  2014-06-11 20:31             ` Jan Kara
  2014-06-11 17:38           ` Linus Torvalds
  2014-06-11 20:34           ` Jan Kara
  2 siblings, 1 reply; 17+ messages in thread
From: Peter Hurley @ 2014-06-11 15:34 UTC (permalink / raw)
  To: Sasha Levin, Jan Kara
  Cc: pmladek, Andrew Morton, Jet Chen, LKML, Linus Torvalds

On 06/11/2014 10:55 AM, Sasha Levin wrote:
> On 06/10/2014 11:59 AM, Peter Hurley wrote:
>> On 06/06/2014 03:05 PM, Sasha Levin wrote:
>>> On 05/30/2014 10:07 AM, Jan Kara wrote:
>>>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
>>>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
>>>>>>>> it basically hangs right when it should start 'init', and after a while I get
>>>>>>>> the following spew:
>>>>>>>>
>>>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
>>>>>>
>>>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
>>>>>> from Jet Chen which was bisected to
>>>>>>
>>>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
>>>>>> Author:     Jan Kara <jack@suse.cz>
>>>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
>>>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
>>>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
>>>>>>
>>>>>>       printk: enable interrupts before calling console_trylock_for_printk()
>>>>>>           We need interrupts disabled when calling console_trylock_for_printk() only
>>>>>>       so that cpu id we pass to can_use_console() remains valid (for other
>>>>>>       things console_sem provides all the exclusion we need and deadlocks on
>>>>>>       console_sem due to interrupts are impossible because we use
>>>>>>       down_trylock()).  However if we are rescheduled, we are guaranteed to run
>>>>>>       on an online cpu so we can easily just get the cpu id in
>>>>>>       can_use_console().
>>>>>>           We can lose a bit of performance when we enable interrupts in
>>>>>>       vprintk_emit() and then disable them again in console_unlock() but OTOH it
>>>>>>       can somewhat reduce interrupt latency caused by console_unlock()
>>>>>>       especially since later in the patch series we will want to spin on
>>>>>>       console_sem in console_trylock_for_printk().
>>>>>>           Signed-off-by: Jan Kara <jack@suse.cz>
>>>>>>       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>>>
>>>>>> ?
>>>>     Yeah, very likely. I think I see the problem, I'll send the fix shortly.
>>>
>>> Hi Jan,
>>>
>>> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
>>> lockup suspected on CPU#0, swapper/1".
>>>
>>> Is there anything else I could try here? The issue is very common during testing.
>>
>> Sasha,
>>
>> Is this bisectable? Maybe that's the best way forward here.
>
> I've ran a bisection again and ended up at the same commit as Jet Chen (the commit
> unfortunately already made it to Linus's tree).
>
> Note that I did try Jan's proposed fix and that didn't solve the issue for me, I
> believe we're seeing different issues caused by the same commit.
>
>
> 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 is the first bad commit
> commit 939f04bec1a4ef6ba4370b0f34b01decc844b1b1
> Author: Jan Kara <jack@suse.cz>
> Date:   Wed Jun 4 16:11:37 2014 -0700
>
>      printk: enable interrupts before calling console_trylock_for_printk()
>
>      We need interrupts disabled when calling console_trylock_for_printk()
>      only so that cpu id we pass to can_use_console() remains valid (for
>      other things console_sem provides all the exclusion we need and
>      deadlocks on console_sem due to interrupts are impossible because we use
>      down_trylock()).  However if we are rescheduled, we are guaranteed to
>      run on an online cpu so we can easily just get the cpu id in
>      can_use_console().
>
>      We can lose a bit of performance when we enable interrupts in
>      vprintk_emit() and then disable them again in console_unlock() but OTOH
>      it can somewhat reduce interrupt latency caused by console_unlock()
>      especially since later in the patch series we will want to spin on
>      console_sem in console_trylock_for_printk().
>
>      Signed-off-by: Jan Kara <jack@suse.cz>
>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

I apologize; I didn't look at the patch very closely, but now that I do, this
sticks out:

@@ -1597,17 +1599,22 @@ asmlinkage int vprintk_emit(int facility, int level,

  	logbuf_cpu = UINT_MAX;
  	raw_spin_unlock(&logbuf_lock);
+	lockdep_on();
+	local_irq_restore(flags);
+

What prevents cpu migration right here?
If nothing, then logbuf_cpu is now stale and the recursion test at
the top of vprintk_emit is doing nothing to prevent recursion.


+	/*
+	 * Disable preemption to avoid being preempted while holding
+	 * console_sem which would prevent anyone from printing to console
+	 */
+	preempt_disable();
  	/*
  	 * Try to acquire and then immediately release the console semaphore.
  	 * The release will print out buffers and wake up /dev/kmsg and syslog()
  	 * users.
  	 */
-	if (console_trylock_for_printk(this_cpu))
+	if (console_trylock_for_printk())
  		console_unlock();

Regards,
Peter Hurley

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 14:55         ` Sasha Levin
  2014-06-11 15:34           ` Peter Hurley
@ 2014-06-11 17:38           ` Linus Torvalds
  2014-06-11 17:44             ` Linus Torvalds
  2014-06-11 20:34           ` Jan Kara
  2 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2014-06-11 17:38 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Hurley, Jan Kara, Petr Mladek, Andrew Morton, Jet Chen, LKML

On Wed, Jun 11, 2014 at 7:55 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
>
> I've ran a bisection again and ended up at the same commit as Jet Chen (the commit
> unfortunately already made it to Linus's tree).

Ok, I'm going to revert that commit.  The games it plays with cpu
numbers and preemption make me worried, and it looks silly.

If the only reason for that patch is interrupt latency, then if the
"preempt_disable()" had been done *before* restoring irq's, none of
those "let's reload the CPU number" games would have been necessary,
and the patch could have been smaller. So even if that is the solution
to the problems people report, it's actually *better* to just revert
the patch entirely, and try again with a smaller and simpler version,
rather than try to change the patch after-the-fact.

          Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 17:38           ` Linus Torvalds
@ 2014-06-11 17:44             ` Linus Torvalds
  0 siblings, 0 replies; 17+ messages in thread
From: Linus Torvalds @ 2014-06-11 17:44 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Hurley, Jan Kara, Petr Mladek, Andrew Morton, Jet Chen, LKML

On Wed, Jun 11, 2014 at 10:38 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Ok, I'm going to revert that commit.  The games it plays with cpu
> numbers and preemption make me worried, and it looks silly.
>
> If the only reason for that patch is interrupt latency, then if the
> "preempt_disable()" had been done *before* restoring irq's, none of
> those "let's reload the CPU number" games would have been necessary,
> and the patch could have been smaller.

Ugh. It doesn't revert cleanly, so I guess I'm not just reverting it
after all. Jan?

               Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 15:34           ` Peter Hurley
@ 2014-06-11 20:31             ` Jan Kara
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kara @ 2014-06-11 20:31 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Sasha Levin, Jan Kara, pmladek, Andrew Morton, Jet Chen, LKML,
	Linus Torvalds

On Wed 11-06-14 11:34:28, Peter Hurley wrote:
> On 06/11/2014 10:55 AM, Sasha Levin wrote:
> >On 06/10/2014 11:59 AM, Peter Hurley wrote:
> >>On 06/06/2014 03:05 PM, Sasha Levin wrote:
> >>>On 05/30/2014 10:07 AM, Jan Kara wrote:
> >>>>On Fri 30-05-14 09:58:14, Peter Hurley wrote:
> >>>>>>On 05/30/2014 09:11 AM, Sasha Levin wrote:
> >>>>>>>>Hi all,
> >>>>>>>>
> >>>>>>>>I sometime see lockups when booting my KVM guest with the latest -next kernel,
> >>>>>>>>it basically hangs right when it should start 'init', and after a while I get
> >>>>>>>>the following spew:
> >>>>>>>>
> >>>>>>>>[   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >>>>>>
> >>>>>>Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
> >>>>>>from Jet Chen which was bisected to
> >>>>>>
> >>>>>>commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
> >>>>>>Author:     Jan Kara <jack@suse.cz>
> >>>>>>AuthorDate: Thu May 22 10:43:35 2014 +1000
> >>>>>>Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> >>>>>>CommitDate: Thu May 22 10:43:35 2014 +1000
> >>>>>>
> >>>>>>      printk: enable interrupts before calling console_trylock_for_printk()
> >>>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
> >>>>>>      so that cpu id we pass to can_use_console() remains valid (for other
> >>>>>>      things console_sem provides all the exclusion we need and deadlocks on
> >>>>>>      console_sem due to interrupts are impossible because we use
> >>>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
> >>>>>>      on an online cpu so we can easily just get the cpu id in
> >>>>>>      can_use_console().
> >>>>>>          We can lose a bit of performance when we enable interrupts in
> >>>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
> >>>>>>      can somewhat reduce interrupt latency caused by console_unlock()
> >>>>>>      especially since later in the patch series we will want to spin on
> >>>>>>      console_sem in console_trylock_for_printk().
> >>>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
> >>>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>>>>>
> >>>>>>?
> >>>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
> >>>
> >>>Hi Jan,
> >>>
> >>>It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
> >>>lockup suspected on CPU#0, swapper/1".
> >>>
> >>>Is there anything else I could try here? The issue is very common during testing.
> >>
> >>Sasha,
> >>
> >>Is this bisectable? Maybe that's the best way forward here.
> >
> >I've ran a bisection again and ended up at the same commit as Jet Chen (the commit
> >unfortunately already made it to Linus's tree).
> >
> >Note that I did try Jan's proposed fix and that didn't solve the issue for me, I
> >believe we're seeing different issues caused by the same commit.
> >
> >
> >939f04bec1a4ef6ba4370b0f34b01decc844b1b1 is the first bad commit
> >commit 939f04bec1a4ef6ba4370b0f34b01decc844b1b1
> >Author: Jan Kara <jack@suse.cz>
> >Date:   Wed Jun 4 16:11:37 2014 -0700
> >
> >     printk: enable interrupts before calling console_trylock_for_printk()
> >
> >     We need interrupts disabled when calling console_trylock_for_printk()
> >     only so that cpu id we pass to can_use_console() remains valid (for
> >     other things console_sem provides all the exclusion we need and
> >     deadlocks on console_sem due to interrupts are impossible because we use
> >     down_trylock()).  However if we are rescheduled, we are guaranteed to
> >     run on an online cpu so we can easily just get the cpu id in
> >     can_use_console().
> >
> >     We can lose a bit of performance when we enable interrupts in
> >     vprintk_emit() and then disable them again in console_unlock() but OTOH
> >     it can somewhat reduce interrupt latency caused by console_unlock()
> >     especially since later in the patch series we will want to spin on
> >     console_sem in console_trylock_for_printk().
> >
> >     Signed-off-by: Jan Kara <jack@suse.cz>
> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> I apologize; I didn't look at the patch very closely, but now that I do,
> this sticks out:
> 
> @@ -1597,17 +1599,22 @@ asmlinkage int vprintk_emit(int facility, int level,
> 
>  	logbuf_cpu = UINT_MAX;
>  	raw_spin_unlock(&logbuf_lock);
> +	lockdep_on();
> +	local_irq_restore(flags);
> +
> 
> What prevents cpu migration right here?
  Nothing.

> If nothing, then logbuf_cpu is now stale and the recursion test at
> the top of vprintk_emit is doing nothing to prevent recursion.
  Well, note that logbuf_cpu has just been set to UINT_MAX (i.e. undefined)
two lines above. So my patch changes nothing wrt. how printk recursion
detection works (at least AFAICT).
 
> +	/*
> +	 * Disable preemption to avoid being preempted while holding
> +	 * console_sem which would prevent anyone from printing to console
> +	 */
> +	preempt_disable();
>  	/*
>  	 * Try to acquire and then immediately release the console semaphore.
>  	 * The release will print out buffers and wake up /dev/kmsg and syslog()
>  	 * users.
>  	 */
> -	if (console_trylock_for_printk(this_cpu))
> +	if (console_trylock_for_printk())
>  		console_unlock();

							Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 14:55         ` Sasha Levin
  2014-06-11 15:34           ` Peter Hurley
  2014-06-11 17:38           ` Linus Torvalds
@ 2014-06-11 20:34           ` Jan Kara
  2014-06-11 21:31             ` Jan Kara
  2 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2014-06-11 20:34 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Hurley, Jan Kara, pmladek, Andrew Morton, Jet Chen, LKML,
	Linus Torvalds

On Wed 11-06-14 10:55:55, Sasha Levin wrote:
> On 06/10/2014 11:59 AM, Peter Hurley wrote:
> > On 06/06/2014 03:05 PM, Sasha Levin wrote:
> >> On 05/30/2014 10:07 AM, Jan Kara wrote:
> >>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
> >>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
> >>>>>>> it basically hangs right when it should start 'init', and after a while I get
> >>>>>>> the following spew:
> >>>>>>>
> >>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >>>>>
> >>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
> >>>>> from Jet Chen which was bisected to
> >>>>>
> >>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
> >>>>> Author:     Jan Kara <jack@suse.cz>
> >>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
> >>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> >>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
> >>>>>
> >>>>>      printk: enable interrupts before calling console_trylock_for_printk()
> >>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
> >>>>>      so that cpu id we pass to can_use_console() remains valid (for other
> >>>>>      things console_sem provides all the exclusion we need and deadlocks on
> >>>>>      console_sem due to interrupts are impossible because we use
> >>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
> >>>>>      on an online cpu so we can easily just get the cpu id in
> >>>>>      can_use_console().
> >>>>>          We can lose a bit of performance when we enable interrupts in
> >>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
> >>>>>      can somewhat reduce interrupt latency caused by console_unlock()
> >>>>>      especially since later in the patch series we will want to spin on
> >>>>>      console_sem in console_trylock_for_printk().
> >>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
> >>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>>>>
> >>>>> ?
> >>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
> >>
> >> Hi Jan,
> >>
> >> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
> >> lockup suspected on CPU#0, swapper/1".
> >>
> >> Is there anything else I could try here? The issue is very common during testing.
> > 
> > Sasha,
> > 
> > Is this bisectable? Maybe that's the best way forward here.
> 
> I've ran a bisection again and ended up at the same commit as Jet Chen
> (the commit unfortunately already made it to Linus's tree).
> 
> Note that I did try Jan's proposed fix and that didn't solve the issue
> for me, I believe we're seeing different issues caused by the same
> commit.
  Sorry it has been busy time lately and I didn't have as much time to look
into this as would be needed.
> 
> 
> 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 is the first bad commit
> commit 939f04bec1a4ef6ba4370b0f34b01decc844b1b1
> Author: Jan Kara <jack@suse.cz>
> Date:   Wed Jun 4 16:11:37 2014 -0700
> 
>     printk: enable interrupts before calling console_trylock_for_printk()
> 
>     We need interrupts disabled when calling console_trylock_for_printk()
>     only so that cpu id we pass to can_use_console() remains valid (for
>     other things console_sem provides all the exclusion we need and
>     deadlocks on console_sem due to interrupts are impossible because we use
>     down_trylock()).  However if we are rescheduled, we are guaranteed to
>     run on an online cpu so we can easily just get the cpu id in
>     can_use_console().
> 
>     We can lose a bit of performance when we enable interrupts in
>     vprintk_emit() and then disable them again in console_unlock() but OTOH
>     it can somewhat reduce interrupt latency caused by console_unlock()
>     especially since later in the patch series we will want to spin on
>     console_sem in console_trylock_for_printk().
> 
>     Signed-off-by: Jan Kara <jack@suse.cz>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> 
> Thanks,
> Sasha
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 20:34           ` Jan Kara
@ 2014-06-11 21:31             ` Jan Kara
  2014-06-12  3:07               ` Sasha Levin
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2014-06-11 21:31 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Hurley, Jan Kara, pmladek, Andrew Morton, Jet Chen, LKML,
	Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 4109 bytes --]

On Wed 11-06-14 22:34:36, Jan Kara wrote:
> On Wed 11-06-14 10:55:55, Sasha Levin wrote:
> > On 06/10/2014 11:59 AM, Peter Hurley wrote:
> > > On 06/06/2014 03:05 PM, Sasha Levin wrote:
> > >> On 05/30/2014 10:07 AM, Jan Kara wrote:
> > >>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
> > >>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
> > >>>>>>> it basically hangs right when it should start 'init', and after a while I get
> > >>>>>>> the following spew:
> > >>>>>>>
> > >>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> > >>>>>
> > >>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
> > >>>>> from Jet Chen which was bisected to
> > >>>>>
> > >>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
> > >>>>> Author:     Jan Kara <jack@suse.cz>
> > >>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
> > >>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> > >>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
> > >>>>>
> > >>>>>      printk: enable interrupts before calling console_trylock_for_printk()
> > >>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
> > >>>>>      so that cpu id we pass to can_use_console() remains valid (for other
> > >>>>>      things console_sem provides all the exclusion we need and deadlocks on
> > >>>>>      console_sem due to interrupts are impossible because we use
> > >>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
> > >>>>>      on an online cpu so we can easily just get the cpu id in
> > >>>>>      can_use_console().
> > >>>>>          We can lose a bit of performance when we enable interrupts in
> > >>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
> > >>>>>      can somewhat reduce interrupt latency caused by console_unlock()
> > >>>>>      especially since later in the patch series we will want to spin on
> > >>>>>      console_sem in console_trylock_for_printk().
> > >>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
> > >>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > >>>>>
> > >>>>> ?
> > >>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
> > >>
> > >> Hi Jan,
> > >>
> > >> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
> > >> lockup suspected on CPU#0, swapper/1".
> > >>
> > >> Is there anything else I could try here? The issue is very common during testing.
> > > 
> > > Sasha,
> > > 
> > > Is this bisectable? Maybe that's the best way forward here.
> > 
> > I've ran a bisection again and ended up at the same commit as Jet Chen
> > (the commit unfortunately already made it to Linus's tree).
> > 
> > Note that I did try Jan's proposed fix and that didn't solve the issue
> > for me, I believe we're seeing different issues caused by the same
> > commit.
>   Sorry it has been busy time lately and I didn't have as much time to look
> into this as would be needed.
  Oops, pressed send too early... So I have two debug patches for you. Can
you try whether the problem reproduces with the first one or with both of
them applied?

The stacktrace from spinlock lockup is somewhat weird - the spinlock lockup
message comes from NMI context where we tried to do print stack dump.
Indeed calling printk from NMI context *can* lead to this sort of lockups
(and the current process is a worker thread which apparently runs
fb_flashcursor() function which calls console_unlock() which can hold
logbuf_lock so all these things fit together).  What I really fail to see
is how my patch makes the problem happen to you pretty reliably.

Another somewhat strange thing to me is that although lock->owner in your
dump looks fine and points to currently running process, lock->owner_cpu is
-1 which is a combination that shouldn't ever happen as I'm looking into
spinlock debug code.

So for now I'm still puzzled.
								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-printk-Debug-patch1.patch --]
[-- Type: text/x-patch, Size: 1001 bytes --]

>From bf739a5af7a73b08b3dea0491d38816f686dbf57 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 11 Jun 2014 22:33:23 +0200
Subject: [PATCH 1/2] printk: Debug patch1

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5ea6b148a1a6..da94dc18b6fa 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1649,6 +1649,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * console_sem which would prevent anyone from printing to console
 	 */
 	preempt_disable();
+	lockdep_off();
 	/*
 	 * Try to acquire and then immediately release the console semaphore.
 	 * The release will print out buffers and wake up /dev/kmsg and syslog()
@@ -1656,6 +1657,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 */
 	if (console_trylock_for_printk())
 		console_unlock();
+	lockdep_on();
 	preempt_enable();
 
 	return printed_len;
-- 
1.8.1.4


[-- Attachment #3: 0002-printk-Debug-patch-2.patch --]
[-- Type: text/x-patch, Size: 1191 bytes --]

>From 976ab0ea1d855686d8166b9c08239aa50e1c983b Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 11 Jun 2014 22:38:41 +0200
Subject: [PATCH 2/2] printk: Debug patch 2

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk/printk.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index da94dc18b6fa..7263c4353150 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1638,17 +1638,19 @@ asmlinkage int vprintk_emit(int facility, int level,
 	logbuf_cpu = UINT_MAX;
 	raw_spin_unlock(&logbuf_lock);
 	lockdep_on();
-	local_irq_restore(flags);
-
-	/* If called from the scheduler, we can not call up(). */
-	if (in_sched)
-		return printed_len;
-
 	/*
 	 * Disable preemption to avoid being preempted while holding
 	 * console_sem which would prevent anyone from printing to console
 	 */
 	preempt_disable();
+	local_irq_restore(flags);
+
+	/* If called from the scheduler, we can not call up(). */
+	if (in_sched) {
+		preempt_enable();
+		return printed_len;
+	}
+
 	lockdep_off();
 	/*
 	 * Try to acquire and then immediately release the console semaphore.
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-11 21:31             ` Jan Kara
@ 2014-06-12  3:07               ` Sasha Levin
  2014-06-12  8:26                 ` Jan Kara
  2014-06-19 17:28                 ` Jan Kara
  0 siblings, 2 replies; 17+ messages in thread
From: Sasha Levin @ 2014-06-12  3:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: Peter Hurley, pmladek, Andrew Morton, Jet Chen, LKML, Linus Torvalds

On 06/11/2014 05:31 PM, Jan Kara wrote:
> On Wed 11-06-14 22:34:36, Jan Kara wrote:
>> > On Wed 11-06-14 10:55:55, Sasha Levin wrote:
>>> > > On 06/10/2014 11:59 AM, Peter Hurley wrote:
>>>> > > > On 06/06/2014 03:05 PM, Sasha Levin wrote:
>>>>> > > >> On 05/30/2014 10:07 AM, Jan Kara wrote:
>>>>>> > > >>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
>>>>>>>> > > >>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
>>>>>>>>>> > > >>>>>>> Hi all,
>>>>>>>>>> > > >>>>>>>
>>>>>>>>>> > > >>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
>>>>>>>>>> > > >>>>>>> it basically hangs right when it should start 'init', and after a while I get
>>>>>>>>>> > > >>>>>>> the following spew:
>>>>>>>>>> > > >>>>>>>
>>>>>>>>>> > > >>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
>>>>>>>> > > >>>>>
>>>>>>>> > > >>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
>>>>>>>> > > >>>>> from Jet Chen which was bisected to
>>>>>>>> > > >>>>>
>>>>>>>> > > >>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
>>>>>>>> > > >>>>> Author:     Jan Kara <jack@suse.cz>
>>>>>>>> > > >>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
>>>>>>>> > > >>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
>>>>>>>> > > >>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
>>>>>>>> > > >>>>>
>>>>>>>> > > >>>>>      printk: enable interrupts before calling console_trylock_for_printk()
>>>>>>>> > > >>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
>>>>>>>> > > >>>>>      so that cpu id we pass to can_use_console() remains valid (for other
>>>>>>>> > > >>>>>      things console_sem provides all the exclusion we need and deadlocks on
>>>>>>>> > > >>>>>      console_sem due to interrupts are impossible because we use
>>>>>>>> > > >>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
>>>>>>>> > > >>>>>      on an online cpu so we can easily just get the cpu id in
>>>>>>>> > > >>>>>      can_use_console().
>>>>>>>> > > >>>>>          We can lose a bit of performance when we enable interrupts in
>>>>>>>> > > >>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
>>>>>>>> > > >>>>>      can somewhat reduce interrupt latency caused by console_unlock()
>>>>>>>> > > >>>>>      especially since later in the patch series we will want to spin on
>>>>>>>> > > >>>>>      console_sem in console_trylock_for_printk().
>>>>>>>> > > >>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
>>>>>>>> > > >>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>>>>>>> > > >>>>>
>>>>>>>> > > >>>>> ?
>>>>>> > > >>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
>>>>> > > >>
>>>>> > > >> Hi Jan,
>>>>> > > >>
>>>>> > > >> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
>>>>> > > >> lockup suspected on CPU#0, swapper/1".
>>>>> > > >>
>>>>> > > >> Is there anything else I could try here? The issue is very common during testing.
>>>> > > > 
>>>> > > > Sasha,
>>>> > > > 
>>>> > > > Is this bisectable? Maybe that's the best way forward here.
>>> > > 
>>> > > I've ran a bisection again and ended up at the same commit as Jet Chen
>>> > > (the commit unfortunately already made it to Linus's tree).
>>> > > 
>>> > > Note that I did try Jan's proposed fix and that didn't solve the issue
>>> > > for me, I believe we're seeing different issues caused by the same
>>> > > commit.
>> >   Sorry it has been busy time lately and I didn't have as much time to look
>> > into this as would be needed.
>   Oops, pressed send too early... So I have two debug patches for you. Can
> you try whether the problem reproduces with the first one or with both of
> them applied?

The first patch fixed it (I assumed that there's no need to try the second).


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-12  3:07               ` Sasha Levin
@ 2014-06-12  8:26                 ` Jan Kara
  2014-06-12  8:54                   ` Mike Galbraith
  2014-06-19 17:28                 ` Jan Kara
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Kara @ 2014-06-12  8:26 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jan Kara, Peter Hurley, pmladek, Andrew Morton, Jet Chen, LKML,
	Linus Torvalds, Peter Zijlstra

On Wed 11-06-14 23:07:04, Sasha Levin wrote:
> On 06/11/2014 05:31 PM, Jan Kara wrote:
> > On Wed 11-06-14 22:34:36, Jan Kara wrote:
> >> > On Wed 11-06-14 10:55:55, Sasha Levin wrote:
> >>> > > On 06/10/2014 11:59 AM, Peter Hurley wrote:
> >>>> > > > On 06/06/2014 03:05 PM, Sasha Levin wrote:
> >>>>> > > >> On 05/30/2014 10:07 AM, Jan Kara wrote:
> >>>>>> > > >>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
> >>>>>>>> > > >>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
> >>>>>>>>>> > > >>>>>>> Hi all,
> >>>>>>>>>> > > >>>>>>>
> >>>>>>>>>> > > >>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
> >>>>>>>>>> > > >>>>>>> it basically hangs right when it should start 'init', and after a while I get
> >>>>>>>>>> > > >>>>>>> the following spew:
> >>>>>>>>>> > > >>>>>>>
> >>>>>>>>>> > > >>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
> >>>>>>>> > > >>>>> from Jet Chen which was bisected to
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
> >>>>>>>> > > >>>>> Author:     Jan Kara <jack@suse.cz>
> >>>>>>>> > > >>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
> >>>>>>>> > > >>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> >>>>>>>> > > >>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>>      printk: enable interrupts before calling console_trylock_for_printk()
> >>>>>>>> > > >>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
> >>>>>>>> > > >>>>>      so that cpu id we pass to can_use_console() remains valid (for other
> >>>>>>>> > > >>>>>      things console_sem provides all the exclusion we need and deadlocks on
> >>>>>>>> > > >>>>>      console_sem due to interrupts are impossible because we use
> >>>>>>>> > > >>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
> >>>>>>>> > > >>>>>      on an online cpu so we can easily just get the cpu id in
> >>>>>>>> > > >>>>>      can_use_console().
> >>>>>>>> > > >>>>>          We can lose a bit of performance when we enable interrupts in
> >>>>>>>> > > >>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
> >>>>>>>> > > >>>>>      can somewhat reduce interrupt latency caused by console_unlock()
> >>>>>>>> > > >>>>>      especially since later in the patch series we will want to spin on
> >>>>>>>> > > >>>>>      console_sem in console_trylock_for_printk().
> >>>>>>>> > > >>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
> >>>>>>>> > > >>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>> ?
> >>>>>> > > >>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
> >>>>> > > >>
> >>>>> > > >> Hi Jan,
> >>>>> > > >>
> >>>>> > > >> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
> >>>>> > > >> lockup suspected on CPU#0, swapper/1".
> >>>>> > > >>
> >>>>> > > >> Is there anything else I could try here? The issue is very common during testing.
> >>>> > > > 
> >>>> > > > Sasha,
> >>>> > > > 
> >>>> > > > Is this bisectable? Maybe that's the best way forward here.
> >>> > > 
> >>> > > I've ran a bisection again and ended up at the same commit as Jet Chen
> >>> > > (the commit unfortunately already made it to Linus's tree).
> >>> > > 
> >>> > > Note that I did try Jan's proposed fix and that didn't solve the issue
> >>> > > for me, I believe we're seeing different issues caused by the same
> >>> > > commit.
> >> >   Sorry it has been busy time lately and I didn't have as much time to look
> >> > into this as would be needed.
> >   Oops, pressed send too early... So I have two debug patches for you. Can
> > you try whether the problem reproduces with the first one or with both of
> > them applied?
> 
> The first patch fixed it (I assumed that there's no need to try the second).
  Good. So that shows that it is the increased lockdep coverage which is
causing problems - with my patch, lockdep is able to identify some problem
because console drivers are now called with lockdep enabled. But because
the problem was found in some difficult context, lockdep just hung the
machine when trying to report it... Sadly the stacktraces you posted don't
tell us what lockdep found.

Adding Peter Zijlstra to CC. Peter, any idea how lockdep could report
problems when holding logbuf_lock? One possibility would be to extend
logbuf_cpu recursion logic to every holder of logbuf_lock. That will at
least avoid the spinlock recursion killing the machine but we won't be able
to see what lockdep found...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-12  8:26                 ` Jan Kara
@ 2014-06-12  8:54                   ` Mike Galbraith
  2014-07-08 13:02                     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2014-06-12  8:54 UTC (permalink / raw)
  To: Jan Kara
  Cc: Sasha Levin, Peter Hurley, pmladek, Andrew Morton, Jet Chen,
	LKML, Linus Torvalds, Peter Zijlstra

On Thu, 2014-06-12 at 10:26 +0200, Jan Kara wrote: 
> On Wed 11-06-14 23:07:04, Sasha Levin wrote:

> > The first patch fixed it (I assumed that there's no need to try the second).
>   Good. So that shows that it is the increased lockdep coverage which is
> causing problems - with my patch, lockdep is able to identify some problem
> because console drivers are now called with lockdep enabled. But because
> the problem was found in some difficult context, lockdep just hung the
> machine when trying to report it... Sadly the stacktraces you posted don't
> tell us what lockdep found.
> 
> Adding Peter Zijlstra to CC. Peter, any idea how lockdep could report
> problems when holding logbuf_lock? One possibility would be to extend
> logbuf_cpu recursion logic to every holder of logbuf_lock. That will at
> least avoid the spinlock recursion killing the machine but we won't be able
> to see what lockdep found...

Could tell lockdep to use trace_printk().


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-12  3:07               ` Sasha Levin
  2014-06-12  8:26                 ` Jan Kara
@ 2014-06-19 17:28                 ` Jan Kara
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Kara @ 2014-06-19 17:28 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jan Kara, Peter Hurley, pmladek, Andrew Morton, Jet Chen, LKML,
	Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 4372 bytes --]

On Wed 11-06-14 23:07:04, Sasha Levin wrote:
> On 06/11/2014 05:31 PM, Jan Kara wrote:
> > On Wed 11-06-14 22:34:36, Jan Kara wrote:
> >> > On Wed 11-06-14 10:55:55, Sasha Levin wrote:
> >>> > > On 06/10/2014 11:59 AM, Peter Hurley wrote:
> >>>> > > > On 06/06/2014 03:05 PM, Sasha Levin wrote:
> >>>>> > > >> On 05/30/2014 10:07 AM, Jan Kara wrote:
> >>>>>> > > >>> On Fri 30-05-14 09:58:14, Peter Hurley wrote:
> >>>>>>>> > > >>>>> On 05/30/2014 09:11 AM, Sasha Levin wrote:
> >>>>>>>>>> > > >>>>>>> Hi all,
> >>>>>>>>>> > > >>>>>>>
> >>>>>>>>>> > > >>>>>>> I sometime see lockups when booting my KVM guest with the latest -next kernel,
> >>>>>>>>>> > > >>>>>>> it basically hangs right when it should start 'init', and after a while I get
> >>>>>>>>>> > > >>>>>>> the following spew:
> >>>>>>>>>> > > >>>>>>>
> >>>>>>>>>> > > >>>>>>> [   30.790833] BUG: spinlock lockup suspected on CPU#1, swapper/1/0
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>> Maybe related to this report: https://lkml.org/lkml/2014/5/30/26
> >>>>>>>> > > >>>>> from Jet Chen which was bisected to
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>> commit bafe980f5afc7ccc693fd8c81c8aa5a02fbb5ae0
> >>>>>>>> > > >>>>> Author:     Jan Kara <jack@suse.cz>
> >>>>>>>> > > >>>>> AuthorDate: Thu May 22 10:43:35 2014 +1000
> >>>>>>>> > > >>>>> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> >>>>>>>> > > >>>>> CommitDate: Thu May 22 10:43:35 2014 +1000
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>>      printk: enable interrupts before calling console_trylock_for_printk()
> >>>>>>>> > > >>>>>          We need interrupts disabled when calling console_trylock_for_printk() only
> >>>>>>>> > > >>>>>      so that cpu id we pass to can_use_console() remains valid (for other
> >>>>>>>> > > >>>>>      things console_sem provides all the exclusion we need and deadlocks on
> >>>>>>>> > > >>>>>      console_sem due to interrupts are impossible because we use
> >>>>>>>> > > >>>>>      down_trylock()).  However if we are rescheduled, we are guaranteed to run
> >>>>>>>> > > >>>>>      on an online cpu so we can easily just get the cpu id in
> >>>>>>>> > > >>>>>      can_use_console().
> >>>>>>>> > > >>>>>          We can lose a bit of performance when we enable interrupts in
> >>>>>>>> > > >>>>>      vprintk_emit() and then disable them again in console_unlock() but OTOH it
> >>>>>>>> > > >>>>>      can somewhat reduce interrupt latency caused by console_unlock()
> >>>>>>>> > > >>>>>      especially since later in the patch series we will want to spin on
> >>>>>>>> > > >>>>>      console_sem in console_trylock_for_printk().
> >>>>>>>> > > >>>>>          Signed-off-by: Jan Kara <jack@suse.cz>
> >>>>>>>> > > >>>>>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >>>>>>>> > > >>>>>
> >>>>>>>> > > >>>>> ?
> >>>>>> > > >>>    Yeah, very likely. I think I see the problem, I'll send the fix shortly.
> >>>>> > > >>
> >>>>> > > >> Hi Jan,
> >>>>> > > >>
> >>>>> > > >> It seems that the issue I'm seeing is different from the "[prink]  BUG: spinlock
> >>>>> > > >> lockup suspected on CPU#0, swapper/1".
> >>>>> > > >>
> >>>>> > > >> Is there anything else I could try here? The issue is very common during testing.
> >>>> > > > 
> >>>> > > > Sasha,
> >>>> > > > 
> >>>> > > > Is this bisectable? Maybe that's the best way forward here.
> >>> > > 
> >>> > > I've ran a bisection again and ended up at the same commit as Jet Chen
> >>> > > (the commit unfortunately already made it to Linus's tree).
> >>> > > 
> >>> > > Note that I did try Jan's proposed fix and that didn't solve the issue
> >>> > > for me, I believe we're seeing different issues caused by the same
> >>> > > commit.
> >> >   Sorry it has been busy time lately and I didn't have as much time to look
> >> > into this as would be needed.
> >   Oops, pressed send too early... So I have two debug patches for you. Can
> > you try whether the problem reproduces with the first one or with both of
> > them applied?
> 
> The first patch fixed it (I assumed that there's no need to try the second).
  OK, so I got back to it. Can you try with attached patch and without my
"fix"? lockdep should dump complaints using trace buffer (so you should
check it from time to time) instead of using printk and thus we should be
able to see what it is complaining about.  Thanks!

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-lockdep-Dump-info-via-tracing.patch --]
[-- Type: text/x-patch, Size: 41506 bytes --]

>From 93930c92c664d8b834e1e04b92c00023626fc07a Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 19 Jun 2014 15:36:09 +0200
Subject: [PATCH] lockdep: Dump info via tracing

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/locking/lockdep.c | 707 +++++++++++++++++++++++++++--------------------
 1 file changed, 402 insertions(+), 305 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index d24e4339b46d..b15e7dec55f6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -77,6 +77,26 @@ module_param(lock_stat, int, 0644);
  */
 static arch_spinlock_t lockdep_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
+static void sprint_ip_sym(char *buf, unsigned long ip)
+{
+	sprintf(buf, "[<%p>] %pS\n", (void *) ip, (void *) ip);
+}
+
+static void trace_print_stack_trace(struct stack_trace *trace, int spaces)
+{
+	int i, n;
+	char buf[256];
+
+	if (!trace->entries)
+		return;
+
+	for (i = 0; i < trace->nr_entries; i++) {
+		n = sprintf(buf, "%*c", 1 + spaces, ' ');
+		sprint_ip_sym(buf + n, trace->entries[i]);
+		trace_printk(buf);
+	}
+}
+
 static int graph_lock(void)
 {
 	arch_spin_lock(&lockdep_lock);
@@ -382,9 +402,9 @@ static unsigned long stack_trace[MAX_STACK_TRACE_ENTRIES];
 
 static void print_lockdep_off(const char *bug_msg)
 {
-	printk(KERN_DEBUG "%s\n", bug_msg);
-	printk(KERN_DEBUG "turning off the locking correctness validator.\n");
-	printk(KERN_DEBUG "Please attach the output of /proc/lock_stat to the bug report\n");
+	trace_printk("%s\n", bug_msg);
+	trace_printk("turning off the locking correctness validator.\n");
+	trace_printk("Please attach the output of /proc/lock_stat to the bug report\n");
 }
 
 static int save_trace(struct stack_trace *trace)
@@ -417,7 +437,7 @@ static int save_trace(struct stack_trace *trace)
 			return 0;
 
 		print_lockdep_off("BUG: MAX_STACK_TRACE_ENTRIES too low!");
-		dump_stack();
+		trace_dump_stack(0);
 
 		return 0;
 	}
@@ -506,7 +526,7 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 	usage[i] = '\0';
 }
 
-static void __print_lock_name(struct lock_class *class)
+static void __sprint_lock_name(char *buf, struct lock_class *class)
 {
 	char str[KSYM_NAME_LEN];
 	const char *name;
@@ -514,28 +534,28 @@ static void __print_lock_name(struct lock_class *class)
 	name = class->name;
 	if (!name) {
 		name = __get_key_name(class->key, str);
-		printk("%s", name);
+		strcpy(buf, name);
 	} else {
-		printk("%s", name);
+		strcpy(buf, name);
 		if (class->name_version > 1)
-			printk("#%d", class->name_version);
+			sprintf(buf + strlen(buf), "#%d", class->name_version);
 		if (class->subclass)
-			printk("/%d", class->subclass);
+			sprintf(buf + strlen(buf), "/%d", class->subclass);
 	}
 }
 
-static void print_lock_name(struct lock_class *class)
+static void sprint_lock_name(char *buf, struct lock_class *class)
 {
 	char usage[LOCK_USAGE_CHARS];
 
 	get_usage_chars(class, usage);
 
-	printk(" (");
-	__print_lock_name(class);
-	printk("){%s}", usage);
+	strcpy(buf, " (");
+	__sprint_lock_name(buf, class);
+	sprintf(buf + strlen(buf), "){%s}", usage);
 }
 
-static void print_lockdep_cache(struct lockdep_map *lock)
+static void sprint_lockdep_cache(char *buf, struct lockdep_map *lock)
 {
 	const char *name;
 	char str[KSYM_NAME_LEN];
@@ -544,14 +564,14 @@ static void print_lockdep_cache(struct lockdep_map *lock)
 	if (!name)
 		name = __get_key_name(lock->key->subkeys, str);
 
-	printk("%s", name);
+	strcpy(buf, name);
 }
 
-static void print_lock(struct held_lock *hlock)
+static void sprint_lock(char *buf, struct held_lock *hlock)
 {
-	print_lock_name(hlock_class(hlock));
-	printk(", at: ");
-	print_ip_sym(hlock->acquire_ip);
+	sprint_lock_name(buf, hlock_class(hlock));
+	strcat(buf, ", at: ");
+	sprint_ip_sym(buf + strlen(buf), hlock->acquire_ip);
 }
 
 static void lockdep_print_held_locks(struct task_struct *curr)
@@ -559,21 +579,24 @@ static void lockdep_print_held_locks(struct task_struct *curr)
 	int i, depth = curr->lockdep_depth;
 
 	if (!depth) {
-		printk("no locks held by %s/%d.\n", curr->comm, task_pid_nr(curr));
+		trace_printk("no locks held by %s/%d.\n", curr->comm, task_pid_nr(curr));
 		return;
 	}
-	printk("%d lock%s held by %s/%d:\n",
+	trace_printk("%d lock%s held by %s/%d:\n",
 		depth, depth > 1 ? "s" : "", curr->comm, task_pid_nr(curr));
 
 	for (i = 0; i < depth; i++) {
-		printk(" #%d: ", i);
-		print_lock(curr->held_locks + i);
+		char buf[256];
+
+		sprintf(buf, " #%d: ", i);
+		sprint_lock(buf + strlen(buf), curr->held_locks + i);
+		trace_printk(buf);
 	}
 }
 
 static void print_kernel_ident(void)
 {
-	printk("%s %.*s %s\n", init_utsname()->release,
+	trace_printk("%s %.*s %s\n", init_utsname()->release,
 		(int)strcspn(init_utsname()->version, " "),
 		init_utsname()->version,
 		print_tainted());
@@ -669,11 +692,11 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
 
 	if (unlikely(subclass >= MAX_LOCKDEP_SUBCLASSES)) {
 		debug_locks_off();
-		printk(KERN_ERR
+		trace_printk(
 			"BUG: looking up invalid subclass: %u\n", subclass);
-		printk(KERN_ERR
+		trace_printk(KERN_ERR
 			"turning off the locking correctness validator.\n");
-		dump_stack();
+		trace_dump_stack(0);
 		return NULL;
 	}
 
@@ -737,10 +760,10 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
  	 */
 	if (!static_obj(lock->key)) {
 		debug_locks_off();
-		printk("INFO: trying to register non-static key.\n");
-		printk("the code is fine but needs lockdep annotation.\n");
-		printk("turning off the locking correctness validator.\n");
-		dump_stack();
+		trace_printk("INFO: trying to register non-static key.\n");
+		trace_printk("the code is fine but needs lockdep annotation.\n");
+		trace_printk("turning off the locking correctness validator.\n");
+		trace_dump_stack(0);
 
 		return NULL;
 	}
@@ -772,7 +795,7 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
 		raw_local_irq_restore(flags);
 
 		print_lockdep_off("BUG: MAX_LOCKDEP_KEYS too low!");
-		dump_stack();
+		trace_dump_stack(0);
 		return NULL;
 	}
 	class = lock_classes + nr_lock_classes++;
@@ -798,11 +821,11 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
 		graph_unlock();
 		raw_local_irq_restore(flags);
 
-		printk("\nnew class %p: %s", class->key, class->name);
+		trace_printk("\nnew class %p: %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk("#%d", class->name_version);
-		printk("\n");
-		dump_stack();
+			trace_printk("#%d", class->name_version);
+		trace_printk("\n");
+		trace_dump_stack(0);
 
 		raw_local_irq_save(flags);
 		if (!graph_lock()) {
@@ -842,7 +865,7 @@ static struct lock_list *alloc_list_entry(void)
 			return NULL;
 
 		print_lockdep_off("BUG: MAX_LOCKDEP_ENTRIES too low!");
-		dump_stack();
+		trace_dump_stack(0);
 		return NULL;
 	}
 	return list_entries + nr_list_entries++;
@@ -1078,12 +1101,15 @@ static inline int __bfs_backwards(struct lock_list *src_entry,
 static noinline int
 print_circular_bug_entry(struct lock_list *target, int depth)
 {
+	char buf[256];
+
 	if (debug_locks_silent)
 		return 0;
-	printk("\n-> #%u", depth);
-	print_lock_name(target->class);
-	printk(":\n");
-	print_stack_trace(&target->trace, 6);
+	sprintf(buf, "\n-> #%u", depth);
+	sprint_lock_name(buf + strlen(buf), target->class);
+	strcat(buf, ":\n");
+	trace_printk(buf);
+	trace_print_stack_trace(&target->trace, 6);
 
 	return 0;
 }
@@ -1096,6 +1122,7 @@ print_circular_lock_scenario(struct held_lock *src,
 	struct lock_class *source = hlock_class(src);
 	struct lock_class *target = hlock_class(tgt);
 	struct lock_class *parent = prt->class;
+	char buf[256];
 
 	/*
 	 * A direct locking problem where unsafe_class lock is taken
@@ -1111,31 +1138,36 @@ print_circular_lock_scenario(struct held_lock *src,
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (parent != source) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(source);
-		printk(" --> ");
-		__print_lock_name(parent);
-		printk(" --> ");
-		__print_lock_name(target);
-		printk("\n\n");
+		trace_printk("Chain exists of:\n  ");
+		__sprint_lock_name(buf, source);
+		strcat(buf, " --> ");
+		__sprint_lock_name(buf + strlen(buf), parent);
+		strcat(buf, " --> ");
+		__sprint_lock_name(buf + strlen(buf), target);
+		strcat(buf, "\n\n");
+		trace_printk(buf);
 	}
 
-	printk(" Possible unsafe locking scenario:\n\n");
-	printk("       CPU0                    CPU1\n");
-	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(target);
-	printk(");\n");
-	printk("                               lock(");
-	__print_lock_name(parent);
-	printk(");\n");
-	printk("                               lock(");
-	__print_lock_name(target);
-	printk(");\n");
-	printk("  lock(");
-	__print_lock_name(source);
-	printk(");\n");
-	printk("\n *** DEADLOCK ***\n\n");
+	trace_printk(" Possible unsafe locking scenario:\n\n");
+	trace_printk("       CPU0                    CPU1\n");
+	trace_printk("       ----                    ----\n");
+	strcpy(buf, "  lock(");
+	__sprint_lock_name(buf + strlen(buf), target);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	strcpy(buf, "                               lock(");
+	__sprint_lock_name(buf + strlen(buf), parent);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	strcpy(buf, "                               lock(");
+	__sprint_lock_name(buf + strlen(buf), target);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	strcpy(buf, "  lock(");
+	__sprint_lock_name(buf + strlen(buf), source);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("\n *** DEADLOCK ***\n\n");
 }
 
 /*
@@ -1148,22 +1180,25 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 			struct held_lock *check_tgt)
 {
 	struct task_struct *curr = current;
+	char buf[256];
 
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("======================================================\n");
-	printk("[ INFO: possible circular locking dependency detected ]\n");
+	trace_printk("\n");
+	trace_printk("======================================================\n");
+	trace_printk("[ INFO: possible circular locking dependency detected ]\n");
 	print_kernel_ident();
-	printk("-------------------------------------------------------\n");
-	printk("%s/%d is trying to acquire lock:\n",
+	trace_printk("-------------------------------------------------------\n");
+	sprintf(buf, "%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(check_src);
-	printk("\nbut task is already holding lock:\n");
-	print_lock(check_tgt);
-	printk("\nwhich lock already depends on the new lock.\n\n");
-	printk("\nthe existing dependency chain (in reverse order) is:\n");
+	sprint_lock(buf + strlen(buf), check_src);
+	trace_printk(buf);
+	trace_printk("\nbut task is already holding lock:\n");
+	sprint_lock(buf, check_tgt);
+	trace_printk(buf);
+	trace_printk("\nwhich lock already depends on the new lock.\n\n");
+	trace_printk("\nthe existing dependency chain (in reverse order) is:\n");
 
 	print_circular_bug_entry(entry, depth);
 
@@ -1203,14 +1238,14 @@ static noinline int print_circular_bug(struct lock_list *this,
 		parent = get_lock_parent(parent);
 	}
 
-	printk("\nother info that might help us debug this:\n\n");
+	trace_printk("\nother info that might help us debug this:\n\n");
 	print_circular_lock_scenario(check_src, check_tgt,
 				     first_parent);
 
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -1367,25 +1402,28 @@ find_usage_backwards(struct lock_list *root, enum lock_usage_bit bit,
 static void print_lock_class_header(struct lock_class *class, int depth)
 {
 	int bit;
+	char buf[256];
 
-	printk("%*s->", depth, "");
-	print_lock_name(class);
-	printk(" ops: %lu", class->ops);
-	printk(" {\n");
+	sprintf(buf, "%*s->", depth, "");
+	sprint_lock_name(buf + strlen(buf), class);
+	sprintf(buf + strlen(buf), " ops: %lu {\n", class->ops);
+	trace_printk(buf);
 
 	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
-			len += printk("%*s   %s", depth, "", usage_str[bit]);
-			len += printk(" at:\n");
-			print_stack_trace(class->usage_traces + bit, len);
+			len += sprintf(buf, "%*s   %s", depth, "", usage_str[bit]);
+			len += sprintf(buf + strlen(buf), " at:\n");
+			trace_printk(buf);
+			trace_print_stack_trace(class->usage_traces + bit, len);
 		}
 	}
-	printk("%*s }\n", depth, "");
+	trace_printk("%*s }\n", depth, "");
 
-	printk("%*s ... key      at: ",depth,"");
-	print_ip_sym((unsigned long)class->key);
+	sprintf(buf, "%*s ... key      at: ",depth,"");
+	sprint_ip_sym(buf + strlen(buf), (unsigned long)class->key);
+	trace_printk(buf);
 }
 
 /*
@@ -1403,12 +1441,12 @@ print_shortest_lock_dependencies(struct lock_list *leaf,
 
 	do {
 		print_lock_class_header(entry->class, depth);
-		printk("%*s ... acquired at:\n", depth, "");
-		print_stack_trace(&entry->trace, 2);
-		printk("\n");
+		trace_printk("%*s ... acquired at:\n", depth, "");
+		trace_print_stack_trace(&entry->trace, 2);
+		trace_printk("\n");
 
 		if (depth == 0 && (entry != root)) {
-			printk("lockdep:%s bad path found in chain graph\n", __func__);
+			trace_printk("lockdep:%s bad path found in chain graph\n", __func__);
 			break;
 		}
 
@@ -1428,6 +1466,7 @@ print_irq_lock_scenario(struct lock_list *safe_entry,
 	struct lock_class *safe_class = safe_entry->class;
 	struct lock_class *unsafe_class = unsafe_entry->class;
 	struct lock_class *middle_class = prev_class;
+	char buf[256];
 
 	if (middle_class == safe_class)
 		middle_class = next_class;
@@ -1446,33 +1485,39 @@ print_irq_lock_scenario(struct lock_list *safe_entry,
 	 * from the safe_class lock to the unsafe_class lock.
 	 */
 	if (middle_class != unsafe_class) {
-		printk("Chain exists of:\n  ");
-		__print_lock_name(safe_class);
-		printk(" --> ");
-		__print_lock_name(middle_class);
-		printk(" --> ");
-		__print_lock_name(unsafe_class);
-		printk("\n\n");
+		trace_printk("Chain exists of:\n");
+		strcpy(buf, "  ");
+		__sprint_lock_name(buf + strlen(buf), safe_class);
+		strcat(buf, " --> ");
+		__sprint_lock_name(buf + strlen(buf), middle_class);
+		strcat(buf, " --> ");
+		__sprint_lock_name(buf + strlen(buf), unsafe_class);
+		strcat(buf, "\n\n");
+		trace_printk(buf);
 	}
 
-	printk(" Possible interrupt unsafe locking scenario:\n\n");
-	printk("       CPU0                    CPU1\n");
-	printk("       ----                    ----\n");
-	printk("  lock(");
-	__print_lock_name(unsafe_class);
-	printk(");\n");
-	printk("                               local_irq_disable();\n");
-	printk("                               lock(");
-	__print_lock_name(safe_class);
-	printk(");\n");
-	printk("                               lock(");
-	__print_lock_name(middle_class);
-	printk(");\n");
-	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(safe_class);
-	printk(");\n");
-	printk("\n *** DEADLOCK ***\n\n");
+	trace_printk(" Possible interrupt unsafe locking scenario:\n\n");
+	trace_printk("       CPU0                    CPU1\n");
+	trace_printk("       ----                    ----\n");
+	strcpy(buf, "  lock(");
+	__sprint_lock_name(buf + strlen(buf), unsafe_class);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("                               local_irq_disable();\n");
+	strcpy(buf, "                               lock(");
+	__sprint_lock_name(buf + strlen(buf), safe_class);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	strcpy(buf, "                               lock(");
+	__sprint_lock_name(buf + strlen(buf), middle_class);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("  <Interrupt>\n");
+	strcpy(buf, "    lock(");
+	__sprint_lock_name(buf + strlen(buf), safe_class);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("\n *** DEADLOCK ***\n\n");
 }
 
 static int
@@ -1487,65 +1532,74 @@ print_bad_irq_dependency(struct task_struct *curr,
 			 enum lock_usage_bit bit2,
 			 const char *irqclass)
 {
+	char buf[256];
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("======================================================\n");
-	printk("[ INFO: %s-safe -> %s-unsafe lock order detected ]\n",
+	trace_printk("\n");
+	trace_printk("======================================================\n");
+	trace_printk("[ INFO: %s-safe -> %s-unsafe lock order detected ]\n",
 		irqclass, irqclass);
 	print_kernel_ident();
-	printk("------------------------------------------------------\n");
-	printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
+	trace_printk("------------------------------------------------------\n");
+	trace_printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
 		curr->comm, task_pid_nr(curr),
 		curr->hardirq_context, hardirq_count() >> HARDIRQ_SHIFT,
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
 		curr->hardirqs_enabled,
 		curr->softirqs_enabled);
-	print_lock(next);
-
-	printk("\nand this task is already holding:\n");
-	print_lock(prev);
-	printk("which would create a new lock dependency:\n");
-	print_lock_name(hlock_class(prev));
-	printk(" ->");
-	print_lock_name(hlock_class(next));
-	printk("\n");
-
-	printk("\nbut this new dependency connects a %s-irq-safe lock:\n",
+	sprint_lock(buf, next);
+	trace_printk(buf);
+
+	trace_printk("\nand this task is already holding:\n");
+	sprint_lock(buf, prev);
+	trace_printk(buf);
+	trace_printk("which would create a new lock dependency:\n");
+	sprint_lock_name(buf, hlock_class(prev));
+	strcat(buf, " ->");
+	sprint_lock_name(buf + strlen(buf), hlock_class(next));
+	strcat(buf, "\n");
+	trace_printk(buf);
+
+	trace_printk("\nbut this new dependency connects a %s-irq-safe lock:\n",
 		irqclass);
-	print_lock_name(backwards_entry->class);
-	printk("\n... which became %s-irq-safe at:\n", irqclass);
+	sprint_lock_name(buf, backwards_entry->class);
+	strcat(buf, "\n");
+	trace_printk(buf);
+	trace_printk("... which became %s-irq-safe at:\n", irqclass);
 
-	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
+	trace_print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
 
-	printk("\nto a %s-irq-unsafe lock:\n", irqclass);
-	print_lock_name(forwards_entry->class);
-	printk("\n... which became %s-irq-unsafe at:\n", irqclass);
-	printk("...");
+	trace_printk("\nto a %s-irq-unsafe lock:\n", irqclass);
+	sprint_lock_name(buf, forwards_entry->class);
+	strcat(buf, "\n");
+	trace_printk(buf);
+	trace_printk("... which became %s-irq-unsafe at:\n", irqclass);
+	trace_printk("...");
 
-	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
+	trace_print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
 
-	printk("\nother info that might help us debug this:\n\n");
+	trace_printk("\nother info that might help us debug this:\n\n");
 	print_irq_lock_scenario(backwards_entry, forwards_entry,
 				hlock_class(prev), hlock_class(next));
 
 	lockdep_print_held_locks(curr);
 
-	printk("\nthe dependencies between %s-irq-safe lock", irqclass);
-	printk(" and the holding lock:\n");
+	trace_printk("\nthe dependencies between %s-irq-safe lock", irqclass);
+	trace_printk(" and the holding lock:\n");
 	if (!save_trace(&prev_root->trace))
 		return 0;
 	print_shortest_lock_dependencies(backwards_entry, prev_root);
 
-	printk("\nthe dependencies between the lock to be acquired");
-	printk(" and %s-irq-unsafe lock:\n", irqclass);
+	trace_printk("\nthe dependencies between the lock to be acquired");
+	trace_printk(" and %s-irq-unsafe lock:\n", irqclass);
 	if (!save_trace(&next_root->trace))
 		return 0;
 	print_shortest_lock_dependencies(forwards_entry, next_root);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -1699,44 +1753,51 @@ print_deadlock_scenario(struct held_lock *nxt,
 {
 	struct lock_class *next = hlock_class(nxt);
 	struct lock_class *prev = hlock_class(prv);
-
-	printk(" Possible unsafe locking scenario:\n\n");
-	printk("       CPU0\n");
-	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(prev);
-	printk(");\n");
-	printk("  lock(");
-	__print_lock_name(next);
-	printk(");\n");
-	printk("\n *** DEADLOCK ***\n\n");
-	printk(" May be due to missing lock nesting notation\n\n");
+	char buf[256];
+
+	trace_printk(" Possible unsafe locking scenario:\n\n");
+	trace_printk("       CPU0\n");
+	trace_printk("       ----\n");
+	strcpy(buf, "  lock(");
+	__sprint_lock_name(buf + strlen(buf), prev);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	strcpy(buf, "  lock(");
+	__sprint_lock_name(buf + strlen(buf), next);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("\n *** DEADLOCK ***\n\n");
+	trace_printk(" May be due to missing lock nesting notation\n\n");
 }
 
 static int
 print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
 		   struct held_lock *next)
 {
+	char buf[256];
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("=============================================\n");
-	printk("[ INFO: possible recursive locking detected ]\n");
+	trace_printk("\n");
+	trace_printk("=============================================\n");
+	trace_printk("[ INFO: possible recursive locking detected ]\n");
 	print_kernel_ident();
-	printk("---------------------------------------------\n");
-	printk("%s/%d is trying to acquire lock:\n",
+	trace_printk("---------------------------------------------\n");
+	trace_printk("%s/%d is trying to acquire lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(next);
-	printk("\nbut task is already holding lock:\n");
-	print_lock(prev);
+	sprint_lock(buf, next);
+	trace_printk(buf);
+	trace_printk("\nbut task is already holding lock:\n");
+	sprint_lock(buf, prev);
+	trace_printk(buf);
 
-	printk("\nother info that might help us debug this:\n");
+	trace_printk("\nother info that might help us debug this:\n");
 	print_deadlock_scenario(next, prev);
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -1894,13 +1955,16 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	 * Debugging printouts:
 	 */
 	if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
+		char buf[256];
+
 		graph_unlock();
-		printk("\n new dependency: ");
-		print_lock_name(hlock_class(prev));
-		printk(" => ");
-		print_lock_name(hlock_class(next));
-		printk("\n");
-		dump_stack();
+		strcpy(buf, "\n new dependency: ");
+		sprint_lock_name(buf + strlen(buf), hlock_class(prev));
+		strcat(buf, " => ");
+		sprint_lock_name(buf + strlen(buf), hlock_class(next));
+		strcat(buf, "\n");
+		trace_printk(buf);
+		trace_dump_stack(0);
 		return graph_lock();
 	}
 	return 1;
@@ -2025,7 +2089,7 @@ static inline int lookup_chain_cache(struct task_struct *curr,
 cache_hit:
 			debug_atomic_inc(chain_lookup_hits);
 			if (very_verbose(class))
-				printk("\nhash chain already cached, key: "
+				trace_printk("\nhash chain already cached, key: "
 					"%016Lx tail class: [%p] %s\n",
 					(unsigned long long)chain_key,
 					class->key, class->name);
@@ -2033,7 +2097,7 @@ cache_hit:
 		}
 	}
 	if (very_verbose(class))
-		printk("\nnew hash chain, key: %016Lx tail class: [%p] %s\n",
+		trace_printk("\nnew hash chain, key: %016Lx tail class: [%p] %s\n",
 			(unsigned long long)chain_key, class->key, class->name);
 	/*
 	 * Allocate a new chain entry from the static array, and add
@@ -2055,7 +2119,7 @@ cache_hit:
 			return 0;
 
 		print_lockdep_off("BUG: MAX_LOCKDEP_CHAINS too low!");
-		dump_stack();
+		trace_dump_stack(0);
 		return 0;
 	}
 	chain = lock_chains + nr_lock_chains++;
@@ -2203,55 +2267,61 @@ static void
 print_usage_bug_scenario(struct held_lock *lock)
 {
 	struct lock_class *class = hlock_class(lock);
-
-	printk(" Possible unsafe locking scenario:\n\n");
-	printk("       CPU0\n");
-	printk("       ----\n");
-	printk("  lock(");
-	__print_lock_name(class);
-	printk(");\n");
-	printk("  <Interrupt>\n");
-	printk("    lock(");
-	__print_lock_name(class);
-	printk(");\n");
-	printk("\n *** DEADLOCK ***\n\n");
+	char buf[256];
+
+	trace_printk(" Possible unsafe locking scenario:\n\n");
+	trace_printk("       CPU0\n");
+	trace_printk("       ----\n");
+	strcpy(buf, "  lock(");
+	__sprint_lock_name(buf + strlen(buf), class);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("  <Interrupt>\n");
+	strcpy(buf, "    lock(");
+	__sprint_lock_name(buf + strlen(buf), class);
+	strcat(buf, ");\n");
+	trace_printk(buf);
+	trace_printk("\n *** DEADLOCK ***\n\n");
 }
 
 static int
 print_usage_bug(struct task_struct *curr, struct held_lock *this,
 		enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit)
 {
+	char buf[256];
+
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("=================================\n");
-	printk("[ INFO: inconsistent lock state ]\n");
+	trace_printk("\n");
+	trace_printk("=================================\n");
+	trace_printk("[ INFO: inconsistent lock state ]\n");
 	print_kernel_ident();
-	printk("---------------------------------\n");
+	trace_printk("---------------------------------\n");
 
-	printk("inconsistent {%s} -> {%s} usage.\n",
+	trace_printk("inconsistent {%s} -> {%s} usage.\n",
 		usage_str[prev_bit], usage_str[new_bit]);
 
-	printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
+	trace_printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
 		curr->comm, task_pid_nr(curr),
 		trace_hardirq_context(curr), hardirq_count() >> HARDIRQ_SHIFT,
 		trace_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
 		trace_hardirqs_enabled(curr),
 		trace_softirqs_enabled(curr));
-	print_lock(this);
+	sprint_lock(buf, this);
+	trace_printk(buf);
 
-	printk("{%s} state was registered at:\n", usage_str[prev_bit]);
-	print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
+	trace_printk("{%s} state was registered at:\n", usage_str[prev_bit]);
+	trace_print_stack_trace(hlock_class(this)->usage_traces + prev_bit, 1);
 
 	print_irqtrace_events(curr);
-	printk("\nother info that might help us debug this:\n");
+	trace_printk("\nother info that might help us debug this:\n");
 	print_usage_bug_scenario(this);
 
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -2285,32 +2355,36 @@ print_irq_inversion_bug(struct task_struct *curr,
 	struct lock_list *entry = other;
 	struct lock_list *middle = NULL;
 	int depth;
+	char buf[256];
 
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("=========================================================\n");
-	printk("[ INFO: possible irq lock inversion dependency detected ]\n");
+	trace_printk("\n");
+	trace_printk("=========================================================\n");
+	trace_printk("[ INFO: possible irq lock inversion dependency detected ]\n");
 	print_kernel_ident();
-	printk("---------------------------------------------------------\n");
-	printk("%s/%d just changed the state of lock:\n",
+	trace_printk("---------------------------------------------------------\n");
+	trace_printk("%s/%d just changed the state of lock:\n",
 		curr->comm, task_pid_nr(curr));
-	print_lock(this);
+	sprint_lock(buf, this);
+	trace_printk(buf);
 	if (forwards)
-		printk("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
+		trace_printk("but this lock took another, %s-unsafe lock in the past:\n", irqclass);
 	else
-		printk("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
-	print_lock_name(other->class);
-	printk("\n\nand interrupts could create inverse lock ordering between them.\n\n");
+		trace_printk("but this lock was taken by another, %s-safe lock in the past:\n", irqclass);
+	sprint_lock_name(buf, other->class);
+	strcat(buf, "\n");
+	trace_printk(buf);
+	trace_printk("\nand interrupts could create inverse lock ordering between them.\n\n");
 
-	printk("\nother info that might help us debug this:\n");
+	trace_printk("\nother info that might help us debug this:\n");
 
 	/* Find a middle lock (if one exists) */
 	depth = get_lock_depth(other);
 	do {
 		if (depth == 0 && (entry != root)) {
-			printk("lockdep:%s bad path found in chain graph\n", __func__);
+			trace_printk("lockdep:%s bad path found in chain graph\n", __func__);
 			break;
 		}
 		middle = entry;
@@ -2326,13 +2400,13 @@ print_irq_inversion_bug(struct task_struct *curr,
 
 	lockdep_print_held_locks(curr);
 
-	printk("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
+	trace_printk("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
 	if (!save_trace(&root->trace))
 		return 0;
 	print_shortest_lock_dependencies(other, root);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -2387,15 +2461,21 @@ check_usage_backwards(struct task_struct *curr, struct held_lock *this,
 
 void print_irqtrace_events(struct task_struct *curr)
 {
-	printk("irq event stamp: %u\n", curr->irq_events);
-	printk("hardirqs last  enabled at (%u): ", curr->hardirq_enable_event);
-	print_ip_sym(curr->hardirq_enable_ip);
-	printk("hardirqs last disabled at (%u): ", curr->hardirq_disable_event);
-	print_ip_sym(curr->hardirq_disable_ip);
-	printk("softirqs last  enabled at (%u): ", curr->softirq_enable_event);
-	print_ip_sym(curr->softirq_enable_ip);
-	printk("softirqs last disabled at (%u): ", curr->softirq_disable_event);
-	print_ip_sym(curr->softirq_disable_ip);
+	char buf[256];
+
+	trace_printk("irq event stamp: %u\n", curr->irq_events);
+	sprintf(buf, "hardirqs last  enabled at (%u): ", curr->hardirq_enable_event);
+	sprint_ip_sym(buf + strlen(buf), curr->hardirq_enable_ip);
+	trace_printk(buf);
+	sprintf(buf, "hardirqs last disabled at (%u): ", curr->hardirq_disable_event);
+	sprint_ip_sym(buf + strlen(buf), curr->hardirq_disable_ip);
+	trace_printk(buf);
+	sprintf(buf, "softirqs last  enabled at (%u): ", curr->softirq_enable_event);
+	sprint_ip_sym(buf + strlen(buf), curr->softirq_enable_ip);
+	trace_printk(buf);
+	sprintf(buf, "softirqs last disabled at (%u): ", curr->softirq_disable_event);
+	sprint_ip_sym(buf + strlen(buf), curr->softirq_disable_ip);
+	trace_printk(buf);
 }
 
 static int HARDIRQ_verbose(struct lock_class *class)
@@ -2937,10 +3017,13 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	 * We must printk outside of the graph_lock:
 	 */
 	if (ret == 2) {
-		printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
-		print_lock(this);
+		char buf[256];
+
+		trace_printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
+		sprint_lock(buf, this);
+		trace_printk(buf);
 		print_irqtrace_events(curr);
-		dump_stack();
+		trace_dump_stack(0);
 	}
 
 	return ret;
@@ -2982,7 +3065,7 @@ void lockdep_init_map(struct lockdep_map *lock, const char *name,
 	 * Sanity check, the lock-class key must be persistent:
 	 */
 	if (!static_obj(key)) {
-		printk("BUG: key %p not in .data!\n", key);
+		trace_printk("BUG: key %p not in .data!\n", key);
 		/*
 		 * What it says above ^^^^^, I suggest you read it.
 		 */
@@ -3007,31 +3090,34 @@ print_lock_nested_lock_not_held(struct task_struct *curr,
 				struct held_lock *hlock,
 				unsigned long ip)
 {
+	char buf[256];
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("==================================\n");
-	printk("[ BUG: Nested lock was not taken ]\n");
+	trace_printk("\n");
+	trace_printk("==================================\n");
+	trace_printk("[ BUG: Nested lock was not taken ]\n");
 	print_kernel_ident();
-	printk("----------------------------------\n");
+	trace_printk("----------------------------------\n");
 
-	printk("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
-	print_lock(hlock);
+	trace_printk("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
+	sprint_lock(buf, hlock);
+	trace_printk(buf);
 
-	printk("\nbut this task is not holding:\n");
-	printk("%s\n", hlock->nest_lock->name);
+	trace_printk("\nbut this task is not holding:\n");
+	trace_printk("%s\n", hlock->nest_lock->name);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
-	printk("\nother info that might help us debug this:\n");
+	trace_printk("\nother info that might help us debug this:\n");
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -3081,11 +3167,11 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	}
 	atomic_inc((atomic_t *)&class->ops);
 	if (very_verbose(class)) {
-		printk("\nacquire class [%p] %s", class->key, class->name);
+		trace_printk("\nacquire class [%p] %s", class->key, class->name);
 		if (class->name_version > 1)
-			printk("#%d", class->name_version);
-		printk("\n");
-		dump_stack();
+			trace_printk("#%d", class->name_version);
+		trace_printk("\n");
+		trace_dump_stack(0);
 	}
 
 	/*
@@ -3192,12 +3278,12 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	if (unlikely(curr->lockdep_depth >= MAX_LOCK_DEPTH)) {
 		debug_locks_off();
 		print_lockdep_off("BUG: MAX_LOCK_DEPTH too low!");
-		printk(KERN_DEBUG "depth: %i  max: %lu!\n",
+		trace_printk("depth: %i  max: %lu!\n",
 		       curr->lockdep_depth, MAX_LOCK_DEPTH);
 
 		lockdep_print_held_locks(current);
 		debug_show_all_locks();
-		dump_stack();
+		trace_dump_stack(0);
 
 		return 0;
 	}
@@ -3212,27 +3298,31 @@ static int
 print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	char buf[256];
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("=====================================\n");
-	printk("[ BUG: bad unlock balance detected! ]\n");
+	trace_printk("\n");
+	trace_printk("=====================================\n");
+	trace_printk("[ BUG: bad unlock balance detected! ]\n");
 	print_kernel_ident();
-	printk("-------------------------------------\n");
-	printk("%s/%d is trying to release lock (",
+	trace_printk("-------------------------------------\n");
+	sprintf(buf, "%s/%d is trying to release lock (",
 		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	printk(") at:\n");
-	print_ip_sym(ip);
-	printk("but there are no more locks to release!\n");
-	printk("\nother info that might help us debug this:\n");
+	sprint_lockdep_cache(buf + strlen(buf), lock);
+	strcat(buf, ") at:\n");
+	trace_printk(buf);
+	sprint_ip_sym(buf, ip);
+	trace_printk(buf);
+	trace_printk("but there are no more locks to release!\n");
+	trace_printk("\nother info that might help us debug this:\n");
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -3532,11 +3622,11 @@ static void check_flags(unsigned long flags)
 
 	if (irqs_disabled_flags(flags)) {
 		if (DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)) {
-			printk("possible reason: unannotated irqs-off.\n");
+			trace_printk("possible reason: unannotated irqs-off.\n");
 		}
 	} else {
 		if (DEBUG_LOCKS_WARN_ON(!current->hardirqs_enabled)) {
-			printk("possible reason: unannotated irqs-on.\n");
+			trace_printk("possible reason: unannotated irqs-on.\n");
 		}
 	}
 
@@ -3657,27 +3747,31 @@ static int
 print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock,
 			   unsigned long ip)
 {
+	char buf[256];
+
 	if (!debug_locks_off())
 		return 0;
 	if (debug_locks_silent)
 		return 0;
 
-	printk("\n");
-	printk("=================================\n");
-	printk("[ BUG: bad contention detected! ]\n");
+	trace_printk("\n");
+	trace_printk("=================================\n");
+	trace_printk("[ BUG: bad contention detected! ]\n");
 	print_kernel_ident();
-	printk("---------------------------------\n");
-	printk("%s/%d is trying to contend lock (",
+	trace_printk("---------------------------------\n");
+	sprintf(buf, "%s/%d is trying to contend lock (",
 		curr->comm, task_pid_nr(curr));
-	print_lockdep_cache(lock);
-	printk(") at:\n");
-	print_ip_sym(ip);
-	printk("but there are no locks held!\n");
-	printk("\nother info that might help us debug this:\n");
+	sprint_lockdep_cache(buf + strlen(buf), lock);
+	strcat(buf, ") at:\n");
+	trace_printk(buf);
+	sprint_ip_sym(buf, ip);
+	trace_printk(buf);
+	trace_printk("but there are no locks held!\n");
+	trace_printk("\nother info that might help us debug this:\n");
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 
 	return 0;
 }
@@ -4033,23 +4127,26 @@ static void
 print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
 		     const void *mem_to, struct held_lock *hlock)
 {
+	char buf[256];
+
 	if (!debug_locks_off())
 		return;
 	if (debug_locks_silent)
 		return;
 
-	printk("\n");
-	printk("=========================\n");
-	printk("[ BUG: held lock freed! ]\n");
+	trace_printk("\n");
+	trace_printk("=========================\n");
+	trace_printk("[ BUG: held lock freed! ]\n");
 	print_kernel_ident();
-	printk("-------------------------\n");
-	printk("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
+	trace_printk("-------------------------\n");
+	trace_printk("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
 		curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
-	print_lock(hlock);
+	sprint_lock(buf, hlock);
+	trace_printk(buf);
 	lockdep_print_held_locks(curr);
 
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 }
 
 static inline int not_in_range(const void* mem_from, unsigned long mem_len,
@@ -4096,15 +4193,15 @@ static void print_held_locks_bug(void)
 	if (debug_locks_silent)
 		return;
 
-	printk("\n");
-	printk("=====================================\n");
-	printk("[ BUG: %s/%d still has locks held! ]\n",
+	trace_printk("\n");
+	trace_printk("=====================================\n");
+	trace_printk("[ BUG: %s/%d still has locks held! ]\n",
 	       current->comm, task_pid_nr(current));
 	print_kernel_ident();
-	printk("-------------------------------------\n");
+	trace_printk("-------------------------------------\n");
 	lockdep_print_held_locks(current);
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 }
 
 void debug_check_no_locks_held(void)
@@ -4195,12 +4292,12 @@ asmlinkage __visible void lockdep_sys_exit(void)
 	if (unlikely(curr->lockdep_depth)) {
 		if (!debug_locks_off())
 			return;
-		printk("\n");
-		printk("================================================\n");
-		printk("[ BUG: lock held when returning to user space! ]\n");
+		trace_printk("\n");
+		trace_printk("================================================\n");
+		trace_printk("[ BUG: lock held when returning to user space! ]\n");
 		print_kernel_ident();
-		printk("------------------------------------------------\n");
-		printk("%s/%d is leaving the kernel with locks still held!\n",
+		trace_printk("------------------------------------------------\n");
+		trace_printk("%s/%d is leaving the kernel with locks still held!\n",
 				curr->comm, curr->pid);
 		lockdep_print_held_locks(curr);
 	}
@@ -4215,14 +4312,14 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 		return;
 #endif /* #ifdef CONFIG_PROVE_RCU_REPEATEDLY */
 	/* Note: the following can be executed concurrently, so be careful. */
-	printk("\n");
-	printk("===============================\n");
-	printk("[ INFO: suspicious RCU usage. ]\n");
+	trace_printk("\n");
+	trace_printk("===============================\n");
+	trace_printk("[ INFO: suspicious RCU usage. ]\n");
 	print_kernel_ident();
-	printk("-------------------------------\n");
-	printk("%s:%d %s!\n", file, line, s);
-	printk("\nother info that might help us debug this:\n\n");
-	printk("\n%srcu_scheduler_active = %d, debug_locks = %d\n",
+	trace_printk("-------------------------------\n");
+	trace_printk("%s:%d %s!\n", file, line, s);
+	trace_printk("\nother info that might help us debug this:\n\n");
+	trace_printk("\n%srcu_scheduler_active = %d, debug_locks = %d\n",
 	       !rcu_lockdep_current_cpu_online()
 			? "RCU used illegally from offline CPU!\n"
 			: !rcu_is_watching()
@@ -4249,10 +4346,10 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	 * rcu_read_lock_bh() and so on from extended quiescent states.
 	 */
 	if (!rcu_is_watching())
-		printk("RCU used illegally from extended quiescent state!\n");
+		trace_printk("RCU used illegally from extended quiescent state!\n");
 
 	lockdep_print_held_locks(curr);
-	printk("\nstack backtrace:\n");
-	dump_stack();
+	trace_printk("\nstack backtrace:\n");
+	trace_dump_stack(0);
 }
 EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: console: lockup on boot
  2014-06-12  8:54                   ` Mike Galbraith
@ 2014-07-08 13:02                     ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2014-07-08 13:02 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Jan Kara, Sasha Levin, Peter Hurley, pmladek, Andrew Morton,
	Jet Chen, LKML, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 1140 bytes --]

On Thu, Jun 12, 2014 at 10:54:22AM +0200, Mike Galbraith wrote:
> On Thu, 2014-06-12 at 10:26 +0200, Jan Kara wrote: 
> > On Wed 11-06-14 23:07:04, Sasha Levin wrote:
> 
> > > The first patch fixed it (I assumed that there's no need to try the second).
> >   Good. So that shows that it is the increased lockdep coverage which is
> > causing problems - with my patch, lockdep is able to identify some problem
> > because console drivers are now called with lockdep enabled. But because
> > the problem was found in some difficult context, lockdep just hung the
> > machine when trying to report it... Sadly the stacktraces you posted don't
> > tell us what lockdep found.
> > 
> > Adding Peter Zijlstra to CC. Peter, any idea how lockdep could report
> > problems when holding logbuf_lock? One possibility would be to extend
> > logbuf_cpu recursion logic to every holder of logbuf_lock. That will at
> > least avoid the spinlock recursion killing the machine but we won't be able
> > to see what lockdep found...
> 
> Could tell lockdep to use trace_printk().

lkml.kernel.org/r/20111221111143.401184003@chello.nl



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-07-08 13:02 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-30 13:11 console: lockup on boot Sasha Levin
2014-05-30 13:58 ` Peter Hurley
2014-05-30 14:07   ` Jan Kara
2014-06-06 19:05     ` Sasha Levin
2014-06-10 15:59       ` Peter Hurley
2014-06-11 14:55         ` Sasha Levin
2014-06-11 15:34           ` Peter Hurley
2014-06-11 20:31             ` Jan Kara
2014-06-11 17:38           ` Linus Torvalds
2014-06-11 17:44             ` Linus Torvalds
2014-06-11 20:34           ` Jan Kara
2014-06-11 21:31             ` Jan Kara
2014-06-12  3:07               ` Sasha Levin
2014-06-12  8:26                 ` Jan Kara
2014-06-12  8:54                   ` Mike Galbraith
2014-07-08 13:02                     ` Peter Zijlstra
2014-06-19 17:28                 ` Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.