* tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
@ 2015-12-30 10:44 Dmitry Vyukov
  2016-01-15  7:51 ` Dmitry Vyukov
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2015-12-30 10:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee
  Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

Hello,

I've got the following lock order inversion report on
8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
failed to find a reproducer for it.


======================================================
[ INFO: possible circular locking dependency detected ]
4.4.0-rc7+ #181 Not tainted
-------------------------------------------------------
kworker/u9:2/35 is trying to acquire lock:
 (routelock){+.+...}, at: [<ffffffff82bc77c0>]
n_tracerouter_receivebuf+0x20/0x40 drivers/tty/n_tracerouter.c:175

but task is already holding lock:
 (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&buf->lock){+.+...}:
       [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
       [<     inline     >] __raw_spin_lock_irqsave
include/linux/spinlock_api_smp.h:112
       [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70
kernel/locking/spinlock.c:159
       [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
       [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
       [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
       [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0
drivers/tty/n_tty.c:1243
       [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0
drivers/tty/n_tty.c:1289
       [<     inline     >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
       [<     inline     >] __receive_buf drivers/tty/n_tty.c:1647
       [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450
drivers/tty/n_tty.c:1745
       [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
drivers/tty/n_tty.c:1780
       [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
       [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
drivers/tty/tty_buffer.c:517
       [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
kernel/workqueue.c:2030
       [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
       [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
       [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #2 (&o_tty->termios_rwsem/1){++++..}:
       [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
       [<ffffffff85c899cc>] down_read+0x4c/0x70 kernel/locking/rwsem.c:22
       [<ffffffff82ba635d>] n_tty_receive_buf_common+0x8d/0x2450
drivers/tty/n_tty.c:1708
       [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
drivers/tty/n_tty.c:1780
       [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
       [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
drivers/tty/tty_buffer.c:517
       [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
kernel/workqueue.c:2030
       [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
       [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
       [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

-> #1 (&port->buf.lock/1){+.+...}:
       [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
       [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
       [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
kernel/locking/mutex.c:618
       [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
drivers/tty/n_tracerouter.c:75
       [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
drivers/tty/tty_ldisc.c:447
       [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
drivers/tty/tty_ldisc.c:567
       [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
       [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
       [<     inline     >] vfs_ioctl fs/ioctl.c:43
       [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
       [<     inline     >] SYSC_ioctl fs/ioctl.c:622
       [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
       [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

-> #0 (routelock){+.+...}:
       [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
       [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
       [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
       [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
kernel/locking/lockdep.c:3206
       [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
kernel/locking/lockdep.c:3585
       [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
       [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
kernel/locking/mutex.c:618
       [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
drivers/tty/n_tracerouter.c:175
       [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
       [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
drivers/tty/tty_buffer.c:517
       [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
kernel/workqueue.c:2030
       [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
       [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
       [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
arch/x86/entry/entry_64.S:468

other info that might help us debug this:

Chain exists of:
  routelock --> &o_tty->termios_rwsem/1 --> &buf->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&buf->lock);
                               lock(&o_tty->termios_rwsem/1);
                               lock(&buf->lock);
  lock(routelock);

 *** DEADLOCK ***

4 locks held by kworker/u9:2/35:
 #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
__write_once_size include/linux/compiler.h:247
 #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
atomic64_set ./arch/x86/include/asm/atomic64_64.h:33
 #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
atomic_long_set include/asm-generic/atomic-long.h:56
 #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
set_work_data kernel/workqueue.c:614
 #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
set_work_pool_and_clear_pending kernel/workqueue.c:641
 #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff81338242>]
process_one_work+0x672/0x13e0 kernel/workqueue.c:2023
 #1:  ((&buf->work)){+.+...}, at: [<ffffffff81338272>]
process_one_work+0x6a2/0x13e0 kernel/workqueue.c:2027
 #2:  (&tty->ldisc_sem){++++++}, at: [<ffffffff82bac99b>]
tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:283
 #3:  (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487

stack backtrace:
CPU: 0 PID: 35 Comm: kworker/u9:2 Not tainted 4.4.0-rc7+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events_unbound flush_to_ldisc
 00000000ffffffff ffff88003d8e77f8 ffffffff8289d9dd ffffffff8866fbf0
 ffffffff8863f050 ffffffff886422f0 ffff88003d8e7840 ffffffff813e7081
 ffff88003d8e7930 000000003d8b1780 ffff88003d8b1fc0 ffff88003d8b1fe2
Call Trace:
 [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
 [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0 kernel/locking/lockdep.c:3206
 [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
 [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
 [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940 kernel/locking/mutex.c:618
 [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
drivers/tty/n_tracerouter.c:175
 [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
 [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0 drivers/tty/tty_buffer.c:517
 [<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
 [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
 [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
 [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
======================================================


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2015-12-30 10:44 tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc Dmitry Vyukov
@ 2016-01-15  7:51 ` Dmitry Vyukov
  2016-01-15 16:33   ` One Thousand Gnomes
  2016-01-20 12:47 ` Jiri Slaby
  2016-01-20 13:02 ` Peter Zijlstra
  2 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-15  7:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee
  Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> I've got the following lock order inversion report on
> 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
> failed to find a reproducer for it.

Ping. This is the most frequent failure during my testing now. J
Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please take
a look. Thanks.


> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 4.4.0-rc7+ #181 Not tainted
> -------------------------------------------------------
> kworker/u9:2/35 is trying to acquire lock:
>  (routelock){+.+...}, at: [<ffffffff82bc77c0>]
> n_tracerouter_receivebuf+0x20/0x40 drivers/tty/n_tracerouter.c:175
>
> but task is already holding lock:
>  (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
> flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&buf->lock){+.+...}:
>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
>        [<     inline     >] __raw_spin_lock_irqsave
> include/linux/spinlock_api_smp.h:112
>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70
> kernel/locking/spinlock.c:159
>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>        [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
>        [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
>        [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0
> drivers/tty/n_tty.c:1243
>        [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0
> drivers/tty/n_tty.c:1289
>        [<     inline     >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
>        [<     inline     >] __receive_buf drivers/tty/n_tty.c:1647
>        [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450
> drivers/tty/n_tty.c:1745
>        [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
> drivers/tty/n_tty.c:1780
>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
>        [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
> drivers/tty/tty_buffer.c:517
>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468
>
> -> #2 (&o_tty->termios_rwsem/1){++++..}:
>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
>        [<ffffffff85c899cc>] down_read+0x4c/0x70 kernel/locking/rwsem.c:22
>        [<ffffffff82ba635d>] n_tty_receive_buf_common+0x8d/0x2450
> drivers/tty/n_tty.c:1708
>        [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
> drivers/tty/n_tty.c:1780
>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
>        [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
> drivers/tty/tty_buffer.c:517
>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468
>
> -> #1 (&port->buf.lock/1){+.+...}:
>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
>        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> kernel/locking/mutex.c:618
>        [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
> drivers/tty/n_tracerouter.c:75
>        [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
> drivers/tty/tty_ldisc.c:447
>        [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
> drivers/tty/tty_ldisc.c:567
>        [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
>        [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
>        [<     inline     >] vfs_ioctl fs/ioctl.c:43
>        [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
>        [<     inline     >] SYSC_ioctl fs/ioctl.c:622
>        [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
>        [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
>
> -> #0 (routelock){+.+...}:
>        [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
>        [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
>        [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
>        [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
> kernel/locking/lockdep.c:3206
>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
>        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> kernel/locking/mutex.c:618
>        [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
> drivers/tty/n_tracerouter.c:175
>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
>        [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
> drivers/tty/tty_buffer.c:517
>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468
>
> other info that might help us debug this:
>
> Chain exists of:
>   routelock --> &o_tty->termios_rwsem/1 --> &buf->lock
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&buf->lock);
>                                lock(&o_tty->termios_rwsem/1);
>                                lock(&buf->lock);
>   lock(routelock);
>
>  *** DEADLOCK ***
>
> 4 locks held by kworker/u9:2/35:
>  #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
> __write_once_size include/linux/compiler.h:247
>  #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
> atomic64_set ./arch/x86/include/asm/atomic64_64.h:33
>  #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
> atomic_long_set include/asm-generic/atomic-long.h:56
>  #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
> set_work_data kernel/workqueue.c:614
>  #0:  ("events_unbound"){.+.+.+}, at: [<     inline     >]
> set_work_pool_and_clear_pending kernel/workqueue.c:641
>  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff81338242>]
> process_one_work+0x672/0x13e0 kernel/workqueue.c:2023
>  #1:  ((&buf->work)){+.+...}, at: [<ffffffff81338272>]
> process_one_work+0x6a2/0x13e0 kernel/workqueue.c:2027
>  #2:  (&tty->ldisc_sem){++++++}, at: [<ffffffff82bac99b>]
> tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:283
>  #3:  (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
> flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487
>
> stack backtrace:
> CPU: 0 PID: 35 Comm: kworker/u9:2 Not tainted 4.4.0-rc7+ #181
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: events_unbound flush_to_ldisc
>  00000000ffffffff ffff88003d8e77f8 ffffffff8289d9dd ffffffff8866fbf0
>  ffffffff8863f050 ffffffff886422f0 ffff88003d8e7840 ffffffff813e7081
>  ffff88003d8e7930 000000003d8b1780 ffff88003d8b1fc0 ffff88003d8b1fe2
> Call Trace:
>  [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
>  [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0 kernel/locking/lockdep.c:3206
>  [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>  [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>  [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940 kernel/locking/mutex.c:618
>  [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
> drivers/tty/n_tracerouter.c:175
>  [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
>  [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0 drivers/tty/tty_buffer.c:517
>  [<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
>  [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>  [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>  [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> ======================================================


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-15  7:51 ` Dmitry Vyukov
@ 2016-01-15 16:33   ` One Thousand Gnomes
  2016-01-15 17:22     ` Dmitry Vyukov
  2016-01-20  2:09     ` J Freyensee
  0 siblings, 2 replies; 31+ messages in thread
From: One Thousand Gnomes @ 2016-01-15 16:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Fri, 15 Jan 2016 08:51:39 +0100
Dmitry Vyukov <dvyukov@google.com> wrote:

> On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > Hello,
> >
> > I've got the following lock order inversion report on
> > 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
> > failed to find a reproducer for it.
> 
> Ping. This is the most frequent failure during my testing now. J
> Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please take
> a look. Thanks.

This driver hasn't changed since forever, nor does it look like a bug in
n_tracerouter.

The first two threads of execution are for the old line discipline, so
they don't occur at the same time as the tracerouter code executes.

This is opening the port with the tracerouter discipline as you change
line discipline. The core code locks the tty and then takes the ldisc
lock, closes the old ldisc, opens the new one, and opening the new one
takes the router lock.

	(tty_lock, tty_ldisc_lock, termios_sem, route_lock)


> > -> #1 (&port->buf.lock/1){+.+...}:
> >        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> > kernel/locking/lockdep.c:3585
> >        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
> >        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> > kernel/locking/mutex.c:618
> >        [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
> > drivers/tty/n_tracerouter.c:75
> >        [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
> > drivers/tty/tty_ldisc.c:447
> >        [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
> > drivers/tty/tty_ldisc.c:567
> >        [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
> >        [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
> >        [<     inline     >] vfs_ioctl fs/ioctl.c:43
> >        [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
> >        [<     inline     >] SYSC_ioctl fs/ioctl.c:622
> >        [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
> >        [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
> > arch/x86/entry/entry_64.S:185

This is data being received by the new line discipline.
	
	(buf->lock, routelock)

> >
> > -> #0 (routelock){+.+...}:
> >        [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
> >        [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
> >        [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
> >        [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
> > kernel/locking/lockdep.c:3206
> >        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> > kernel/locking/lockdep.c:3585
> >        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
> >        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> > kernel/locking/mutex.c:618
> >        [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
> > drivers/tty/n_tracerouter.c:175
> >        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
> >        [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
> > drivers/tty/tty_buffer.c:517
> >        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> > kernel/workqueue.c:2030
> >        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> >        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> >        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> > arch/x86/entry/entry_64.S:468
> >

The lock inversion the log shows seems to be a problem in n_tty, not in
the tracerouter code.

However, that seems to be:

tty_set_ldisc to N_TTY
	takes tty lock
	takes tty_ldisc lock
	down termios_rwsem (R)
	tty->ops->set_ldisc
	down termios rwsem (W)

versus

receive_driver_bufs
	buf->lock
	termios rwsem (R)


which also looks fine.

So I think it's just a false report caused by other changes in the tty
layer, and nothing to do with the tracerouter code at all.

Alan
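The chain lockdep printed (routelock -> termios_rwsem -> buf->lock, closed by the new buf->lock -> routelock acquisition) can be checked mechanically. Below is a minimal illustrative sketch of lockdep-style cycle detection over those reported edges — plain Python with made-up label names, not the kernel's actual validate_chain code:

```python
# The lock-order edges lockdep reported:
#   routelock -> termios_rwsem -> buf->lock   (existing dependency chain)
#   buf->lock -> routelock                    (new edge from flush_to_ldisc)
edges = {
    "routelock": ["termios_rwsem"],
    "termios_rwsem": ["buf_lock"],
    "buf_lock": ["routelock"],
}

def has_cycle(graph):
    """Depth-first search for a back edge in the lock-order graph."""
    visiting, done = set(), set()

    def dfs(node):
        if node in visiting:
            return True          # back edge: a cycle exists
        if node in done:
            return False
        visiting.add(node)
        if any(dfs(n) for n in graph.get(node, ())):
            return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(dfs(n) for n in list(graph))

print(has_cycle(edges))  # True: the three edges close a cycle
```

With any one of the three edges removed, the graph is acyclic and lockdep would stay quiet; it is the combination that triggers the report.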


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-15 16:33   ` One Thousand Gnomes
@ 2016-01-15 17:22     ` Dmitry Vyukov
  2016-01-20  9:36       ` Dmitry Vyukov
  2016-01-20  2:09     ` J Freyensee
  1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-15 17:22 UTC (permalink / raw)
  To: One Thousand Gnomes
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Fri, Jan 15, 2016 at 5:33 PM, One Thousand Gnomes
<gnomes@lxorguk.ukuu.org.uk> wrote:
> On Fri, 15 Jan 2016 08:51:39 +0100
> Dmitry Vyukov <dvyukov@google.com> wrote:
>
>> On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > Hello,
>> >
>> > I've got the following lock order inversion report on
>> > 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
>> > failed to find a reproducer for it.
>>
>> Ping. This is the most frequent failure during my testing now. J
>> Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please take
>> a look. Thanks.
>
> This driver hasn't changed since forever, nor does it look like a bug in
> n_tracerouter.
>
> The first two threads of execution are for the old line discipline, so
> they don't occur at the same time as the tracerouter code executes.
>
> This is opening the port with the tracerouter discipline as you change
> line discipline. The core code locks the tty and then takes the ldisc
> lock, closes the old ldisc, opens the new one, and opening the new one
> takes the router lock.
>
>         (tty_lock, tty_ldisc_lock, termios_sem, route_lock)
>
>
>> > -> #1 (&port->buf.lock/1){+.+...}:
>> >        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
>> > kernel/locking/lockdep.c:3585
>> >        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>> >        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
>> > kernel/locking/mutex.c:618
>> >        [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
>> > drivers/tty/n_tracerouter.c:75
>> >        [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
>> > drivers/tty/tty_ldisc.c:447
>> >        [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
>> > drivers/tty/tty_ldisc.c:567
>> >        [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
>> >        [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
>> >        [<     inline     >] vfs_ioctl fs/ioctl.c:43
>> >        [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
>> >        [<     inline     >] SYSC_ioctl fs/ioctl.c:622
>> >        [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
>> >        [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
>> > arch/x86/entry/entry_64.S:185
>
> This is data being received by the new line discipline.
>
>         (buf->lock, routelock)
>
>> >
>> > -> #0 (routelock){+.+...}:
>> >        [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
>> >        [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
>> >        [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
>> >        [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
>> > kernel/locking/lockdep.c:3206
>> >        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
>> > kernel/locking/lockdep.c:3585
>> >        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>> >        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
>> > kernel/locking/mutex.c:618
>> >        [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
>> > drivers/tty/n_tracerouter.c:175
>> >        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
>> >        [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
>> > drivers/tty/tty_buffer.c:517
>> >        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
>> > kernel/workqueue.c:2030
>> >        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>> >        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>> >        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
>> > arch/x86/entry/entry_64.S:468
>> >
>
> The lock inversion the log shows seems to be a problem in n_tty not in
> the tracerouter code.
>
> However that seems to be
>
> tty_set_ldisc to N_TTY
>         takes tty lock
>         takes tty_ldisc lock
>         down termios_rwsem (R)
>         tty->ops->set_ldisc
>         down termios rwsem (W)
>
> versus
>
> receive_driver_bufs
>         buf->lock
>         termios rwsem (R)
>
>
> which also looks fine
>
> So I think it's just a false report caused by other changes in the tty
> layer, and nothing to do with the tracerouter code at all.


Thanks for looking into this.

The report says:

       CPU0                    CPU1
       ----                    ----
  lock(&buf->lock);
                               lock(&o_tty->termios_rwsem/1);
                               lock(&buf->lock);
  lock(routelock);

I don't see routelock in your stacks.
But I also don't understand why this is a deadlock. I would
understand if one thread takes A then B, while another takes B then A.
But here there are 3 non-circular lock acquisitions. Does it have something
to do with spinlocks vs mutex/rwsem?


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-15 16:33   ` One Thousand Gnomes
  2016-01-15 17:22     ` Dmitry Vyukov
@ 2016-01-20  2:09     ` J Freyensee
  1 sibling, 0 replies; 31+ messages in thread
From: J Freyensee @ 2016-01-20  2:09 UTC (permalink / raw)
  To: One Thousand Gnomes, Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Fri, 2016-01-15 at 16:33 +0000, One Thousand Gnomes wrote:
> On Fri, 15 Jan 2016 08:51:39 +0100
> Dmitry Vyukov <dvyukov@google.com> wrote:
> 
> > On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvyukov@google.com
> > > wrote:
> > > Hello,
> > > 
> > > I've got the following lock order inversion report on
> > > 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately
> > > I
> > > failed to find a reproducer for it.
> > 
> > Ping. This is the most frequent failure during my testing now. J
> > Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please
> > take
> > a look. Thanks.

I did look at the report but didn't see how n_tracerouter could be
causing the issue.  I apologize for not responding, and I thank you so
very much for providing the thorough explanation, Alan.

Jay

> 
> Alan


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-15 17:22     ` Dmitry Vyukov
@ 2016-01-20  9:36       ` Dmitry Vyukov
  2016-01-20 11:44         ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-20  9:36 UTC (permalink / raw)
  To: One Thousand Gnomes, Peter Zijlstra, Ingo Molnar
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Fri, Jan 15, 2016 at 6:22 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Fri, Jan 15, 2016 at 5:33 PM, One Thousand Gnomes
> <gnomes@lxorguk.ukuu.org.uk> wrote:
>> On Fri, 15 Jan 2016 08:51:39 +0100
>> Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>>> On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> > Hello,
>>> >
>>> > I've got the following lock order inversion report on
>>> > 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
>>> > failed to find a reproducer for it.
>>>
>>> Ping. This is the most frequent failure during my testing now. J
>>> Freyensee, you seem to have written drivers/tty/n_tracerouter.c. Please take
>>> a look. Thanks.
>>
>> This driver hasn't changed since forever, nor does it look like a bug in
>> n_tracerouter.
>>
>> The first two threads of execution are for the old line discipline, so
>> they don't occur at the same time as the tracerouter code executes.
>>
>> This is opening the port with the tracerouter discipline as you change
>> line discipline. The core code locks the tty and then takes the ldisc
>> lock, closes the old ldisc, opens the new one, and opening the new one
>> takes the router lock.
>>
>>         (tty_lock, tty_ldisc_lock, termios_sem, route_lock)
>>
>>
>>> > -> #1 (&port->buf.lock/1){+.+...}:
>>> >        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
>>> > kernel/locking/lockdep.c:3585
>>> >        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>>> >        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
>>> > kernel/locking/mutex.c:618
>>> >        [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
>>> > drivers/tty/n_tracerouter.c:75
>>> >        [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
>>> > drivers/tty/tty_ldisc.c:447
>>> >        [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
>>> > drivers/tty/tty_ldisc.c:567
>>> >        [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
>>> >        [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
>>> >        [<     inline     >] vfs_ioctl fs/ioctl.c:43
>>> >        [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
>>> >        [<     inline     >] SYSC_ioctl fs/ioctl.c:622
>>> >        [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
>>> >        [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
>>> > arch/x86/entry/entry_64.S:185
>>
>> This is data being received by the new line discipline.
>>
>>         (buf->lock, routelock)
>>
>>> >
>>> > -> #0 (routelock){+.+...}:
>>> >        [<     inline     >] check_prev_add kernel/locking/lockdep.c:1853
>>> >        [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1958
>>> >        [<     inline     >] validate_chain kernel/locking/lockdep.c:2144
>>> >        [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
>>> > kernel/locking/lockdep.c:3206
>>> >        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
>>> > kernel/locking/lockdep.c:3585
>>> >        [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:518
>>> >        [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
>>> > kernel/locking/mutex.c:618
>>> >        [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
>>> > drivers/tty/n_tracerouter.c:175
>>> >        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:454
>>> >        [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
>>> > drivers/tty/tty_buffer.c:517
>>> >        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
>>> > kernel/workqueue.c:2030
>>> >        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>>> >        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>> >        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
>>> > arch/x86/entry/entry_64.S:468
>>> >
>>
>> The lock inversion the log shows seems to be a problem in n_tty not in
>> the tracerouter code.
>>
>> However that seems to be
>>
>> tty_set_ldisc to N_TTY
>>         takes tty lock
>>         takes tty_ldisc lock
>>         down termios_rwsem (R)
>>         tty->ops->set_ldisc
>>         down termios rwsem (W)
>>
>> versus
>>
>> receive_driver_bufs
>>         buf->lock
>>         termios rwsem (R)
>>
>>
>> which also looks fine
>>
>> So I think it's just a false report caused by other changes in the tty
>> layer, and nothing to do with the tracerouter code at all.
>
>
> Thanks for looking into this.
>
> The report says:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&buf->lock);
>                                lock(&o_tty->termios_rwsem/1);
>                                lock(&buf->lock);
>   lock(routelock);
>
> I don't see routelock in your stacks.
> But I also don't understand why this is a deadlock. I would
> understand if one thread takes A then B; while another takes B then A.
> But here are 3 non-circular lock acquisitions. Does it have something
> to do with spinlocks vs mutex/rwsem?

+Peter, Ingo, can you please suggest a way to annotate this locking
pattern to avoid lockdep false positive?
Is it caused by the fact that the old and new ldisc locks are aliased?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20  9:36       ` Dmitry Vyukov
@ 2016-01-20 11:44         ` Peter Zijlstra
  2016-01-20 11:54           ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2016-01-20 11:44 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby,
	LKML, J Freyensee, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet

On Wed, Jan 20, 2016 at 10:36:01AM +0100, Dmitry Vyukov wrote:

> +Peter, Ingo, can you please suggest a way to annotate this locking
> pattern to avoid lockdep false positive?

No, because I haven't a clue what the actual problem is.

The report is fairly clear on how:

 routelock
   &port->buf.lock/1
     &o_tty->termios_rwsem/1
       &buf->lock

gets established, and shows where we try:

  &buf->lock
    routelock

Which gives a circle, ergo deadlock.

Show which link is wrong and why, and I can suggest ways of annotating
that.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 11:44         ` Peter Zijlstra
@ 2016-01-20 11:54           ` Dmitry Vyukov
  2016-01-20 12:07             ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-20 11:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby,
	LKML, J Freyensee, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet

On Wed, Jan 20, 2016 at 12:44 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Jan 20, 2016 at 10:36:01AM +0100, Dmitry Vyukov wrote:
>
>> +Peter, Ingo, can you please suggest a way to annotate this locking
>> pattern to avoid lockdep false positive?
>
> No, because I haven't a clue what the actual problem is.
>
> The report is fairly clear on how:
>
>  routelock
>    &port->buf.lock/1
>      &o_tty->termios_rwsem/1
>        &buf->lock
>
> gets established, and shows where we try:
>
>   &buf->lock
>     routelock
>
> Which gives a circle, ergo deadlock.
>
> Show which link is wrong and why, and I can suggest ways of annotating
> that.


Alan provided an explanation calling this a "false report":
https://groups.google.com/d/msg/syzkaller/YrV0bzdfa-g/cCVoUf1OFQAJ


And I don't understand how the following is a deadlock, since there is
no cycle...

 Possible unsafe locking scenario:
      CPU0                    CPU1
       ----                    ----
  lock(&buf->lock);
                               lock(&o_tty->termios_rwsem/1);
                               lock(&buf->lock);
  lock(routelock);


Shouldn't it look like:

 Possible unsafe locking scenario:
      CPU0                    CPU1
       ----                    ----
  lock(&buf->lock);
                               lock(routelock);
                               lock(&buf->lock);
  lock(routelock);

?


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 11:54           ` Dmitry Vyukov
@ 2016-01-20 12:07             ` Peter Zijlstra
  2016-01-20 14:58               ` One Thousand Gnomes
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2016-01-20 12:07 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby,
	LKML, J Freyensee, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet

On Wed, Jan 20, 2016 at 12:54:55PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 20, 2016 at 12:44 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Wed, Jan 20, 2016 at 10:36:01AM +0100, Dmitry Vyukov wrote:
> >
> >> +Peter, Ingo, can you please suggest a way to annotate this locking
> >> pattern to avoid lockdep false positive?
> >
> > No, because I haven't a clue what the actual problem is.
> >
> > The report is fairly clear on how:
> >
> >  routelock
> >    &port->buf.lock/1
> >      &o_tty->termios_rwsem/1
> >        &buf->lock
> >
> > gets established, and shows where we try:
> >
> >   &buf->lock
> >     routelock
> >
> > Which gives a circle, ergo deadlock.
> >
> > Show which link is wrong and why, and I can suggest ways of annotating
> > that.
> 
> 
> Alan provided an explanation calling this a "false report":
> https://groups.google.com/d/msg/syzkaller/YrV0bzdfa-g/cCVoUf1OFQAJ

I read that, I didn't understand it. Which link is wrong and why?

> And I don't understand how the following is a deadlock, since there is
> no cycle...
> 
>  Possible unsafe locking scenario:
>       CPU0                    CPU1
>        ----                    ----
>   lock(&buf->lock);
>                                lock(&o_tty->termios_rwsem/1);
>                                lock(&buf->lock);
>   lock(routelock);

Ignore the stupid picture, it only really works for simple cases.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2015-12-30 10:44 tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc Dmitry Vyukov
  2016-01-15  7:51 ` Dmitry Vyukov
@ 2016-01-20 12:47 ` Jiri Slaby
  2016-01-20 13:02 ` Peter Zijlstra
  2 siblings, 0 replies; 31+ messages in thread
From: Jiri Slaby @ 2016-01-20 12:47 UTC (permalink / raw)
  To: Dmitry Vyukov, J Freyensee, Greg Kroah-Hartman, LKML
  Cc: Eric Dumazet, Alexander Potapenko, Kostya Serebryany, syzkaller,
	Sasha Levin, Peter Zijlstra

On 12/30/2015, 11:44 AM, Dmitry Vyukov wrote:
> the existing dependency chain (in reverse order) is:
> 
> -> #3 (&buf->lock){+.+...}:
>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
>        [<     inline     >] __raw_spin_lock_irqsave
> include/linux/spinlock_api_smp.h:112
>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70
> kernel/locking/spinlock.c:159
>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502

But at this line, there is &tty->ctrl_lock, not &buf->lock. I am
confused now :/.
spin_lock_irqsave(&tty->ctrl_lock, flags);

>        [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
>        [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
>        [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0
> drivers/tty/n_tty.c:1243
>        [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0
> drivers/tty/n_tty.c:1289
>        [<     inline     >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
>        [<     inline     >] __receive_buf drivers/tty/n_tty.c:1647
>        [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450
> drivers/tty/n_tty.c:1745
>        [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
> drivers/tty/n_tty.c:1780
>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
>        [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
> drivers/tty/tty_buffer.c:517
>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468


-- 
js
suse labs


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2015-12-30 10:44 tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc Dmitry Vyukov
  2016-01-15  7:51 ` Dmitry Vyukov
  2016-01-20 12:47 ` Jiri Slaby
@ 2016-01-20 13:02 ` Peter Zijlstra
  2016-01-20 13:07   ` Dmitry Vyukov
  2016-01-20 16:08   ` Peter Hurley
  2 siblings, 2 replies; 31+ messages in thread
From: Peter Zijlstra @ 2016-01-20 13:02 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet, gnomes

On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
> -> #3 (&buf->lock){+.+...}:
>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502

So in any recent code that I look at this function tries to acquire
tty->ctrl_lock, not buf->lock. Am I missing something ?!

>        [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
>        [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
>        [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0 drivers/tty/n_tty.c:1243
>        [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0 drivers/tty/n_tty.c:1289
>        [<     inline     >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
>        [<     inline     >] __receive_buf drivers/tty/n_tty.c:1647
>        [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450 drivers/tty/n_tty.c:1745
>        [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1780
>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
>        [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0 drivers/tty/tty_buffer.c:517
>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 13:02 ` Peter Zijlstra
@ 2016-01-20 13:07   ` Dmitry Vyukov
  2016-01-20 16:08   ` Peter Hurley
  1 sibling, 0 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-20 13:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet, One Thousand Gnomes

On Wed, Jan 20, 2016 at 2:02 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>> -> #3 (&buf->lock){+.+...}:
>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>
> So in any recent code that I look at this function tries to acquire
> tty->ctrl_lock, not buf->lock. Am I missing something ?!

I am pretty sure that I did not alter the report... let me collect
another report on a new kernel.

>>        [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
>>        [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
>>        [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0 drivers/tty/n_tty.c:1243
>>        [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0 drivers/tty/n_tty.c:1289
>>        [<     inline     >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
>>        [<     inline     >] __receive_buf drivers/tty/n_tty.c:1647
>>        [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450 drivers/tty/n_tty.c:1745
>>        [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1780
>>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
>>        [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0 drivers/tty/tty_buffer.c:517
>>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
>>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 12:07             ` Peter Zijlstra
@ 2016-01-20 14:58               ` One Thousand Gnomes
  2016-01-20 15:16                 ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: One Thousand Gnomes @ 2016-01-20 14:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Vyukov, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby, LKML,
	J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko,
	Sasha Levin, Eric Dumazet

> I read that, I didn't understand it. Which link is wrong and why?
> 
> > And I don't understand how the following is a deadlock, since there is
> > no cycle...
> > 
> >  Possible unsafe locking scenario:
> >       CPU0                    CPU1
> >        ----                    ----
> >   lock(&buf->lock);
> >                                lock(&o_tty->termios_rwsem/1);
> >                                lock(&buf->lock);
> >   lock(routelock);
> 
> Ignore the stupid picture, it only really works for simple cases.

There are two line disciplines using two different locking orders.

The two line disciplines never execute at once. A given tty is either
using one or the other and there is a clear and correctly locked
changeover.


semantically it's something a bit like


 foo(x)
 {
          if (x == 1) {
		lock(A)
		lock(B)
	  } else {
		lock(B)
		lock(A)
	  }

	  Do stuff();

	  if (x == 1) {
		unlock(B)
		unlock(A)
	  } else {
		unlock(A)
		unlock(B)
	  }
}

with the guarantee made elsewhere that no instances of foo(1) and foo(0)
are ever executing at the same time.

That's not by dumb design - it's an interesting "nobody ever noticed
this" turned up by the lock detector between two totally unrelated bits of
code.

Alan


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 14:58               ` One Thousand Gnomes
@ 2016-01-20 15:16                 ` Dmitry Vyukov
  2016-01-20 16:32                   ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-20 15:16 UTC (permalink / raw)
  To: One Thousand Gnomes
  Cc: Peter Zijlstra, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby,
	LKML, J Freyensee, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet

On Wed, Jan 20, 2016 at 3:58 PM, One Thousand Gnomes
<gnomes@lxorguk.ukuu.org.uk> wrote:
>> I read that, I didn't understand it. Which link is wrong and why?
>>
>> > And I don't understand how the following is a deadlock, since there is
>> > no cycle...
>> >
>> >  Possible unsafe locking scenario:
>> >       CPU0                    CPU1
>> >        ----                    ----
>> >   lock(&buf->lock);
>> >                                lock(&o_tty->termios_rwsem/1);
>> >                                lock(&buf->lock);
>> >   lock(routelock);
>>
>> Ignore the stupid picture, it only really works for simple cases.
>
> There are two line disciplines using two different locking orders
>
> The two line disciplines never execute at once. A given tty is either
> using one or the other and there is a clear and correctly locked
> changeover.
>
>
> semantically it's something a bit like
>
>
>  foo(x)
>  {
>           if (x == 1) {
>                 lock(A)
>                 lock(B)
>           } else {
>                 lock(B)
>                 lock(A)
>           }
>
>           Do stuff();
>
>           if (x == 1) {
>                 unlock(B)
>                 unlock(A)
>           } else {
>                 unlock(A)
>                 unlock(B)
>           }
> }
>
> with the guarantee made elsewhere that no instances of foo(1) and foo(0)
> are ever executing at the same time.
>
> That's not by dumb design - it's an interesting "nobody ever noticed
> this" turned up by the lock detector between two totally unrelated bits of
> code.

In our user-space deadlock detector we have an annotation along the
lines of "forget all info about this particular mutex" for such cases
(between foo(0) and foo(1)). Is there something similar in lockdep?


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 13:02 ` Peter Zijlstra
  2016-01-20 13:07   ` Dmitry Vyukov
@ 2016-01-20 16:08   ` Peter Hurley
  2016-01-20 20:47     ` Peter Hurley
  2016-01-21 10:06     ` Dmitry Vyukov
  1 sibling, 2 replies; 31+ messages in thread
From: Peter Hurley @ 2016-01-20 16:08 UTC (permalink / raw)
  To: Peter Zijlstra, Dmitry Vyukov, Jiri Slaby, gnomes
  Cc: Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>> -> #3 (&buf->lock){+.+...}:
>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
> 
> So in any recent code that I look at this function tries to acquire
> tty->ctrl_lock, not buf->lock. Am I missing something ?!

Yes.

The tty locks were annotated with __lockfunc so were being elided from lockdep
stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
annotation ("tty: Remove __lockfunc annotation from tty lock functions").

Unfortunately, I think syzkaller's post-processing stack trace isn't helping
either, giving the impression that the stack is still inside tty_get_pgrp().

It's not.

It's in pty_flush_buffer(), which is taking the 'other tty' buf->lock.

Looks to me like the lock inversion is caused by the tty_driver_flush_buffer()
in n_tracerouter_open()/_close(), but I need to look at this mess a little
closer.

Regards,
Peter Hurley


>>        [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
>>        [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
>>        [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0 drivers/tty/n_tty.c:1243
>>        [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0 drivers/tty/n_tty.c:1289
>>        [<     inline     >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
>>        [<     inline     >] __receive_buf drivers/tty/n_tty.c:1647
>>        [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450 drivers/tty/n_tty.c:1745
>>        [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1780
>>        [<     inline     >] receive_buf drivers/tty/tty_buffer.c:450
>>        [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0 drivers/tty/tty_buffer.c:517
>>        [<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
>>        [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
>>        [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>        [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 15:16                 ` Dmitry Vyukov
@ 2016-01-20 16:32                   ` Peter Zijlstra
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Zijlstra @ 2016-01-20 16:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: One Thousand Gnomes, Ingo Molnar, Greg Kroah-Hartman, Jiri Slaby,
	LKML, J Freyensee, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet

On Wed, Jan 20, 2016 at 04:16:52PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 20, 2016 at 3:58 PM, One Thousand Gnomes
> <gnomes@lxorguk.ukuu.org.uk> wrote:
> >> I read that, I didn't understand it. Which link is wrong and why?
> >>
> >> > And I don't understand how the following is a deadlock, since there is
> >> > no cycle...
> >> >
> >> >  Possible unsafe locking scenario:
> >> >       CPU0                    CPU1
> >> >        ----                    ----
> >> >   lock(&buf->lock);
> >> >                                lock(&o_tty->termios_rwsem/1);
> >> >                                lock(&buf->lock);
> >> >   lock(routelock);
> >>
> >> Ignore the stupid picture, it only really works for simple cases.
> >
> > There are two line disciplines using two different locking orders
> >
> > The two line disciplines never execute at once. A given tty is either
> > using one or the other and there is a clear and correctly locked
> > changeover.
> >
> >
> > semantically its something a bit like
> >
> >
> >  foo(x)
> >  {
> >           if (x == 1) {
> >                 lock(A)
> >                 lock(B)
> >           } else {
> >                 lock(B)
> >                 lock(A)
> >           }
> >
> >           Do stuff();
> >
> >           if (x == 1) {
> >                 unlock(B)
> >                 unlock(A)
> >           } else {
> >                 unlock(A)
> >                 unlock(B)
> >           }
> > }
> >
> > with the guarantee made elsewhere that no instances of foo(1) and foo(0)
> > are ever executing at the same time.
> >
> > That's not by dumb design - it's an interesting "nobody ever noticed
> > this" turned up by the lock detector between two totaly unrelated bits of
> > code.

It's a fairly common thing; the typical solution uses lockdep_class_key
instances per type. See for example inodes, they have a per
file_system_type classes exactly because of this. See
inode_init_always()'s use of lockdep_set_class() and struct file_system_type.

The thing with lockdep_class_key's is that they must reside in static
storage, a quick grep shows that all tty_ldisc_ops structures are indeed
in static storage.

I'm just not sure when to apply the lock classes and to which locks though.
The one thing to keep in mind is that lockdep_set_class() cannot be used
on a held lock (I'm not sure we have debug msgs for that, /me makes a
note).

> In our user-space deadlock detector we have an annotation along the
> lines of "forget all info about this particular mutex" for such cases
> (between foo(0) and foo(1)). Is there something similar in lockdep?

No, since we don't track this data per instance, and the lock chains
that were recorded are still valid, just not for this instance.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 16:08   ` Peter Hurley
@ 2016-01-20 20:47     ` Peter Hurley
  2016-01-21 10:06     ` Dmitry Vyukov
  1 sibling, 0 replies; 31+ messages in thread
From: Peter Hurley @ 2016-01-20 20:47 UTC (permalink / raw)
  To: Dmitry Vyukov, Jiri Slaby, gnomes
  Cc: Peter Zijlstra, Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On 01/20/2016 08:08 AM, Peter Hurley wrote:
> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>> -> #3 (&buf->lock){+.+...}:
>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>
>> So in any recent code that I look at this function tries to acquire
>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
> 
> Yes.
> 
> The tty locks were annotated with __lockfunc so were being elided from lockdep
> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
> 
> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
> either, giving the impression that the stack is still inside tty_get_pgrp().
> 
> It's not.
> 
> It's in pty_flush_buffer(), which is taking the 'other tty' buf->lock.
> 
> Looks to me like the lock inversion is caused by the tty_driver_flush_buffer()
> in n_tracerouter_open()/_close(), but I need to look at this mess a little
> closer.

Unfortunately, there's not enough information in the lockdep report to
conclusively determine if there really is a potential for deadlock
(for one, the reports don't show what locks are held by each trace, which
is a problem if one of the traces terminates at a global mutex while holding
subclassed locks).

Nevertheless, the patch below should "fix" the problem.

Regards,
Peter Hurley

--- >% ---
Subject: [PATCH] tty: Fix lock inversion in N_TRACEROUTER

n_tracerouter_open()/_close() may cause a lock inversion when
N_TRACEROUTER is set as the ldisc for a master pty.

Unfortunately, the original lockdep report [1] does not contain
enough information to conclusively show a deadlock is possible.

However, the call to tty_driver_flush_buffer() is completely pointless
as this line discipline does not allow and never performs output to
this tty driver; remove.

[1] Email subject "tty: deadlock between n_tracerouter_receivebuf and
flush_to_ldisc" https://lkml.org/lkml/2015/12/30/71

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
---
 drivers/tty/n_tracerouter.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/tty/n_tracerouter.c b/drivers/tty/n_tracerouter.c
index ac57169..a5fd45d 100644
--- a/drivers/tty/n_tracerouter.c
+++ b/drivers/tty/n_tracerouter.c
@@ -82,7 +82,6 @@ static int n_tracerouter_open(struct tty_struct *tty)
 			tr_data->opencalled = 1;
 			tty->disc_data      = tr_data;
 			tty->receive_room   = RECEIVE_ROOM;
-			tty_driver_flush_buffer(tty);
 			retval = 0;
 		}
 	}
@@ -102,7 +101,6 @@ static void n_tracerouter_close(struct tty_struct *tty)
 
 	mutex_lock(&routelock);
 	WARN_ON(tptr->kref_tty != tr_data->kref_tty);
-	tty_driver_flush_buffer(tty);
 	tty_kref_put(tr_data->kref_tty);
 	tr_data->kref_tty = NULL;
 	tr_data->opencalled = 0;
-- 
2.7.0


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-20 16:08   ` Peter Hurley
  2016-01-20 20:47     ` Peter Hurley
@ 2016-01-21 10:06     ` Dmitry Vyukov
  2016-01-21 10:20       ` Peter Zijlstra
  2016-01-21 17:43       ` Peter Hurley
  1 sibling, 2 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-21 10:06 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>> -> #3 (&buf->lock){+.+...}:
>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>
>> So in any recent code that I look at this function tries to acquire
>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>
> Yes.
>
> The tty locks were annotated with __lockfunc so were being elided from lockdep
> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>
> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
> either, giving the impression that the stack is still inside tty_get_pgrp().
>
> It's not.

I've got a new report on commit
a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
Here is unprocessed version:
https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
and here is processed one:
https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt

Peter, what exactly is wrong with the post-processed version? I would
be interested in fixing the processing script.

As far as I see it contains the same stacks just with line numbers and
inlined frames. I am using a significantly different compilation mode
(kasan + kcov + very recent gcc), so nobody except me would be able to
figure out line numbers based on offsets.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-21 10:06     ` Dmitry Vyukov
@ 2016-01-21 10:20       ` Peter Zijlstra
  2016-01-21 17:51         ` Peter Hurley
  2016-01-21 17:43       ` Peter Hurley
  1 sibling, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2016-01-21 10:20 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Hurley, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Thu, Jan 21, 2016 at 11:06:45AM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> > On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
> >> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
> >>> -> #3 (&buf->lock){+.+...}:
> >>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
> >>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
> >>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
> >>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
> >>
> >> So in any recent code that I look at this function tries to acquire
> >> tty->ctrl_lock, not buf->lock. Am I missing something ?!
> >
> > Yes.
> >
> > The tty locks were annotated with __lockfunc so were being elided from lockdep
> > stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
> > annotation ("tty: Remove __lockfunc annotation from tty lock functions").
> >
> > Unfortunately, I think syzkaller's post-processing stack trace isn't helping
> > either, giving the impression that the stack is still inside tty_get_pgrp().
> >
> > It's not.
> 
> I've got a new report on commit
> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
> Here is unprocessed version:
> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
> and here is processed one:
> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
> 
> Peter, what exactly is wrong with the post-processed version? I would
> be interested in fixing the processing script.
> 
> As far as I can see, it contains the same stacks, just with line numbers and
> inlined frames added. I am using a significantly different compilation mode
> (kasan + kcov + a very recent gcc), so nobody except me would be able to
> figure out line numbers based on the offsets.

I'm not sure anything is wrong with the post-processing. My confusion
(and Jiri's) was that the stack trace lists
tty_get_pgrp()->_raw_spin_lock_irqsave(), but that function acquires
tty->ctrl_lock, not buf->lock as the report claims.

PeterH says that lockdep omits functions tagged with __lockfunc, but my
reading disagrees with that.

kernel/locking/lockdep.c:save_trace()
  save_stack_trace()
    dump_trace(.ops = &save_stack_ops)

which has: .address = save_stack_address
  __save_stack_address(.nosched = false)

So lockdep should very much include lock functions.


But this confusion is part of the original report, not because of the
post-processing.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-21 10:06     ` Dmitry Vyukov
  2016-01-21 10:20       ` Peter Zijlstra
@ 2016-01-21 17:43       ` Peter Hurley
  2016-02-03  4:24         ` Peter Hurley
  1 sibling, 1 reply; 31+ messages in thread
From: Peter Hurley @ 2016-01-21 17:43 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>> -> #3 (&buf->lock){+.+...}:
>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>
>>> So in any recent code that I look at this function tries to acquire
>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>
>> Yes.
>>
>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>
>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>
>> It's not.
> 
> I've got a new report on commit
> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
> Here is unprocessed version:
> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
> and here is processed one:
> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
> 
> Peter, what exactly is wrong with the post-processed version?

Yeah, ok, I assumed the problem with this report was post-processing
because of the other report that had mixed-up info.

However, the #3 stacktrace is obviously wrong, as others have already noted.
Plus, the #1 stacktrace is wrong as well.

> I would be interested in fixing the processing script.

Not that it's related (since the original, unedited report has bogus
stacktraces), but how are you doing debug symbol lookup?

Because the line below is not correct; it should be kernel/kthread.c:177 (or thereabouts):

       [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303


> As far as I can see, it contains the same stacks, just with line numbers and
> inlined frames added.

Agree, now that I see the original report.

> I am using a significantly different compilation mode
> (kasan + kcov + a very recent gcc), so nobody except me would be able to
> figure out line numbers based on the offsets.

Weird. Maybe something to do with the compiler.

Can you get me the dmesg output running the patch below?

Please first just get me the output shortly after boot (do a
ctrl+C in a terminal window to trigger the expected stacktrace
corresponding to the #3 stacktrace from your report).

Then please run your test that generates the circular lockdep report.

--- >% ---
Subject: [PATCH] debug: dump stacktraces for tty lock dependencies

---
 kernel/locking/lockdep.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 60ace56..b67cafb 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
  * Debugging switches:
  */
 
-#define VERBOSE			0
+#define VERBOSE			1
 #define VERY_VERBOSE		0
 
 #if VERBOSE
@@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
  */
 static int class_filter(struct lock_class *class)
 {
-#if 0
-	/* Example */
+#if 1
 	if (class->name_version == 1 &&
-			!strcmp(class->name, "lockname"))
+			!strcmp(class->name, "&buf->lock"))
 		return 1;
 	if (class->name_version == 1 &&
-			!strcmp(class->name, "&struct->lockfield"))
+			!strcmp(class->name, "&port->buf.lock/1"))
+		return 1;
+	if (class->name_version == 1 &&
+			!strcmp(class->name, "routelock"))
 		return 1;
 #endif
 	/* Filter everything else. 1 would be to allow everything else */
-- 
2.7.0


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-21 10:20       ` Peter Zijlstra
@ 2016-01-21 17:51         ` Peter Hurley
  2016-01-22 14:10           ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Hurley @ 2016-01-21 17:51 UTC (permalink / raw)
  To: Peter Zijlstra, Dmitry Vyukov
  Cc: Jiri Slaby, One Thousand Gnomes, Greg Kroah-Hartman, LKML,
	J Freyensee, syzkaller, Kostya Serebryany, Alexander Potapenko,
	Sasha Levin, Eric Dumazet

On 01/21/2016 02:20 AM, Peter Zijlstra wrote:
> On Thu, Jan 21, 2016 at 11:06:45AM +0100, Dmitry Vyukov wrote:
>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>
>>>> So in any recent code that I look at this function tries to acquire
>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>
>>> Yes.
>>>
>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>
>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>
>>> It's not.
>>
>> I've got a new report on commit
>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>> Here is unprocessed version:
>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>> and here is processed one:
>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>
>> Peter, what exactly is wrong with the post-processed version? I would
>> be interested in fixing the processing script.
>>
>> As far as I see it contains the same stacks just with line numbers and
>> inlined frames. I am using a significantly different compilation mode
>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>> figure out line numbers based on offsets.
> 
> I'm not sure anything is wrong with the post-processing. My confusion
> (and Jiri) was that the stack trace lists
> tty_get_pgrp()->_raw_spin_lock_irqsave() but that function acquires
> tty->ctrl_lock, not as the report claims buf->lock.

I think this is the other way around; that lockdep has graphed a circular
dependency chain, but that something is wrong with the stack traces.

> PeterH says that lockdep omits functions tagged with __lockfunc, but my
> reading disagrees with that.
> 
> kernel/locking/lockdep.c:save_trace()
>   save_stack_trace()
>     dump_trace(.ops = &save_stack_ops)
> 
> which has: .address = save_stack_address
>   __save_stack_address(.nosched = false)
> 
> So lockdep should very much include lock functions.

Wild hypothesis on my part.

> But this confusion is part of the original report, not because of the
> post-processing.

Yeah, agreed. The original report is very odd.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-21 17:51         ` Peter Hurley
@ 2016-01-22 14:10           ` Dmitry Vyukov
  2016-01-25 16:56             ` Peter Hurley
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-01-22 14:10 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Thu, Jan 21, 2016 at 6:51 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 01/21/2016 02:20 AM, Peter Zijlstra wrote:
>> On Thu, Jan 21, 2016 at 11:06:45AM +0100, Dmitry Vyukov wrote:
>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>
>>>>> So in any recent code that I look at this function tries to acquire
>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>
>>>> Yes.
>>>>
>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>
>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>
>>>> It's not.
>>>
>>> I've got a new report on commit
>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>> Here is unprocessed version:
>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>> and here is processed one:
>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>
>>> Peter, what exactly is wrong with the post-processed version? I would
>>> be interested in fixing the processing script.
>>>
>>> As far as I can see, it contains the same stacks, just with line numbers and
>>> inlined frames added. I am using a significantly different compilation mode
>>> (kasan + kcov + a very recent gcc), so nobody except me would be able to
>>> figure out line numbers based on the offsets.
>>
>> I'm not sure anything is wrong with the post-processing. My confusion
>> (and Jiri's) was that the stack trace lists
>> tty_get_pgrp()->_raw_spin_lock_irqsave(), but that function acquires
>> tty->ctrl_lock, not buf->lock as the report claims.
>
> I think this is the other way around; that lockdep has graphed a circular
> dependency chain, but that something is wrong with the stack traces.
>
>> PeterH says that lockdep omits functions tagged with __lockfunc, but my
>> reading disagrees with that.
>>
>> kernel/locking/lockdep.c:save_trace()
>>   save_stack_trace()
>>     dump_trace(.ops = &save_stack_ops)
>>
>> which has: .address = save_stack_address
>>   __save_stack_address(.nosched = false)
>>
>> So lockdep should very much include lock functions.
>
> Wild hypothesis on my part.
>
>> But this confusion is part of the original report, not because of the
>> post-processing.
>
> Yeah, agreed. The original report is very odd.






Peter H, I have not yet applied your patch, but I got a slightly
different-looking report (n_tracesink_datadrain wasn't mentioned
before):

[ 1483.472724] ======================================================
[ 1483.473033] [ INFO: possible circular locking dependency detected ]
[ 1483.473033] 4.4.0+ #275 Not tainted
[ 1483.473033] -------------------------------------------------------
[ 1483.473033] kworker/u9:0/10808 is trying to acquire lock:
[ 1483.473033]  (writelock){+.+...}, at: [<ffffffff82d1ee14>]
n_tracesink_datadrain+0x24/0xc0
[ 1483.473033]
[ 1483.473033] but task is already holding lock:
[ 1483.473033]  (routelock){+.+...}, at: [<ffffffff82d1e9b0>]
n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033]
[ 1483.473033] which lock already depends on the new lock.
[ 1483.473033]
[ 1483.473033]
[ 1483.473033] the existing dependency chain (in reverse order) is:
[ 1483.473033]
-> #4 (routelock){+.+...}:
[ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033]        [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033]        [<ffffffff82d1e9b0>] n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033]        [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
[ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
-> #3 (&buf->lock){+.+...}:
[ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033]        [<ffffffff8633255f>] _raw_spin_lock_irqsave+0x9f/0xd0
[ 1483.473033]        [<ffffffff82ce4210>] tty_get_pgrp+0x20/0x80
[ 1483.473033]        [<ffffffff82cf28da>] __isig+0x1a/0x50
[ 1483.473033]        [<ffffffff82cf49ae>] isig+0xae/0x2c0
[ 1483.473033]        [<ffffffff82cf7762>] n_tty_receive_signal_char+0x22/0xf0
[ 1483.473033]        [<ffffffff82cfb332>]
n_tty_receive_char_special+0x10d2/0x2b30
[ 1483.473033]        [<ffffffff82cfe733>]
n_tty_receive_buf_common+0x19a3/0x2400
[ 1483.473033]        [<ffffffff82cff1c3>] n_tty_receive_buf2+0x33/0x40
[ 1483.473033]        [<ffffffff82d062bf>] flush_to_ldisc+0x3bf/0x7f0
[ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
-> #2 (&o_tty->termios_rwsem/1){++++..}:
[ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033]        [<ffffffff8632e337>] down_read+0x47/0x60
[ 1483.473033]        [<ffffffff82cfce1d>] n_tty_receive_buf_common+0x8d/0x2400
[ 1483.473033]        [<ffffffff82cff1c3>] n_tty_receive_buf2+0x33/0x40
[ 1483.473033]        [<ffffffff82d062bf>] flush_to_ldisc+0x3bf/0x7f0
[ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
-> #1 (&port->buf.lock/1){+.+...}:
[ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033]        [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033]        [<ffffffff82d1ef53>] n_tracesink_open+0x23/0xf0
[ 1483.473033]        [<ffffffff82d03978>] tty_ldisc_open.isra.2+0x78/0xd0
[ 1483.473033]        [<ffffffff82d03f62>] tty_set_ldisc+0x292/0x8a0
[ 1483.473033]        [<ffffffff82cecbbe>] tty_ioctl+0xb2e/0x2160
[ 1483.473033]        [<ffffffff817ebfbc>] do_vfs_ioctl+0x18c/0xfb0
[ 1483.473033]        [<ffffffff817ece6f>] SyS_ioctl+0x8f/0xc0
[ 1483.473033]        [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 1483.473033]
-> #0 (writelock){+.+...}:
[ 1483.473033]        [<ffffffff8145692b>] __lock_acquire+0x31eb/0x4700
[ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033]        [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033]        [<ffffffff82d1ee14>] n_tracesink_datadrain+0x24/0xc0
[ 1483.473033]        [<ffffffff82d1e9bb>] n_tracerouter_receivebuf+0x2b/0x40
[ 1483.473033]        [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
[ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]
[ 1483.473033] other info that might help us debug this:
[ 1483.473033]
[ 1483.473033] Chain exists of:
  writelock --> &buf->lock --> routelock

[ 1483.473033]  Possible unsafe locking scenario:
[ 1483.473033]
[ 1483.473033]        CPU0                    CPU1
[ 1483.473033]        ----                    ----
[ 1483.473033]   lock(routelock);
[ 1483.473033]                                lock(&buf->lock);
[ 1483.473033]                                lock(routelock);
[ 1483.473033]   lock(writelock);
[ 1483.473033]
[ 1483.473033]  *** DEADLOCK ***
[ 1483.473033]
[ 1483.473033] 5 locks held by kworker/u9:0/10808:
[ 1483.473033]  #0:  ("events_unbound"){.+.+.+}, at:
[<ffffffff8139fc3a>] process_one_work+0x69a/0x1440
[ 1483.473033]  #1:  ((&buf->work)){+.+...}, at: [<ffffffff8139fc6a>]
process_one_work+0x6ca/0x1440
[ 1483.473033]  #2:  (&tty->ldisc_sem){++++++}, at:
[<ffffffff82d0339b>] tty_ldisc_ref+0x1b/0x80
[ 1483.473033]  #3:  (&buf->lock){+.+...}, at: [<ffffffff82d05fe1>]
flush_to_ldisc+0xe1/0x7f0
[ 1483.473033]  #4:  (routelock){+.+...}, at: [<ffffffff82d1e9b0>]
n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033]
[ 1483.473033] stack backtrace:
[ 1483.473033] CPU: 1 PID: 10808 Comm: kworker/u9:0 Not tainted 4.4.0+ #275
[ 1483.473033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[ 1483.473033] Workqueue: events_unbound flush_to_ldisc
[ 1483.473033]  00000000ffffffff ffff8800334ff7a0 ffffffff82994c8d
ffffffff88faf110
[ 1483.473033]  ffffffff88fdabb0 ffffffff88ff1ef0 ffff8800334ff7f0
ffffffff8144fb58
[ 1483.473033]  ffff88003a290000 ffff88003a2908ea 0000000000000000
ffff88003a2908c8
[ 1483.473033] Call Trace:
[ 1483.473033]  [<ffffffff82994c8d>] dump_stack+0x6f/0xa2
[ 1483.473033]  [<ffffffff8144fb58>] print_circular_bug+0x288/0x340
[ 1483.473033]  [<ffffffff8145692b>] __lock_acquire+0x31eb/0x4700
[ 1483.473033]  [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 1483.473033]  [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 1483.473033]  [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
[ 1483.473033]  [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
[ 1483.473033]  [<ffffffff82d1ee14>] ? n_tracesink_datadrain+0x24/0xc0
[ 1483.473033]  [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
[ 1483.473033]  [<ffffffff82d1ee14>] ? n_tracesink_datadrain+0x24/0xc0
[ 1483.473033]  [<ffffffff81452edd>] ? trace_hardirqs_on+0xd/0x10
[ 1483.473033]  [<ffffffff82d1e9b0>] ? n_tracerouter_receivebuf+0x20/0x40
[ 1483.473033]  [<ffffffff81452edd>] ? trace_hardirqs_on+0xd/0x10
[ 1483.473033]  [<ffffffff86328e40>] ?
mutex_lock_interruptible_nested+0xbe0/0xbe0
[ 1483.473033]  [<ffffffff86328e40>] ?
mutex_lock_interruptible_nested+0xbe0/0xbe0
[ 1483.473033]  [<ffffffff86328e40>] ?
mutex_lock_interruptible_nested+0xbe0/0xbe0
[ 1483.473033]  [<ffffffff82d1ee14>] n_tracesink_datadrain+0x24/0xc0
[ 1483.473033]  [<ffffffff82d1e990>] ? n_tracerouter_write+0x20/0x20
[ 1483.473033]  [<ffffffff82d1e9bb>] n_tracerouter_receivebuf+0x2b/0x40
[ 1483.473033]  [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
[ 1483.473033]  [<ffffffff82d05f00>] ? tty_buffer_unlock_exclusive+0x120/0x120
[ 1483.473033]  [<ffffffff8139fd36>] process_one_work+0x796/0x1440
[ 1483.473033]  [<ffffffff8139fc6a>] ? process_one_work+0x6ca/0x1440
[ 1483.473033]  [<ffffffff8139f5a0>] ? pwq_dec_nr_in_flight+0x2e0/0x2e0
[ 1483.473033]  [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
[ 1483.473033]  [<ffffffff813b424f>] kthread+0x23f/0x2d0
[ 1483.473033]  [<ffffffff813a09e0>] ? process_one_work+0x1440/0x1440
[ 1483.473033]  [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
[ 1483.473033]  [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
[ 1483.473033]  [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
[ 1483.473033]  [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-22 14:10           ` Dmitry Vyukov
@ 2016-01-25 16:56             ` Peter Hurley
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Hurley @ 2016-01-25 16:56 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On 01/22/2016 06:10 AM, Dmitry Vyukov wrote:
> On Thu, Jan 21, 2016 at 6:51 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 01/21/2016 02:20 AM, Peter Zijlstra wrote:
>>> On Thu, Jan 21, 2016 at 11:06:45AM +0100, Dmitry Vyukov wrote:
>>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>>
>>>>>> So in any recent code that I look at this function tries to acquire
>>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>>
>>>>> Yes.
>>>>>
>>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>>
>>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>>
>>>>> It's not.
>>>>
>>>> I've got a new report on commit
>>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>>> Here is unprocessed version:
>>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>>> and here is processed one:
>>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>>
>>>> Peter, what exactly is wrong with the post-processed version? I would
>>>> be interested in fixing the processing script.
>>>>
>>>> As far as I can see, it contains the same stacks, just with line numbers and
>>>> inlined frames added. I am using a significantly different compilation mode
>>>> (kasan + kcov + a very recent gcc), so nobody except me would be able to
>>>> figure out line numbers based on the offsets.
>>>
>>> I'm not sure anything is wrong with the post-processing. My confusion
>>> (and Jiri's) was that the stack trace lists
>>> tty_get_pgrp()->_raw_spin_lock_irqsave(), but that function acquires
>>> tty->ctrl_lock, not buf->lock as the report claims.
>>
>> I think this is the other way around; that lockdep has graphed a circular
>> dependency chain, but that something is wrong with the stack traces.
>>
>>> PeterH says that lockdep omits functions tagged with __lockfunc, but my
>>> reading disagrees with that.
>>>
>>> kernel/locking/lockdep.c:save_trace()
>>>   save_stack_trace()
>>>     dump_trace(.ops = &save_stack_ops)
>>>
>>> which has: .address = save_stack_address
>>>   __save_stack_address(.nosched = false)
>>>
>>> So lockdep should very much include lock functions.
>>
>> Wild hypothesis on my part.
>>
>>> But this confusion is part of the original report, not because of the
>>> post-processing.
>>
>> Yeah, agreed. The original report is very odd.
> 
> 
> 
> 
> 
> 
> Peter H, I have not yet applied your patch, but I got a slightly
> different-looking report (n_tracesink_datadrain wasn't mentioned
> before):

May or may not be related. This report suffers from the same
deficiency as the previous one: the saved stack trace doesn't match
the lock dependency.

Without understanding why the stack trace doesn't match the lock
dependency, any solution is just conjecture.

Regards,
Peter Hurley


> [ 1483.472724] ======================================================
> [ 1483.473033] [ INFO: possible circular locking dependency detected ]
> [ 1483.473033] 4.4.0+ #275 Not tainted
> [ 1483.473033] -------------------------------------------------------
> [ 1483.473033] kworker/u9:0/10808 is trying to acquire lock:
> [ 1483.473033]  (writelock){+.+...}, at: [<ffffffff82d1ee14>]
> n_tracesink_datadrain+0x24/0xc0
> [ 1483.473033]
> [ 1483.473033] but task is already holding lock:
> [ 1483.473033]  (routelock){+.+...}, at: [<ffffffff82d1e9b0>]
> n_tracerouter_receivebuf+0x20/0x40
> [ 1483.473033]
> [ 1483.473033] which lock already depends on the new lock.
> [ 1483.473033]
> [ 1483.473033]
> [ 1483.473033] the existing dependency chain (in reverse order) is:
> [ 1483.473033]
> -> #4 (routelock){+.+...}:
> [ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
> [ 1483.473033]        [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
> [ 1483.473033]        [<ffffffff82d1e9b0>] n_tracerouter_receivebuf+0x20/0x40
> [ 1483.473033]        [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
> [ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
> [ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
> [ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
> [ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
> [ 1483.473033]
> -> #3 (&buf->lock){+.+...}:
> [ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
> [ 1483.473033]        [<ffffffff8633255f>] _raw_spin_lock_irqsave+0x9f/0xd0
> [ 1483.473033]        [<ffffffff82ce4210>] tty_get_pgrp+0x20/0x80
> [ 1483.473033]        [<ffffffff82cf28da>] __isig+0x1a/0x50
> [ 1483.473033]        [<ffffffff82cf49ae>] isig+0xae/0x2c0
> [ 1483.473033]        [<ffffffff82cf7762>] n_tty_receive_signal_char+0x22/0xf0
> [ 1483.473033]        [<ffffffff82cfb332>]
> n_tty_receive_char_special+0x10d2/0x2b30
> [ 1483.473033]        [<ffffffff82cfe733>]
> n_tty_receive_buf_common+0x19a3/0x2400
> [ 1483.473033]        [<ffffffff82cff1c3>] n_tty_receive_buf2+0x33/0x40
> [ 1483.473033]        [<ffffffff82d062bf>] flush_to_ldisc+0x3bf/0x7f0
> [ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
> [ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
> [ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
> [ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
> [ 1483.473033]
> -> #2 (&o_tty->termios_rwsem/1){++++..}:
> [ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
> [ 1483.473033]        [<ffffffff8632e337>] down_read+0x47/0x60
> [ 1483.473033]        [<ffffffff82cfce1d>] n_tty_receive_buf_common+0x8d/0x2400
> [ 1483.473033]        [<ffffffff82cff1c3>] n_tty_receive_buf2+0x33/0x40
> [ 1483.473033]        [<ffffffff82d062bf>] flush_to_ldisc+0x3bf/0x7f0
> [ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
> [ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
> [ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
> [ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
> [ 1483.473033]
> -> #1 (&port->buf.lock/1){+.+...}:
> [ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
> [ 1483.473033]        [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
> [ 1483.473033]        [<ffffffff82d1ef53>] n_tracesink_open+0x23/0xf0
> [ 1483.473033]        [<ffffffff82d03978>] tty_ldisc_open.isra.2+0x78/0xd0
> [ 1483.473033]        [<ffffffff82d03f62>] tty_set_ldisc+0x292/0x8a0
> [ 1483.473033]        [<ffffffff82cecbbe>] tty_ioctl+0xb2e/0x2160
> [ 1483.473033]        [<ffffffff817ebfbc>] do_vfs_ioctl+0x18c/0xfb0
> [ 1483.473033]        [<ffffffff817ece6f>] SyS_ioctl+0x8f/0xc0
> [ 1483.473033]        [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 1483.473033]
> -> #0 (writelock){+.+...}:
> [ 1483.473033]        [<ffffffff8145692b>] __lock_acquire+0x31eb/0x4700
> [ 1483.473033]        [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
> [ 1483.473033]        [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
> [ 1483.473033]        [<ffffffff82d1ee14>] n_tracesink_datadrain+0x24/0xc0
> [ 1483.473033]        [<ffffffff82d1e9bb>] n_tracerouter_receivebuf+0x2b/0x40
> [ 1483.473033]        [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
> [ 1483.473033]        [<ffffffff8139fd36>] process_one_work+0x796/0x1440
> [ 1483.473033]        [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
> [ 1483.473033]        [<ffffffff813b424f>] kthread+0x23f/0x2d0
> [ 1483.473033]        [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
> [ 1483.473033]
> [ 1483.473033] other info that might help us debug this:
> [ 1483.473033]
> [ 1483.473033] Chain exists of:
>   writelock --> &buf->lock --> routelock
> 
> [ 1483.473033]  Possible unsafe locking scenario:
> [ 1483.473033]
> [ 1483.473033]        CPU0                    CPU1
> [ 1483.473033]        ----                    ----
> [ 1483.473033]   lock(routelock);
> [ 1483.473033]                                lock(&buf->lock);
> [ 1483.473033]                                lock(routelock);
> [ 1483.473033]   lock(writelock);
> [ 1483.473033]
> [ 1483.473033]  *** DEADLOCK ***
> [ 1483.473033]
> [ 1483.473033] 5 locks held by kworker/u9:0/10808:
> [ 1483.473033]  #0:  ("events_unbound"){.+.+.+}, at:
> [<ffffffff8139fc3a>] process_one_work+0x69a/0x1440
> [ 1483.473033]  #1:  ((&buf->work)){+.+...}, at: [<ffffffff8139fc6a>]
> process_one_work+0x6ca/0x1440
> [ 1483.473033]  #2:  (&tty->ldisc_sem){++++++}, at:
> [<ffffffff82d0339b>] tty_ldisc_ref+0x1b/0x80
> [ 1483.473033]  #3:  (&buf->lock){+.+...}, at: [<ffffffff82d05fe1>]
> flush_to_ldisc+0xe1/0x7f0
> [ 1483.473033]  #4:  (routelock){+.+...}, at: [<ffffffff82d1e9b0>]
> n_tracerouter_receivebuf+0x20/0x40
> [ 1483.473033]
> [ 1483.473033] stack backtrace:
> [ 1483.473033] CPU: 1 PID: 10808 Comm: kworker/u9:0 Not tainted 4.4.0+ #275
> [ 1483.473033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS Bochs 01/01/2011
> [ 1483.473033] Workqueue: events_unbound flush_to_ldisc
> [ 1483.473033]  00000000ffffffff ffff8800334ff7a0 ffffffff82994c8d
> ffffffff88faf110
> [ 1483.473033]  ffffffff88fdabb0 ffffffff88ff1ef0 ffff8800334ff7f0
> ffffffff8144fb58
> [ 1483.473033]  ffff88003a290000 ffff88003a2908ea 0000000000000000
> ffff88003a2908c8
> [ 1483.473033] Call Trace:
> [ 1483.473033]  [<ffffffff82994c8d>] dump_stack+0x6f/0xa2
> [ 1483.473033]  [<ffffffff8144fb58>] print_circular_bug+0x288/0x340
> [ 1483.473033]  [<ffffffff8145692b>] __lock_acquire+0x31eb/0x4700
> [ 1483.473033]  [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
> [ 1483.473033]  [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
> [ 1483.473033]  [<ffffffff81453740>] ? debug_check_no_locks_freed+0x3c0/0x3c0
> [ 1483.473033]  [<ffffffff8145a28c>] lock_acquire+0x1dc/0x430
> [ 1483.473033]  [<ffffffff82d1ee14>] ? n_tracesink_datadrain+0x24/0xc0
> [ 1483.473033]  [<ffffffff86328ef1>] mutex_lock_nested+0xb1/0xa50
> [ 1483.473033]  [<ffffffff82d1ee14>] ? n_tracesink_datadrain+0x24/0xc0
> [ 1483.473033]  [<ffffffff81452edd>] ? trace_hardirqs_on+0xd/0x10
> [ 1483.473033]  [<ffffffff82d1e9b0>] ? n_tracerouter_receivebuf+0x20/0x40
> [ 1483.473033]  [<ffffffff81452edd>] ? trace_hardirqs_on+0xd/0x10
> [ 1483.473033]  [<ffffffff86328e40>] ?
> mutex_lock_interruptible_nested+0xbe0/0xbe0
> [ 1483.473033]  [<ffffffff86328e40>] ?
> mutex_lock_interruptible_nested+0xbe0/0xbe0
> [ 1483.473033]  [<ffffffff86328e40>] ?
> mutex_lock_interruptible_nested+0xbe0/0xbe0
> [ 1483.473033]  [<ffffffff82d1ee14>] n_tracesink_datadrain+0x24/0xc0
> [ 1483.473033]  [<ffffffff82d1e990>] ? n_tracerouter_write+0x20/0x20
> [ 1483.473033]  [<ffffffff82d1e9bb>] n_tracerouter_receivebuf+0x2b/0x40
> [ 1483.473033]  [<ffffffff82d06484>] flush_to_ldisc+0x584/0x7f0
> [ 1483.473033]  [<ffffffff82d05f00>] ? tty_buffer_unlock_exclusive+0x120/0x120
> [ 1483.473033]  [<ffffffff8139fd36>] process_one_work+0x796/0x1440
> [ 1483.473033]  [<ffffffff8139fc6a>] ? process_one_work+0x6ca/0x1440
> [ 1483.473033]  [<ffffffff8139f5a0>] ? pwq_dec_nr_in_flight+0x2e0/0x2e0
> [ 1483.473033]  [<ffffffff813a0abb>] worker_thread+0xdb/0xfc0
> [ 1483.473033]  [<ffffffff813b424f>] kthread+0x23f/0x2d0
> [ 1483.473033]  [<ffffffff813a09e0>] ? process_one_work+0x1440/0x1440
> [ 1483.473033]  [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
> [ 1483.473033]  [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
> [ 1483.473033]  [<ffffffff86332f6f>] ret_from_fork+0x3f/0x70
> [ 1483.473033]  [<ffffffff813b4010>] ? kthread_create_on_node+0x3b0/0x3b0
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-01-21 17:43       ` Peter Hurley
@ 2016-02-03  4:24         ` Peter Hurley
  2016-02-03 17:32           ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Hurley @ 2016-02-03  4:24 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

Hi Dmitry,

On 01/21/2016 09:43 AM, Peter Hurley wrote:
> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>
>>>> So in any recent code that I look at this function tries to acquire
>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>
>>> Yes.
>>>
>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>
>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>
>>> It's not.
>>
>> I've got a new report on commit
>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>> Here is unprocessed version:
>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>> and here is processed one:
>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>
>> Peter, what exactly is wrong with the post-processed version?
> 
> Yeah, ok, I assumed the problem with this report was post-processing
> because of the other report that had mixed-up info.
> 
> However, the #3 stacktrace is obviously wrong, as others have already noted.
> Plus, the #1 stacktrace is wrong as well.
> 
>> I would be interested in fixing the processing script.
> 
> Not that it's related (since the original, not-edited report has bogus
> stacktraces), but how are you doing debug symbol lookup?
> 
> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
> 
>        [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> 
> 
>> As far as I see it contains the same stacks just with line numbers and
>> inlined frames.
> 
> Agree, now that I see the original report.
> 
>> I am using a significantly different compilation mode
>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>> figure out line numbers based on offsets.
> 
> Weird. Maybe something to do with the compiler.
> 
> Can you get me the dmesg output running the patch below?

Wondering if this is still the priority it was not so long ago?
If not, that's fine and I'll drop this from my follow-up list.



> Please first just get me the output shortly after boot (please do a
> ctrl+C in a terminal window to trigger the expected stacktrace
> corresponding to #3 stacktrace from your report).
> 
> Then please run your test that generates the circular lockdep report.
> 
> --- >% ---
> Subject: [PATCH] debug: dump stacktraces for tty lock dependencies
> 
> ---
>  kernel/locking/lockdep.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 60ace56..b67cafb 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
>   * Debugging switches:
>   */
>  
> -#define VERBOSE			0
> +#define VERBOSE			1
>  #define VERY_VERBOSE		0
>  
>  #if VERBOSE
> @@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
>   */
>  static int class_filter(struct lock_class *class)
>  {
> -#if 0
> -	/* Example */
> +#if 1
>  	if (class->name_version == 1 &&
> -			!strcmp(class->name, "lockname"))
> +			!strcmp(class->name, "&buf->lock"))
>  		return 1;
>  	if (class->name_version == 1 &&
> -			!strcmp(class->name, "&struct->lockfield"))
> +			!strcmp(class->name, "&port->buf.lock/1"))
> +		return 1;
> +	if (class->name_version == 1 &&
> +			!strcmp(class->name, "routelock"))
>  		return 1;
>  #endif
>  	/* Filter everything else. 1 would be to allow everything else */
> 


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-03  4:24         ` Peter Hurley
@ 2016-02-03 17:32           ` Dmitry Vyukov
  2016-02-03 19:09             ` Peter Hurley
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-02-03 17:32 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Wed, Feb 3, 2016 at 5:24 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
> Hi Dmitry,
>
> On 01/21/2016 09:43 AM, Peter Hurley wrote:
>> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>
>>>>> So in any recent code that I look at this function tries to acquire
>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>
>>>> Yes.
>>>>
>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>
>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>
>>>> It's not.
>>>
>>> I've got a new report on commit
>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>> Here is unprocessed version:
>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>> and here is processed one:
>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>
>>> Peter, what exactly is wrong with the post-processed version?
>>
>> Yeah, ok, I assumed the problem with this report was post-processing
>> because of the other report that had mixed-up info.
>>
>> However, the #3 stacktrace is obviously wrong, as others have already noted.
>> Plus, the #1 stacktrace is wrong as well.
>>
>>> I would be interested in fixing the processing script.
>>
>> Not that it's related (since the original, not-edited report has bogus
>> stacktraces), but how are you doing debug symbol lookup?
>>
>> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
>>
>>        [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>
>>
>>> As far as I see it contains the same stacks just with line numbers and
>>> inlined frames.
>>
>> Agree, now that I see the original report.
>>
>>> I am using a significantly different compilation mode
>>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>>> figure out line numbers based on offsets.
>>
>> Weird. Maybe something to do with the compiler.
>>
>> Can you get me the dmesg output running the patch below?
>
> Wondering if this is still the priority it was not so long ago?
> If not, that's fine and I'll drop this from my followup list.


Yes, it is still the priority for me.
I've tried to apply your debugging patch, but I noticed that it prints
dependency stacks as it discovers them. But in my setup I don't have
all of the output from machine start (there is just too much of it). And I
don't have a localized reproducer for this.

I will try again.

Do you want me to debug with your "tty: Fix lock inversion in
N_TRACEROUTER" patch applied or not (I still see slightly different
deadlock reports with it)?


>> Please first just get me the output shortly after boot (please do a
>> ctrl+C in a terminal window to trigger the expected stacktrace
>> corresponding to #3 stacktrace from your report).
>>
>> Then please run your test that generates the circular lockdep report.
>>
>> --- >% ---
>> Subject: [PATCH] debug: dump stacktraces for tty lock dependencies
>>
>> ---
>>  kernel/locking/lockdep.c | 12 +++++++-----
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>> index 60ace56..b67cafb 100644
>> --- a/kernel/locking/lockdep.c
>> +++ b/kernel/locking/lockdep.c
>> @@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
>>   * Debugging switches:
>>   */
>>
>> -#define VERBOSE                      0
>> +#define VERBOSE                      1
>>  #define VERY_VERBOSE         0
>>
>>  #if VERBOSE
>> @@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
>>   */
>>  static int class_filter(struct lock_class *class)
>>  {
>> -#if 0
>> -     /* Example */
>> +#if 1
>>       if (class->name_version == 1 &&
>> -                     !strcmp(class->name, "lockname"))
>> +                     !strcmp(class->name, "&buf->lock"))
>>               return 1;
>>       if (class->name_version == 1 &&
>> -                     !strcmp(class->name, "&struct->lockfield"))
>> +                     !strcmp(class->name, "&port->buf.lock/1"))
>> +             return 1;
>> +     if (class->name_version == 1 &&
>> +                     !strcmp(class->name, "routelock"))
>>               return 1;
>>  #endif
>>       /* Filter everything else. 1 would be to allow everything else */
>>
>


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-03 17:32           ` Dmitry Vyukov
@ 2016-02-03 19:09             ` Peter Hurley
  2016-02-04 12:39               ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Hurley @ 2016-02-03 19:09 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On 02/03/2016 09:32 AM, Dmitry Vyukov wrote:
> On Wed, Feb 3, 2016 at 5:24 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> Hi Dmitry,
>>
>> On 01/21/2016 09:43 AM, Peter Hurley wrote:
>>> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>>
>>>>>> So in any recent code that I look at this function tries to acquire
>>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>>
>>>>> Yes.
>>>>>
>>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>>
>>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>>
>>>>> It's not.
>>>>
>>>> I've got a new report on commit
>>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>>> Here is unprocessed version:
>>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>>> and here is processed one:
>>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>>
>>>> Peter, what exactly is wrong with the post-processed version?
>>>
>>> Yeah, ok, I assumed the problem with this report was post-processing
>>> because of the other report that had mixed-up info.
>>>
>>> However, the #3 stacktrace is obviously wrong, as others have already noted.
>>> Plus, the #1 stacktrace is wrong as well.
>>>
>>>> I would be interested in fixing the processing script.
>>>
>>> Not that it's related (since the original, not-edited report has bogus
>>> stacktraces), but how are you doing debug symbol lookup?
>>>
>>> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
>>>
>>>        [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>>
>>>
>>>> As far as I see it contains the same stacks just with line numbers and
>>>> inlined frames.
>>>
>>> Agree, now that I see the original report.
>>>
>>>> I am using a significantly different compilation mode
>>>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>>>> figure out line numbers based on offsets.
>>>
>>> Weird. Maybe something to do with the compiler.
>>>
>>> Can you get me the dmesg output running the patch below?
>>
>> Wondering if this is still the priority it was not so long ago?
>> If not, that's fine and I'll drop this from my followup list.
> 
> 
> Yes, it is still the priority for me.
> I've tried to apply your debugging patch, but I noticed that it prints
> dependencies stacks as it discovers them.

Yeah, that's the point; I need to understand why lockdep doesn't
store the correct stack trace at dependency discovery.

Since the correct stack trace will be printed instead, it will help
debug the lockdep problem.

Hopefully, once the problem with the bad stacktraces is fixed, the
actual circular lock dependencies will be clear.

> But in my setup I don't have
> all output from machine start (there is just too many of it).

Kernel parameter:

	log_buf_len=1G


> And I don't have a localized reproducer for this.

I really just need the lockdep dependency stacks generated during boot,
and the ctrl+C in a terminal window to trigger one of the dependency
stacks.

> I will try again.

Ok.

> Do you want me to debug with your "tty: Fix lock inversion in
> N_TRACEROUTER" patch applied or not (I still see slightly different
> deadlock reports with it)?

Not.

I think that probably does fix at least one circular dependency, but
I want to figure out the bad stack trace problem first.

There's probably another circular dependency there, as indicated by
your other report.

Regards,
Peter Hurley


>>> Please first just get me the output shortly after boot (please do a
>>> ctrl+C in a terminal window to trigger the expected stacktrace
>>> corresponding to #3 stacktrace from your report).
>>>
>>> Then please run your test that generates the circular lockdep report.
>>>
>>> --- >% ---
>>> Subject: [PATCH] debug: dump stacktraces for tty lock dependencies
>>>
>>> ---
>>>  kernel/locking/lockdep.c | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>>> index 60ace56..b67cafb 100644
>>> --- a/kernel/locking/lockdep.c
>>> +++ b/kernel/locking/lockdep.c
>>> @@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
>>>   * Debugging switches:
>>>   */
>>>
>>> -#define VERBOSE                      0
>>> +#define VERBOSE                      1
>>>  #define VERY_VERBOSE         0
>>>
>>>  #if VERBOSE
>>> @@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
>>>   */
>>>  static int class_filter(struct lock_class *class)
>>>  {
>>> -#if 0
>>> -     /* Example */
>>> +#if 1
>>>       if (class->name_version == 1 &&
>>> -                     !strcmp(class->name, "lockname"))
>>> +                     !strcmp(class->name, "&buf->lock"))
>>>               return 1;
>>>       if (class->name_version == 1 &&
>>> -                     !strcmp(class->name, "&struct->lockfield"))
>>> +                     !strcmp(class->name, "&port->buf.lock/1"))
>>> +             return 1;
>>> +     if (class->name_version == 1 &&
>>> +                     !strcmp(class->name, "routelock"))
>>>               return 1;
>>>  #endif
>>>       /* Filter everything else. 1 would be to allow everything else */
>>>
>>


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-03 19:09             ` Peter Hurley
@ 2016-02-04 12:39               ` Dmitry Vyukov
  2016-02-04 13:17                 ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-02-04 12:39 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Wed, Feb 3, 2016 at 8:09 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 02/03/2016 09:32 AM, Dmitry Vyukov wrote:
>> On Wed, Feb 3, 2016 at 5:24 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>> Hi Dmitry,
>>>
>>> On 01/21/2016 09:43 AM, Peter Hurley wrote:
>>>> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>>>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>>>
>>>>>>> So in any recent code that I look at this function tries to acquire
>>>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>>>
>>>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>>>
>>>>>> It's not.
>>>>>
>>>>> I've got a new report on commit
>>>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>>>> Here is unprocessed version:
>>>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>>>> and here is processed one:
>>>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>>>
>>>>> Peter, what exactly is wrong with the post-processed version?
>>>>
>>>> Yeah, ok, I assumed the problem with this report was post-processing
>>>> because of the other report that had mixed-up info.
>>>>
>>>> However, the #3 stacktrace is obviously wrong, as others have already noted.
>>>> Plus, the #1 stacktrace is wrong as well.
>>>>
>>>>> I would be interested in fixing the processing script.
>>>>
>>>> Not that it's related (since the original, not-edited report has bogus
>>>> stacktraces), but how are you doing debug symbol lookup?
>>>>
>>>> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
>>>>
>>>>        [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>>>
>>>>
>>>>> As far as I see it contains the same stacks just with line numbers and
>>>>> inlined frames.
>>>>
>>>> Agree, now that I see the original report.
>>>>
>>>>> I am using a significantly different compilation mode
>>>>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>>>>> figure out line numbers based on offsets.
>>>>
>>>> Weird. Maybe something to do with the compiler.
>>>>
>>>> Can you get me the dmesg output running the patch below?
>>>
>>> Wondering if this is still the priority it was not so long ago?
>>> If not, that's fine and I'll drop this from my followup list.
>>
>>
>> Yes, it is still the priority for me.
>> I've tried to apply your debugging patch, but I noticed that it prints
>> dependencies stacks as it discovers them.
>
> Yeah, that's the point; I need to understand why lockdep doesn't
> store the correct stack trace at dependency discovery.
>
> Since the correct stack trace will be printed instead, it will help
> debug the lockdep problem.
>
> Hopefully, once the problem with the bad stacktraces are fixed, the
> actual circular lock dependencies will be clear.
>
>> But in my setup I don't have
>> all output from machine start (there is just too many of it).
>
> Kernel parameter:
>
>         log_buf_len=1G
>
>
>> And I don't have a localized reproducer for this.
>
> I really just need the lockdep dependency stacks generated during boot,
> and the ctrl+C in a terminal window to trigger one of the dependency
> stacks.
>
>> I will try again.
>
> Ok.
>
>> Do you want me to debug with your "tty: Fix lock inversion in
>> N_TRACEROUTER" patch applied or not (I still see slightly different
>> deadlock reports with it)?
>
> Not.
>
> I think that probably does fix at least one circular dependency, but
> I want to figure out the bad stack trace problem first.
>
> There's probably another circular dependency there, as indicated by
> your other report.


Here is debug output:
https://gist.githubusercontent.com/dvyukov/b18181c849fdd3d51c80/raw/e91ead683fec020f64eed6750aa9f6347d43b9f9/gistfile1.txt

In particular the ctrl+C dependency is:

 new dependency:  (&o_tty->termios_rwsem/1){++++..} =>  (&buf->lock){+.+...}
[  216.817400] Call Trace:
[  216.817400]  [<ffffffff82be450d>] dump_stack+0x6f/0xa2
[  216.817400]  [<ffffffff8145b149>] __lock_acquire+0x4859/0x5710
[  216.817400]  [<ffffffff8145e61c>] lock_acquire+0x1dc/0x430
[  216.817400]  [<ffffffff86656871>] mutex_lock_nested+0xb1/0xa50
[  216.817400]  [<ffffffff82f9f08f>] tty_buffer_flush+0xbf/0x3c0
[  216.817400]  [<ffffffff82fa330c>] pty_flush_buffer+0x5c/0x180
[  216.817400]  [<ffffffff82f97a05>] tty_driver_flush_buffer+0x65/0x80
[  216.817400]  [<ffffffff82f8d162>] isig+0x172/0x2c0
[  216.817400]  [<ffffffff82f8fe52>] n_tty_receive_signal_char+0x22/0xf0
[  216.817400]  [<ffffffff82f93a4e>] n_tty_receive_char_special+0x126e/0x2b30
[  216.817400]  [<ffffffff82f96cb3>] n_tty_receive_buf_common+0x19a3/0x2400
[  216.817400]  [<ffffffff82f97743>] n_tty_receive_buf2+0x33/0x40
[  216.817400]  [<ffffffff82f9e83f>] flush_to_ldisc+0x3bf/0x7f0
[  216.817400]  [<ffffffff813a3eb6>] process_one_work+0x796/0x1440
[  216.817400]  [<ffffffff813a4c3b>] worker_thread+0xdb/0xfc0
[  216.817400]  [<ffffffff813b7edf>] kthread+0x23f/0x2d0
[  216.817400]  [<ffffffff866608ef>] ret_from_fork+0x3f/0x70

While in report it still looks as:

-> #3 (&buf->lock){+.+...}:
[ 1544.187872]        [<ffffffff8145e61c>] lock_acquire+0x1dc/0x430
[ 1544.187872]        [<ffffffff8665fecf>] _raw_spin_lock_irqsave+0x9f/0xd0
[ 1544.187872]        [<ffffffff82f7c810>] tty_get_pgrp+0x20/0x80
[ 1544.187872]        [<ffffffff82f8afca>] __isig+0x1a/0x50
[ 1544.187872]        [<ffffffff82f8d09e>] isig+0xae/0x2c0
[ 1544.187872]        [<ffffffff82f8fe52>] n_tty_receive_signal_char+0x22/0xf0
[ 1544.187872]        [<ffffffff82f93a6d>]
n_tty_receive_char_special+0x128d/0x2b30
[ 1544.187872]        [<ffffffff82f96cb3>]
n_tty_receive_buf_common+0x19a3/0x2400
[ 1544.187872]        [<ffffffff82f97743>] n_tty_receive_buf2+0x33/0x40
[ 1544.187872]        [<ffffffff82f9e83f>] flush_to_ldisc+0x3bf/0x7f0
[ 1544.187872]        [<ffffffff813a3eb6>] process_one_work+0x796/0x1440
[ 1544.187872]        [<ffffffff813a4c3b>] worker_thread+0xdb/0xfc0
[ 1544.187872]        [<ffffffff813b7edf>] kthread+0x23f/0x2d0
[ 1544.187872]        [<ffffffff866608ef>] ret_from_fork+0x3f/0x70


It seems to me that tty_get_pgrp is a red herring. The ctrl lock is not
mentioned in the reports, and isig indeed calls __isig/tty_get_pgrp just
before tty_driver_flush_buffer, so it looks like a stack unwinding bug.


* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-04 12:39               ` Dmitry Vyukov
@ 2016-02-04 13:17                 ` Dmitry Vyukov
  2016-02-04 18:46                   ` Peter Hurley
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-02-04 13:17 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Thu, Feb 4, 2016 at 1:39 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Feb 3, 2016 at 8:09 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 02/03/2016 09:32 AM, Dmitry Vyukov wrote:
>>> On Wed, Feb 3, 2016 at 5:24 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>> Hi Dmitry,
>>>>
>>>> On 01/21/2016 09:43 AM, Peter Hurley wrote:
>>>>> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>>>>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>>>>
>>>>>>>> So in any recent code that I look at this function tries to acquire
>>>>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>>>>
>>>>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>>>>
>>>>>>> It's not.
>>>>>>
>>>>>> I've got a new report on commit
>>>>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>>>>> Here is unprocessed version:
>>>>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>>>>> and here is processed one:
>>>>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>>>>
>>>>>> Peter, what exactly is wrong with the post-processed version?
>>>>>
>>>>> Yeah, ok, I assumed the problem with this report was post-processing
>>>>> because of the other report that had mixed-up info.
>>>>>
>>>>> However, the #3 stacktrace is obviously wrong, as others have already noted.
>>>>> Plus, the #1 stacktrace is wrong as well.
>>>>>
>>>>>> I would be interested in fixing the processing script.
>>>>>
>>>>> Not that it's related (since the original, not-edited report has bogus
>>>>> stacktraces), but how are you doing debug symbol lookup?
>>>>>
>>>>> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
>>>>>
>>>>>        [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>>>>
>>>>>
>>>>>> As far as I see it contains the same stacks just with line numbers and
>>>>>> inlined frames.
>>>>>
>>>>> Agree, now that I see the original report.
>>>>>
>>>>>> I am using a significantly different compilation mode
>>>>>> (kasan + kcov + very recent gcc), so nobody except me will be able to
>>>>>> figure out line numbers based on offsets.
>>>>>
>>>>> Weird. Maybe something to do with the compiler.
>>>>>
>>>>> Can you get me the dmesg output running the patch below?
>>>>
>>>> Wondering if this is still the priority it was not so long ago?
>>>> If not, that's fine and I'll drop this from my followup list.
>>>
>>>
>>> Yes, it is still the priority for me.
>>> I've tried to apply your debugging patch, but I noticed that it prints
>>> dependency stacks as it discovers them.
>>
>> Yeah, that's the point; I need to understand why lockdep doesn't
>> store the correct stack trace at dependency discovery.
>>
>> Since the correct stack trace will be printed instead, it will help
>> debug the lockdep problem.
>>
>> Hopefully, once the problem with the bad stacktraces is fixed, the
>> actual circular lock dependencies will be clear.
>>
>>> But in my setup I don't have
>>> all output from machine start (there is just too much of it).
>>
>> Kernel parameter:
>>
>>         log_buf_len=1G
>>
>>
>>> And I don't have a localized reproducer for this.
>>
>> I really just need the lockdep dependency stacks generated during boot,
>> and then a ctrl+C in a terminal window to trigger one of the dependency
>> stacks.
>>
>>> I will try again.
>>
>> Ok.
>>
>>> Do you want me to debug with your "tty: Fix lock inversion in
>>> N_TRACEROUTER" patch applied or not (I still see slightly different
>>> deadlock reports with it)?
>>
>> Not.
>>
>> I think that probably does fix at least one circular dependency, but
>> I want to figure out the bad stack trace problem first.
>>
>> There's probably another circular dependency there, as indicated by
>> your other report.
>
>
> Here is debug output:
> https://gist.githubusercontent.com/dvyukov/b18181c849fdd3d51c80/raw/e91ead683fec020f64eed6750aa9f6347d43b9f9/gistfile1.txt
>
> In particular the ctrl+C dependency is:
>
>  new dependency:  (&o_tty->termios_rwsem/1){++++..} =>  (&buf->lock){+.+...}
> [  216.817400] Call Trace:
> [  216.817400]  [<ffffffff82be450d>] dump_stack+0x6f/0xa2
> [  216.817400]  [<ffffffff8145b149>] __lock_acquire+0x4859/0x5710
> [  216.817400]  [<ffffffff8145e61c>] lock_acquire+0x1dc/0x430
> [  216.817400]  [<ffffffff86656871>] mutex_lock_nested+0xb1/0xa50
> [  216.817400]  [<ffffffff82f9f08f>] tty_buffer_flush+0xbf/0x3c0
> [  216.817400]  [<ffffffff82fa330c>] pty_flush_buffer+0x5c/0x180
> [  216.817400]  [<ffffffff82f97a05>] tty_driver_flush_buffer+0x65/0x80
> [  216.817400]  [<ffffffff82f8d162>] isig+0x172/0x2c0
> [  216.817400]  [<ffffffff82f8fe52>] n_tty_receive_signal_char+0x22/0xf0
> [  216.817400]  [<ffffffff82f93a4e>] n_tty_receive_char_special+0x126e/0x2b30
> [  216.817400]  [<ffffffff82f96cb3>] n_tty_receive_buf_common+0x19a3/0x2400
> [  216.817400]  [<ffffffff82f97743>] n_tty_receive_buf2+0x33/0x40
> [  216.817400]  [<ffffffff82f9e83f>] flush_to_ldisc+0x3bf/0x7f0
> [  216.817400]  [<ffffffff813a3eb6>] process_one_work+0x796/0x1440
> [  216.817400]  [<ffffffff813a4c3b>] worker_thread+0xdb/0xfc0
> [  216.817400]  [<ffffffff813b7edf>] kthread+0x23f/0x2d0
> [  216.817400]  [<ffffffff866608ef>] ret_from_fork+0x3f/0x70
>
> While in report it still looks as:
>
> -> #3 (&buf->lock){+.+...}:
> [ 1544.187872]        [<ffffffff8145e61c>] lock_acquire+0x1dc/0x430
> [ 1544.187872]        [<ffffffff8665fecf>] _raw_spin_lock_irqsave+0x9f/0xd0
> [ 1544.187872]        [<ffffffff82f7c810>] tty_get_pgrp+0x20/0x80
> [ 1544.187872]        [<ffffffff82f8afca>] __isig+0x1a/0x50
> [ 1544.187872]        [<ffffffff82f8d09e>] isig+0xae/0x2c0
> [ 1544.187872]        [<ffffffff82f8fe52>] n_tty_receive_signal_char+0x22/0xf0
> [ 1544.187872]        [<ffffffff82f93a6d>] n_tty_receive_char_special+0x128d/0x2b30
> [ 1544.187872]        [<ffffffff82f96cb3>] n_tty_receive_buf_common+0x19a3/0x2400
> [ 1544.187872]        [<ffffffff82f97743>] n_tty_receive_buf2+0x33/0x40
> [ 1544.187872]        [<ffffffff82f9e83f>] flush_to_ldisc+0x3bf/0x7f0
> [ 1544.187872]        [<ffffffff813a3eb6>] process_one_work+0x796/0x1440
> [ 1544.187872]        [<ffffffff813a4c3b>] worker_thread+0xdb/0xfc0
> [ 1544.187872]        [<ffffffff813b7edf>] kthread+0x23f/0x2d0
> [ 1544.187872]        [<ffffffff866608ef>] ret_from_fork+0x3f/0x70
>
>
> It seems to me that tty_get_pgrp is a red herring. The ctrl lock is not
> mentioned in the reports, and isig indeed calls __isig/tty_get_pgrp just
> before tty_driver_flush_buffer, so it looks like a stack unwinding bug.


Found a bug in lockdep. Yes, the first stack is correct, and the saved
stack is just a random, unrelated stack. Will mail a fix.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-04 13:17                 ` Dmitry Vyukov
@ 2016-02-04 18:46                   ` Peter Hurley
  2016-02-04 18:48                     ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Hurley @ 2016-02-04 18:46 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On 02/04/2016 05:17 AM, Dmitry Vyukov wrote:
> On Thu, Feb 4, 2016 at 1:39 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Wed, Feb 3, 2016 at 8:09 PM, Peter Hurley <peter@hurleysoftware.com> wrote:

>>> Hopefully, once the problem with the bad stacktraces is fixed, the
>>> actual circular lock dependencies will be clear.

> Found a bug in lockdep. Yes, the first stack is correct, and the saved
> stack is just a random, unrelated stack. Will mail a fix.

Ok, so with the lockdep bug fixed, those reports should now have the
correct stack traces.

I'd appreciate if you could re-generate the original report, so I can examine
if my speculative fix was appropriate.

And then if you would re-generate the more recent report, I'll relook at
that.

Regards,
Peter Hurley

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-04 18:46                   ` Peter Hurley
@ 2016-02-04 18:48                     ` Dmitry Vyukov
  2016-02-05 21:22                       ` Dmitry Vyukov
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Vyukov @ 2016-02-04 18:48 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Thu, Feb 4, 2016 at 7:46 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 02/04/2016 05:17 AM, Dmitry Vyukov wrote:
>> On Thu, Feb 4, 2016 at 1:39 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Wed, Feb 3, 2016 at 8:09 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>
>>>> Hopefully, once the problem with the bad stacktraces is fixed, the
>>>> actual circular lock dependencies will be clear.
>
>> Found a bug in lockdep. Yes, the first stack is correct, and the saved
>> stack is just a random, unrelated stack. Will mail a fix.
>
> Ok, so with the lockdep bug fixed, those reports should now have the
> correct stack traces.
>
> I'd appreciate if you could re-generate the original report, so I can examine
> if my speculative fix was appropriate.
>
> And then if you would re-generate the more recent report, I'll relook at
> that.

Working hard on it now.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc
  2016-02-04 18:48                     ` Dmitry Vyukov
@ 2016-02-05 21:22                       ` Dmitry Vyukov
  0 siblings, 0 replies; 31+ messages in thread
From: Dmitry Vyukov @ 2016-02-05 21:22 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Peter Zijlstra, Jiri Slaby, One Thousand Gnomes,
	Greg Kroah-Hartman, LKML, J Freyensee, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

On Thu, Feb 4, 2016 at 7:48 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>>> Hopefully, once the problem with the bad stacktraces is fixed, the
>>>>> actual circular lock dependencies will be clear.
>>
>>> Found a bug in lockdep. Yes, the first stack is correct, and the saved
>>> stack is just a random, unrelated stack. Will mail a fix.
>>
>> Ok, so with the lockdep bug fixed, those reports should now have the
>> correct stack traces.
>>
>> I'd appreciate if you could re-generate the original report, so I can examine
>> if my speculative fix was appropriate.
>>
>> And then if you would re-generate the more recent report, I'll relook at
>> that.
>
> Working hard on it now.

No luck so far. I will leave it running, but I will be OOO all of next week.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-02-05 21:22 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-30 10:44 tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc Dmitry Vyukov
2016-01-15  7:51 ` Dmitry Vyukov
2016-01-15 16:33   ` One Thousand Gnomes
2016-01-15 17:22     ` Dmitry Vyukov
2016-01-20  9:36       ` Dmitry Vyukov
2016-01-20 11:44         ` Peter Zijlstra
2016-01-20 11:54           ` Dmitry Vyukov
2016-01-20 12:07             ` Peter Zijlstra
2016-01-20 14:58               ` One Thousand Gnomes
2016-01-20 15:16                 ` Dmitry Vyukov
2016-01-20 16:32                   ` Peter Zijlstra
2016-01-20  2:09     ` J Freyensee
2016-01-20 12:47 ` Jiri Slaby
2016-01-20 13:02 ` Peter Zijlstra
2016-01-20 13:07   ` Dmitry Vyukov
2016-01-20 16:08   ` Peter Hurley
2016-01-20 20:47     ` Peter Hurley
2016-01-21 10:06     ` Dmitry Vyukov
2016-01-21 10:20       ` Peter Zijlstra
2016-01-21 17:51         ` Peter Hurley
2016-01-22 14:10           ` Dmitry Vyukov
2016-01-25 16:56             ` Peter Hurley
2016-01-21 17:43       ` Peter Hurley
2016-02-03  4:24         ` Peter Hurley
2016-02-03 17:32           ` Dmitry Vyukov
2016-02-03 19:09             ` Peter Hurley
2016-02-04 12:39               ` Dmitry Vyukov
2016-02-04 13:17                 ` Dmitry Vyukov
2016-02-04 18:46                   ` Peter Hurley
2016-02-04 18:48                     ` Dmitry Vyukov
2016-02-05 21:22                       ` Dmitry Vyukov
