linux-kernel.vger.kernel.org archive mirror
* possible deadlock in sk_diag_fill
@ 2018-05-05 17:59 syzbot
  2018-05-11 18:33 ` Andrei Vagin
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2018-05-05 17:59 UTC (permalink / raw)
  To: avagin, davem, linux-kernel, netdev, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
userspace arch: i386

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com


======================================================
WARNING: possible circular locking dependency detected
4.17.0-rc3+ #59 Not tainted
------------------------------------------------------
syz-executor1/25282 is trying to acquire lock:
000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons net/unix/diag.c:82 [inline]
000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144

but task is already holding lock:
00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons net/unix/diag.c:64 [inline]
00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0 net/unix/diag.c:144

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (rlock-AF_UNIX){+.+.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
        skb_queue_tail+0x26/0x150 net/core/skbuff.c:2900
        unix_dgram_sendmsg+0xf77/0x1730 net/unix/af_unix.c:1797
        sock_sendmsg_nosec net/socket.c:629 [inline]
        sock_sendmsg+0xd5/0x120 net/socket.c:639
        ___sys_sendmsg+0x525/0x940 net/socket.c:2117
        __sys_sendmmsg+0x3bb/0x6f0 net/socket.c:2205
        __compat_sys_sendmmsg net/compat.c:770 [inline]
        __do_compat_sys_sendmmsg net/compat.c:777 [inline]
        __se_compat_sys_sendmmsg net/compat.c:774 [inline]
        __ia32_compat_sys_sendmmsg+0x9f/0x100 net/compat.c:774
        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

-> #0 (&(&u->lock)->rlock/1){+.+.}:
        lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
        _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
        sk_diag_dump_icons net/unix/diag.c:82 [inline]
        sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
        sk_diag_dump net/unix/diag.c:178 [inline]
        unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
        netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
        __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
        netlink_dump_start include/linux/netlink.h:214 [inline]
        unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
        __sock_diag_cmd net/core/sock_diag.c:230 [inline]
        sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
        netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
        sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
        netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
        netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
        netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
        sock_sendmsg_nosec net/socket.c:629 [inline]
        sock_sendmsg+0xd5/0x120 net/socket.c:639
        sock_write_iter+0x35a/0x5a0 net/socket.c:908
        call_write_iter include/linux/fs.h:1784 [inline]
        new_sync_write fs/read_write.c:474 [inline]
        __vfs_write+0x64d/0x960 fs/read_write.c:487
        vfs_write+0x1f8/0x560 fs/read_write.c:549
        ksys_write+0xf9/0x250 fs/read_write.c:598
        __do_sys_write fs/read_write.c:610 [inline]
        __se_sys_write fs/read_write.c:607 [inline]
        __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(rlock-AF_UNIX);
                                lock(&(&u->lock)->rlock/1);
                                lock(rlock-AF_UNIX);
   lock(&(&u->lock)->rlock/1);

  *** DEADLOCK ***

5 locks held by syz-executor1/25282:
  #0: 000000003919e1bd (sock_diag_mutex){+.+.}, at: sock_diag_rcv+0x1b/0x40 net/core/sock_diag.c:271
  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at: __sock_diag_cmd net/core/sock_diag.c:225 [inline]
  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at: sock_diag_rcv_msg+0x169/0x3d0 net/core/sock_diag.c:261
  #2: 000000004cc04dbb (nlk_cb_mutex-SOCK_DIAG){+.+.}, at: netlink_dump+0x98/0xd20 net/netlink/af_netlink.c:2182
  #3: 00000000accdef41 (unix_table_lock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
  #3: 00000000accdef41 (unix_table_lock){+.+.}, at: unix_diag_dump+0x10a/0x550 net/unix/diag.c:192
  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline]
  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons net/unix/diag.c:64 [inline]
  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0 net/unix/diag.c:144

stack backtrace:
CPU: 1 PID: 25282 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #59
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  print_circular_bug.isra.36.cold.54+0x1bd/0x27d kernel/locking/lockdep.c:1223
  check_prev_add kernel/locking/lockdep.c:1863 [inline]
  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
  validate_chain kernel/locking/lockdep.c:2417 [inline]
  __lock_acquire+0x343e/0x5140 kernel/locking/lockdep.c:3431
  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
  _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
  sk_diag_dump_icons net/unix/diag.c:82 [inline]
  sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
  sk_diag_dump net/unix/diag.c:178 [inline]
  unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
  netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
  __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
  netlink_dump_start include/linux/netlink.h:214 [inline]
  unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
  __sock_diag_cmd net/core/sock_diag.c:230 [inline]
  sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
  sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
  sock_sendmsg_nosec net/socket.c:629 [inline]
  sock_sendmsg+0xd5/0x120 net/socket.c:639
  sock_write_iter+0x35a/0x5a0 net/socket.c:908
  call_write_iter include/linux/fs.h:1784 [inline]
  new_sync_write fs/read_write.c:474 [inline]
  __vfs_write+0x64d/0x960 fs/read_write.c:487
  vfs_write+0x1f8/0x560 fs/read_write.c:549
  ksys_write+0xf9/0x250 fs/read_write.c:598
  __do_sys_write fs/read_write.c:610 [inline]
  __se_sys_write fs/read_write.c:607 [inline]
  __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
  do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
  do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f8ccb9
RSP: 002b:00000000f5f880ac EFLAGS: 00000282 ORIG_RAX: 0000000000000004
RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 000000002058bfe4
RDX: 0000000000000029 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.
Note: all commands must start from beginning of the line in the email body.


* Re: possible deadlock in sk_diag_fill
  2018-05-05 17:59 possible deadlock in sk_diag_fill syzbot
@ 2018-05-11 18:33 ` Andrei Vagin
  2018-05-12  7:46   ` Dmitry Vyukov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrei Vagin @ 2018-05-11 18:33 UTC (permalink / raw)
  To: syzbot; +Cc: avagin, davem, linux-kernel, netdev, syzkaller-bugs

On Sat, May 05, 2018 at 10:59:02AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
> dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> userspace arch: i386
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com
> 
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.17.0-rc3+ #59 Not tainted
> ------------------------------------------------------
> syz-executor1/25282 is trying to acquire lock:
> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons
> net/unix/diag.c:82 [inline]
> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at:
> sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
> 
> but task is already holding lock:
> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
> include/linux/spinlock.h:310 [inline]
> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
> net/unix/diag.c:64 [inline]
> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0
> net/unix/diag.c:144
> 
> which lock already depends on the new lock.

In the code, we have a comment which explains why it is safe to take this lock:

/*
 * The state lock is outer for the same sk's
 * queue lock. With the other's queue locked it's
 * OK to lock the state.
 */
unix_state_lock_nested(req);

The question is how to explain this to lockdep.

> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (rlock-AF_UNIX){+.+.}:
>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
>        skb_queue_tail+0x26/0x150 net/core/skbuff.c:2900
>        unix_dgram_sendmsg+0xf77/0x1730 net/unix/af_unix.c:1797
>        sock_sendmsg_nosec net/socket.c:629 [inline]
>        sock_sendmsg+0xd5/0x120 net/socket.c:639
>        ___sys_sendmsg+0x525/0x940 net/socket.c:2117
>        __sys_sendmmsg+0x3bb/0x6f0 net/socket.c:2205
>        __compat_sys_sendmmsg net/compat.c:770 [inline]
>        __do_compat_sys_sendmmsg net/compat.c:777 [inline]
>        __se_compat_sys_sendmmsg net/compat.c:774 [inline]
>        __ia32_compat_sys_sendmmsg+0x9f/0x100 net/compat.c:774
>        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
>        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
>        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> 
> -> #0 (&(&u->lock)->rlock/1){+.+.}:
>        lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
>        _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
>        sk_diag_dump_icons net/unix/diag.c:82 [inline]
>        sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>        sk_diag_dump net/unix/diag.c:178 [inline]
>        unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
>        netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
>        __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
>        netlink_dump_start include/linux/netlink.h:214 [inline]
>        unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
>        __sock_diag_cmd net/core/sock_diag.c:230 [inline]
>        sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
>        netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>        sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
>        netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>        netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>        netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>        sock_sendmsg_nosec net/socket.c:629 [inline]
>        sock_sendmsg+0xd5/0x120 net/socket.c:639
>        sock_write_iter+0x35a/0x5a0 net/socket.c:908
>        call_write_iter include/linux/fs.h:1784 [inline]
>        new_sync_write fs/read_write.c:474 [inline]
>        __vfs_write+0x64d/0x960 fs/read_write.c:487
>        vfs_write+0x1f8/0x560 fs/read_write.c:549
>        ksys_write+0xf9/0x250 fs/read_write.c:598
>        __do_sys_write fs/read_write.c:610 [inline]
>        __se_sys_write fs/read_write.c:607 [inline]
>        __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
>        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
>        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
>        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(rlock-AF_UNIX);
>                                lock(&(&u->lock)->rlock/1);
>                                lock(rlock-AF_UNIX);
>   lock(&(&u->lock)->rlock/1);
> 
>  *** DEADLOCK ***
> 
> 5 locks held by syz-executor1/25282:
>  #0: 000000003919e1bd (sock_diag_mutex){+.+.}, at: sock_diag_rcv+0x1b/0x40
> net/core/sock_diag.c:271
>  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at: __sock_diag_cmd
> net/core/sock_diag.c:225 [inline]
>  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at:
> sock_diag_rcv_msg+0x169/0x3d0 net/core/sock_diag.c:261
>  #2: 000000004cc04dbb (nlk_cb_mutex-SOCK_DIAG){+.+.}, at:
> netlink_dump+0x98/0xd20 net/netlink/af_netlink.c:2182
>  #3: 00000000accdef41 (unix_table_lock){+.+.}, at: spin_lock
> include/linux/spinlock.h:310 [inline]
>  #3: 00000000accdef41 (unix_table_lock){+.+.}, at:
> unix_diag_dump+0x10a/0x550 net/unix/diag.c:192
>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
> include/linux/spinlock.h:310 [inline]
>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
> net/unix/diag.c:64 [inline]
>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at:
> sk_diag_fill.isra.5+0x94e/0x10d0 net/unix/diag.c:144
> 
> stack backtrace:
> CPU: 1 PID: 25282 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #59
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  print_circular_bug.isra.36.cold.54+0x1bd/0x27d
> kernel/locking/lockdep.c:1223
>  check_prev_add kernel/locking/lockdep.c:1863 [inline]
>  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
>  validate_chain kernel/locking/lockdep.c:2417 [inline]
>  __lock_acquire+0x343e/0x5140 kernel/locking/lockdep.c:3431
>  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
>  _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
>  sk_diag_dump_icons net/unix/diag.c:82 [inline]
>  sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>  sk_diag_dump net/unix/diag.c:178 [inline]
>  unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
>  netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
>  __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
>  netlink_dump_start include/linux/netlink.h:214 [inline]
>  unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
>  __sock_diag_cmd net/core/sock_diag.c:230 [inline]
>  sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>  sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>  sock_sendmsg_nosec net/socket.c:629 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:639
>  sock_write_iter+0x35a/0x5a0 net/socket.c:908
>  call_write_iter include/linux/fs.h:1784 [inline]
>  new_sync_write fs/read_write.c:474 [inline]
>  __vfs_write+0x64d/0x960 fs/read_write.c:487
>  vfs_write+0x1f8/0x560 fs/read_write.c:549
>  ksys_write+0xf9/0x250 fs/read_write.c:598
>  __do_sys_write fs/read_write.c:610 [inline]
>  __se_sys_write fs/read_write.c:607 [inline]
>  __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
>  do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
>  do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
>  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> RIP: 0023:0xf7f8ccb9
> RSP: 002b:00000000f5f880ac EFLAGS: 00000282 ORIG_RAX: 0000000000000004
> RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 000000002058bfe4
> RDX: 0000000000000029 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is
> merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line in the email body.


* Re: possible deadlock in sk_diag_fill
  2018-05-11 18:33 ` Andrei Vagin
@ 2018-05-12  7:46   ` Dmitry Vyukov
  2018-05-14 18:00     ` Andrei Vagin
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2018-05-12  7:46 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: syzbot, avagin, David Miller, LKML, netdev, syzkaller-bugs

On Fri, May 11, 2018 at 8:33 PM, Andrei Vagin <avagin@virtuozzo.com> wrote:
> On Sat, May 05, 2018 at 10:59:02AM -0700, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
>> dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> userspace arch: i386
>>
>> Unfortunately, I don't have any reproducer for this crash yet.
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com
>>
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 4.17.0-rc3+ #59 Not tainted
>> ------------------------------------------------------
>> syz-executor1/25282 is trying to acquire lock:
>> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons
>> net/unix/diag.c:82 [inline]
>> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at:
>> sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>>
>> but task is already holding lock:
>> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
>> include/linux/spinlock.h:310 [inline]
>> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
>> net/unix/diag.c:64 [inline]
>> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0
>> net/unix/diag.c:144
>>
>> which lock already depends on the new lock.
>
> In the code, we have a comment which explains why it is safe to take this lock
>
> /*
>  * The state lock is outer for the same sk's
>  * queue lock. With the other's queue locked it's
>  * OK to lock the state.
>  */
> unix_state_lock_nested(req);
>
> It is a question how to explain this to lockdep.

Do I understand it correctly that (&u->lock)->rlock associated with
AF_UNIX is locked under rlock-AF_UNIX, and then rlock-AF_UNIX is
locked under (&u->lock)->rlock associated with AF_NETLINK? If so, I
think we need to split (&u->lock)->rlock by family too, so that we
have u->lock-AF_UNIX and u->lock-AF_NETLINK.



>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (rlock-AF_UNIX){+.+.}:
>>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>>        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
>>        skb_queue_tail+0x26/0x150 net/core/skbuff.c:2900
>>        unix_dgram_sendmsg+0xf77/0x1730 net/unix/af_unix.c:1797
>>        sock_sendmsg_nosec net/socket.c:629 [inline]
>>        sock_sendmsg+0xd5/0x120 net/socket.c:639
>>        ___sys_sendmsg+0x525/0x940 net/socket.c:2117
>>        __sys_sendmmsg+0x3bb/0x6f0 net/socket.c:2205
>>        __compat_sys_sendmmsg net/compat.c:770 [inline]
>>        __do_compat_sys_sendmmsg net/compat.c:777 [inline]
>>        __se_compat_sys_sendmmsg net/compat.c:774 [inline]
>>        __ia32_compat_sys_sendmmsg+0x9f/0x100 net/compat.c:774
>>        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
>>        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
>>        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>>
>> -> #0 (&(&u->lock)->rlock/1){+.+.}:
>>        lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
>>        _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
>>        sk_diag_dump_icons net/unix/diag.c:82 [inline]
>>        sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>>        sk_diag_dump net/unix/diag.c:178 [inline]
>>        unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
>>        netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
>>        __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
>>        netlink_dump_start include/linux/netlink.h:214 [inline]
>>        unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
>>        __sock_diag_cmd net/core/sock_diag.c:230 [inline]
>>        sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
>>        netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>>        sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
>>        netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>>        netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>>        netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>>        sock_sendmsg_nosec net/socket.c:629 [inline]
>>        sock_sendmsg+0xd5/0x120 net/socket.c:639
>>        sock_write_iter+0x35a/0x5a0 net/socket.c:908
>>        call_write_iter include/linux/fs.h:1784 [inline]
>>        new_sync_write fs/read_write.c:474 [inline]
>>        __vfs_write+0x64d/0x960 fs/read_write.c:487
>>        vfs_write+0x1f8/0x560 fs/read_write.c:549
>>        ksys_write+0xf9/0x250 fs/read_write.c:598
>>        __do_sys_write fs/read_write.c:610 [inline]
>>        __se_sys_write fs/read_write.c:607 [inline]
>>        __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
>>        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
>>        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
>>        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(rlock-AF_UNIX);
>>                                lock(&(&u->lock)->rlock/1);
>>                                lock(rlock-AF_UNIX);
>>   lock(&(&u->lock)->rlock/1);
>>
>>  *** DEADLOCK ***
>>
>> 5 locks held by syz-executor1/25282:
>>  #0: 000000003919e1bd (sock_diag_mutex){+.+.}, at: sock_diag_rcv+0x1b/0x40
>> net/core/sock_diag.c:271
>>  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at: __sock_diag_cmd
>> net/core/sock_diag.c:225 [inline]
>>  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at:
>> sock_diag_rcv_msg+0x169/0x3d0 net/core/sock_diag.c:261
>>  #2: 000000004cc04dbb (nlk_cb_mutex-SOCK_DIAG){+.+.}, at:
>> netlink_dump+0x98/0xd20 net/netlink/af_netlink.c:2182
>>  #3: 00000000accdef41 (unix_table_lock){+.+.}, at: spin_lock
>> include/linux/spinlock.h:310 [inline]
>>  #3: 00000000accdef41 (unix_table_lock){+.+.}, at:
>> unix_diag_dump+0x10a/0x550 net/unix/diag.c:192
>>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
>> include/linux/spinlock.h:310 [inline]
>>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
>> net/unix/diag.c:64 [inline]
>>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at:
>> sk_diag_fill.isra.5+0x94e/0x10d0 net/unix/diag.c:144
>>
>> stack backtrace:
>> CPU: 1 PID: 25282 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #59
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>>  print_circular_bug.isra.36.cold.54+0x1bd/0x27d
>> kernel/locking/lockdep.c:1223
>>  check_prev_add kernel/locking/lockdep.c:1863 [inline]
>>  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
>>  validate_chain kernel/locking/lockdep.c:2417 [inline]
>>  __lock_acquire+0x343e/0x5140 kernel/locking/lockdep.c:3431
>>  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
>>  _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
>>  sk_diag_dump_icons net/unix/diag.c:82 [inline]
>>  sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>>  sk_diag_dump net/unix/diag.c:178 [inline]
>>  unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
>>  netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
>>  __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
>>  netlink_dump_start include/linux/netlink.h:214 [inline]
>>  unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
>>  __sock_diag_cmd net/core/sock_diag.c:230 [inline]
>>  sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
>>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
>>  sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
>>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
>>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
>>  sock_sendmsg_nosec net/socket.c:629 [inline]
>>  sock_sendmsg+0xd5/0x120 net/socket.c:639
>>  sock_write_iter+0x35a/0x5a0 net/socket.c:908
>>  call_write_iter include/linux/fs.h:1784 [inline]
>>  new_sync_write fs/read_write.c:474 [inline]
>>  __vfs_write+0x64d/0x960 fs/read_write.c:487
>>  vfs_write+0x1f8/0x560 fs/read_write.c:549
>>  ksys_write+0xf9/0x250 fs/read_write.c:598
>>  __do_sys_write fs/read_write.c:610 [inline]
>>  __se_sys_write fs/read_write.c:607 [inline]
>>  __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
>>  do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
>>  do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
>>  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>> RIP: 0023:0xf7f8ccb9
>> RSP: 002b:00000000f5f880ac EFLAGS: 00000282 ORIG_RAX: 0000000000000004
>> RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 000000002058bfe4
>> RDX: 0000000000000029 RSI: 0000000000000000 RDI: 0000000000000000
>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
>> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>>
>>
>> ---
>> This bug is generated by a bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for more information about syzbot.
>> syzbot engineers can be reached at syzkaller@googlegroups.com.
>>
>> syzbot will keep track of this bug report.
>> If you forgot to add the Reported-by tag, once the fix for this bug is
>> merged
>> into any tree, please reply to this email with:
>> #syz fix: exact-commit-title
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug
>> report.
>> Note: all commands must start from beginning of the line in the email body.
>


* Re: possible deadlock in sk_diag_fill
  2018-05-12  7:46   ` Dmitry Vyukov
@ 2018-05-14 18:00     ` Andrei Vagin
  2018-05-15  5:19       ` Dmitry Vyukov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrei Vagin @ 2018-05-14 18:00 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: syzbot, avagin, David Miller, LKML, netdev, syzkaller-bugs

On Sat, May 12, 2018 at 09:46:25AM +0200, Dmitry Vyukov wrote:
> On Fri, May 11, 2018 at 8:33 PM, Andrei Vagin <avagin@virtuozzo.com> wrote:
> > On Sat, May 05, 2018 at 10:59:02AM -0700, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >> userspace arch: i386
> >>
> >> Unfortunately, I don't have any reproducer for this crash yet.
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com
> >>
> >>
> >> ======================================================
> >> WARNING: possible circular locking dependency detected
> >> 4.17.0-rc3+ #59 Not tainted
> >> ------------------------------------------------------
> >> syz-executor1/25282 is trying to acquire lock:
> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons
> >> net/unix/diag.c:82 [inline]
> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at:
> >> sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
> >>
> >> but task is already holding lock:
> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
> >> include/linux/spinlock.h:310 [inline]
> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
> >> net/unix/diag.c:64 [inline]
> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0
> >> net/unix/diag.c:144
> >>
> >> which lock already depends on the new lock.
> >
> > In the code, we have a comment which explains why it is safe to take this lock
> >
> > /*
> >  * The state lock is outer for the same sk's
> >  * queue lock. With the other's queue locked it's
> >  * OK to lock the state.
> >  */
> > unix_state_lock_nested(req);
> >
> > It is a question how to explain this to lockdep.
> 
> Do I understand it correctly that (&u->lock)->rlock associated with
> AF_UNIX is locked under rlock-AF_UNIX, and then rlock-AF_UNIX is
> locked under (&u->lock)->rlock associated with AF_NETLINK? If so, I
> think we need to split (&u->lock)->rlock by family too, so that we
> have u->lock-AF_UNIX and u->lock-AF_NETLINK.

I think there is another problem here. lockdep is worried about
sk->sk_receive_queue vs unix_sk(s)->lock.

sk_diag_dump_icons() takes sk->sk_receive_queue and then
unix_sk(s)->lock.

unix_dgram_sendmsg takes unix_sk(sk)->lock and then sk->sk_receive_queue.

sk_diag_dump_icons() takes locks for two different sockets, but
unix_dgram_sendmsg() takes locks for one socket.

sk_diag_dump_icons
        if (sk->sk_state == TCP_LISTEN) {
                spin_lock(&sk->sk_receive_queue.lock);
                skb_queue_walk(&sk->sk_receive_queue, skb) {
			unix_state_lock_nested(req);
				spin_lock_nested(&unix_sk(s)->lock,


unix_dgram_sendmsg
	unix_state_lock(other)
		spin_lock(&unix_sk(s)->lock)
        skb_queue_tail(&other->sk_receive_queue, skb);
	        spin_lock_irqsave(&list->lock, flags);
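
The two orderings above can be sketched in userspace as a class-level lock-order graph, roughly the way lockdep reasons about them. This is only a simplified model, not kernel code: it assumes one class for all queue locks and one for all state locks (ignoring the /1 nesting subclass in the report), and the names lock_graph, graph_init and record_acquire are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One class per lock type, not per socket (simplifying assumption). */
enum lock_class { QUEUE_LOCK, STATE_LOCK, NUM_CLASSES };

/* order[a][b] is true once some path took class b while holding class a. */
struct lock_graph {
	bool order[NUM_CLASSES][NUM_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Is class `to` reachable from class `from` along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_CLASSES; next++)
		if (g->order[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/* Record "held -> acquired"; return true if that edge closes a cycle. */
static bool record_acquire(struct lock_graph *g, int held, int acquired)
{
	bool seen[NUM_CLASSES] = { false };
	bool cycle;

	assert(held >= 0 && held < NUM_CLASSES);
	assert(acquired >= 0 && acquired < NUM_CLASSES);

	/* A cycle exists if `held` was already reachable from `acquired`. */
	cycle = reachable(g, acquired, held, seen);
	g->order[held][acquired] = true;
	return cycle;
}
```

Recording STATE_LOCK -> QUEUE_LOCK (the unix_dgram_sendmsg order) first and QUEUE_LOCK -> STATE_LOCK (the sk_diag_dump_icons order) second makes the second call report a cycle, even though the two paths may never touch the same pair of sockets. The model cannot tell, because it only sees classes.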

> 
> 
> 
> >> the existing dependency chain (in reverse order) is:
> >>
> >> -> #1 (rlock-AF_UNIX){+.+.}:
> >>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> >>        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
> >>        skb_queue_tail+0x26/0x150 net/core/skbuff.c:2900
> >>        unix_dgram_sendmsg+0xf77/0x1730 net/unix/af_unix.c:1797
> >>        sock_sendmsg_nosec net/socket.c:629 [inline]
> >>        sock_sendmsg+0xd5/0x120 net/socket.c:639
> >>        ___sys_sendmsg+0x525/0x940 net/socket.c:2117
> >>        __sys_sendmmsg+0x3bb/0x6f0 net/socket.c:2205
> >>        __compat_sys_sendmmsg net/compat.c:770 [inline]
> >>        __do_compat_sys_sendmmsg net/compat.c:777 [inline]
> >>        __se_compat_sys_sendmmsg net/compat.c:774 [inline]
> >>        __ia32_compat_sys_sendmmsg+0x9f/0x100 net/compat.c:774
> >>        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
> >>        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
> >>        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> >>
> >> -> #0 (&(&u->lock)->rlock/1){+.+.}:
> >>        lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
> >>        _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
> >>        sk_diag_dump_icons net/unix/diag.c:82 [inline]
> >>        sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
> >>        sk_diag_dump net/unix/diag.c:178 [inline]
> >>        unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
> >>        netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
> >>        __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
> >>        netlink_dump_start include/linux/netlink.h:214 [inline]
> >>        unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
> >>        __sock_diag_cmd net/core/sock_diag.c:230 [inline]
> >>        sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
> >>        netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
> >>        sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
> >>        netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
> >>        netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
> >>        netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
> >>        sock_sendmsg_nosec net/socket.c:629 [inline]
> >>        sock_sendmsg+0xd5/0x120 net/socket.c:639
> >>        sock_write_iter+0x35a/0x5a0 net/socket.c:908
> >>        call_write_iter include/linux/fs.h:1784 [inline]
> >>        new_sync_write fs/read_write.c:474 [inline]
> >>        __vfs_write+0x64d/0x960 fs/read_write.c:487
> >>        vfs_write+0x1f8/0x560 fs/read_write.c:549
> >>        ksys_write+0xf9/0x250 fs/read_write.c:598
> >>        __do_sys_write fs/read_write.c:610 [inline]
> >>        __se_sys_write fs/read_write.c:607 [inline]
> >>        __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
> >>        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
> >>        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
> >>        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> >>
> >> other info that might help us debug this:
> >>
> >>  Possible unsafe locking scenario:
> >>
> >>        CPU0                    CPU1
> >>        ----                    ----
> >>   lock(rlock-AF_UNIX);
> >>                                lock(&(&u->lock)->rlock/1);
> >>                                lock(rlock-AF_UNIX);
> >>   lock(&(&u->lock)->rlock/1);
> >>
> >>  *** DEADLOCK ***
> >>
> >> 5 locks held by syz-executor1/25282:
> >>  #0: 000000003919e1bd (sock_diag_mutex){+.+.}, at: sock_diag_rcv+0x1b/0x40
> >> net/core/sock_diag.c:271
> >>  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at: __sock_diag_cmd
> >> net/core/sock_diag.c:225 [inline]
> >>  #1: 000000004f328d3e (sock_diag_table_mutex){+.+.}, at:
> >> sock_diag_rcv_msg+0x169/0x3d0 net/core/sock_diag.c:261
> >>  #2: 000000004cc04dbb (nlk_cb_mutex-SOCK_DIAG){+.+.}, at:
> >> netlink_dump+0x98/0xd20 net/netlink/af_netlink.c:2182
> >>  #3: 00000000accdef41 (unix_table_lock){+.+.}, at: spin_lock
> >> include/linux/spinlock.h:310 [inline]
> >>  #3: 00000000accdef41 (unix_table_lock){+.+.}, at:
> >> unix_diag_dump+0x10a/0x550 net/unix/diag.c:192
> >>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
> >> include/linux/spinlock.h:310 [inline]
> >>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
> >> net/unix/diag.c:64 [inline]
> >>  #4: 00000000b6895645 (rlock-AF_UNIX){+.+.}, at:
> >> sk_diag_fill.isra.5+0x94e/0x10d0 net/unix/diag.c:144
> >>
> >> stack backtrace:
> >> CPU: 1 PID: 25282 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #59
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> Google 01/01/2011
> >> Call Trace:
> >>  __dump_stack lib/dump_stack.c:77 [inline]
> >>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
> >>  print_circular_bug.isra.36.cold.54+0x1bd/0x27d
> >> kernel/locking/lockdep.c:1223
> >>  check_prev_add kernel/locking/lockdep.c:1863 [inline]
> >>  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
> >>  validate_chain kernel/locking/lockdep.c:2417 [inline]
> >>  __lock_acquire+0x343e/0x5140 kernel/locking/lockdep.c:3431
> >>  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
> >>  _raw_spin_lock_nested+0x28/0x40 kernel/locking/spinlock.c:354
> >>  sk_diag_dump_icons net/unix/diag.c:82 [inline]
> >>  sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
> >>  sk_diag_dump net/unix/diag.c:178 [inline]
> >>  unix_diag_dump+0x35f/0x550 net/unix/diag.c:206
> >>  netlink_dump+0x507/0xd20 net/netlink/af_netlink.c:2226
> >>  __netlink_dump_start+0x51a/0x780 net/netlink/af_netlink.c:2323
> >>  netlink_dump_start include/linux/netlink.h:214 [inline]
> >>  unix_diag_handler_dump+0x3f4/0x7b0 net/unix/diag.c:307
> >>  __sock_diag_cmd net/core/sock_diag.c:230 [inline]
> >>  sock_diag_rcv_msg+0x2e0/0x3d0 net/core/sock_diag.c:261
> >>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2448
> >>  sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:272
> >>  netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
> >>  netlink_unicast+0x58b/0x740 net/netlink/af_netlink.c:1336
> >>  netlink_sendmsg+0x9f0/0xfa0 net/netlink/af_netlink.c:1901
> >>  sock_sendmsg_nosec net/socket.c:629 [inline]
> >>  sock_sendmsg+0xd5/0x120 net/socket.c:639
> >>  sock_write_iter+0x35a/0x5a0 net/socket.c:908
> >>  call_write_iter include/linux/fs.h:1784 [inline]
> >>  new_sync_write fs/read_write.c:474 [inline]
> >>  __vfs_write+0x64d/0x960 fs/read_write.c:487
> >>  vfs_write+0x1f8/0x560 fs/read_write.c:549
> >>  ksys_write+0xf9/0x250 fs/read_write.c:598
> >>  __do_sys_write fs/read_write.c:610 [inline]
> >>  __se_sys_write fs/read_write.c:607 [inline]
> >>  __ia32_sys_write+0x71/0xb0 fs/read_write.c:607
> >>  do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
> >>  do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
> >>  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> >> RIP: 0023:0xf7f8ccb9
> >> RSP: 002b:00000000f5f880ac EFLAGS: 00000282 ORIG_RAX: 0000000000000004
> >> RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 000000002058bfe4
> >> RDX: 0000000000000029 RSI: 0000000000000000 RDI: 0000000000000000
> >> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> >> R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
> >> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> >>
> >>
> >> ---
> >> This bug is generated by a bot. It may contain errors.
> >> See https://goo.gl/tpsmEJ for more information about syzbot.
> >> syzbot engineers can be reached at syzkaller@googlegroups.com.
> >>
> >> syzbot will keep track of this bug report.
> >> If you forgot to add the Reported-by tag, once the fix for this bug is
> >> merged
> >> into any tree, please reply to this email with:
> >> #syz fix: exact-commit-title
> >> To mark this as a duplicate of another syzbot report, please reply with:
> >> #syz dup: exact-subject-of-another-report
> >> If it's a one-off invalid bug report, please reply with:
> >> #syz invalid
> >> Note: if the crash happens again, it will cause creation of a new bug
> >> report.
> >> Note: all commands must start from beginning of the line in the email body.
> >


* Re: possible deadlock in sk_diag_fill
  2018-05-14 18:00     ` Andrei Vagin
@ 2018-05-15  5:19       ` Dmitry Vyukov
  2018-05-15  6:18         ` Andrei Vagin
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2018-05-15  5:19 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: syzbot, avagin, David Miller, LKML, netdev, syzkaller-bugs

On Mon, May 14, 2018 at 8:00 PM, Andrei Vagin <avagin@virtuozzo.com> wrote:
>> >> Hello,
>> >>
>> >> syzbot found the following crash on:
>> >>
>> >> HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
>> >> git tree:       upstream
>> >> console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
>> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
>> >> dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
>> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> >> userspace arch: i386
>> >>
>> >> Unfortunately, I don't have any reproducer for this crash yet.
>> >>
>> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> >> Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com
>> >>
>> >>
>> >> ======================================================
>> >> WARNING: possible circular locking dependency detected
>> >> 4.17.0-rc3+ #59 Not tainted
>> >> ------------------------------------------------------
>> >> syz-executor1/25282 is trying to acquire lock:
>> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons
>> >> net/unix/diag.c:82 [inline]
>> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at:
>> >> sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>> >>
>> >> but task is already holding lock:
>> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
>> >> include/linux/spinlock.h:310 [inline]
>> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
>> >> net/unix/diag.c:64 [inline]
>> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0
>> >> net/unix/diag.c:144
>> >>
>> >> which lock already depends on the new lock.
>> >
>> > In the code, we have a comment which explains why it is safe to take this lock
>> >
>> > /*
>> >  * The state lock is outer for the same sk's
>> >  * queue lock. With the other's queue locked it's
>> >  * OK to lock the state.
>> >  */
>> > unix_state_lock_nested(req);
>> >
>> > It is a question how to explain this to lockdep.
>>
>> Do I understand it correctly that (&u->lock)->rlock associated with
>> AF_UNIX is locked under rlock-AF_UNIX, and then rlock-AF_UNIX is
>> locked under (&u->lock)->rlock associated with AF_NETLINK? If so, I
>> think we need to split (&u->lock)->rlock by family too, so that we
>> have u->lock-AF_UNIX and u->lock-AF_NETLINK.
>
> I think there is another problem here. lockdep is worried about
> sk->sk_receive_queue vs unix_sk(s)->lock.
>
> sk_diag_dump_icons() takes sk->sk_receive_queue and then
> unix_sk(s)->lock.
>
> unix_dgram_sendmsg takes unix_sk(sk)->lock and then sk->sk_receive_queue.
>
> sk_diag_dump_icons() takes locks for two different sockets, but
> unix_dgram_sendmsg() takes locks for one socket.
>
> sk_diag_dump_icons
>         if (sk->sk_state == TCP_LISTEN) {
>                 spin_lock(&sk->sk_receive_queue.lock);
>                 skb_queue_walk(&sk->sk_receive_queue, skb) {
>                         unix_state_lock_nested(req);
>                                 spin_lock_nested(&unix_sk(s)->lock,
>
>
> unix_dgram_sendmsg
>         unix_state_lock(other)
>                 spin_lock(&unix_sk(s)->lock)
>         skb_queue_tail(&other->sk_receive_queue, skb);
>                 spin_lock_irqsave(&list->lock, flags);


Do you mean the following?
There is socket 1 with state lock (S1) and queue lock (Q1), and socket
2 with state lock (S2) and queue lock (Q2). unix_dgram_sendmsg locks
S1->Q1, and sk_diag_dump_icons locks Q1->S2.
If yes, then this looks very much like a deadlock. Consider 2
unix_dgram_sendmsg calls in 2 different threads that lock S1 and S2
respectively. Now 2 sk_diag_dump_icons calls in 2 different threads lock
Q1 and Q2 respectively. Now sk_diag_dump_icons want to lock S's, and
unix_dgram_sendmsg want to lock Q's. Nobody can proceed.
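
The four-thread scenario above can be checked with a small wait-for graph. This is a userspace sketch under the assumptions of this message only: each thread holds exactly one lock and is blocked on one other, and the names thread_state, four_thread_scenario and in_wait_cycle are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { S1, Q1, S2, Q2, NUM_LOCKS };
#define NO_LOCK   (-1)
#define NO_THREAD (-1)

struct thread_state {
	int holds;	/* lock the thread currently holds */
	int wants;	/* lock the thread is blocked on */
};

/* Return the index of the thread holding `lock`, or NO_THREAD. */
static int holder_of(const struct thread_state *t, int n, int lock)
{
	if (lock == NO_LOCK)
		return NO_THREAD;
	for (int i = 0; i < n; i++)
		if (t[i].holds == lock)
			return i;
	return NO_THREAD;
}

/* Follow "waits for the holder of" links; true if we come back to start. */
static bool in_wait_cycle(const struct thread_state *t, int n, int start)
{
	int cur = start;

	assert(start >= 0 && start < n);
	for (int steps = 0; steps < n; steps++) {
		cur = holder_of(t, n, t[cur].wants);
		if (cur == NO_THREAD)
			return false;	/* wanted lock is free, chain ends */
		if (cur == start)
			return true;
	}
	return false;
}

/* The state described in the message: two senders, two diag dumpers. */
static void four_thread_scenario(struct thread_state t[4])
{
	t[0] = (struct thread_state){ S1, Q1 };	/* unix_dgram_sendmsg, socket 1 */
	t[1] = (struct thread_state){ S2, Q2 };	/* unix_dgram_sendmsg, socket 2 */
	t[2] = (struct thread_state){ Q1, S2 };	/* sk_diag_dump_icons, Q1->S2 */
	t[3] = (struct thread_state){ Q2, S1 };	/* sk_diag_dump_icons, Q2->S1 */
}
```

With all four threads present, every thread is part of one wait cycle. If the first sender is taken out (set to hold and want NO_LOCK), the chain from the remaining threads ends at a free lock, so the cycle depends on both senders being able to hold the state locks of the sockets whose queues are being dumped.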


* Re: possible deadlock in sk_diag_fill
  2018-05-15  5:19       ` Dmitry Vyukov
@ 2018-05-15  6:18         ` Andrei Vagin
  2018-05-15  7:26           ` Dmitry Vyukov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrei Vagin @ 2018-05-15  6:18 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: syzbot, avagin, David Miller, LKML, netdev, syzkaller-bugs

On Tue, May 15, 2018 at 07:19:39AM +0200, Dmitry Vyukov wrote:
> On Mon, May 14, 2018 at 8:00 PM, Andrei Vagin <avagin@virtuozzo.com> wrote:
> >> >> Hello,
> >> >>
> >> >> syzbot found the following crash on:
> >> >>
> >> >> HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
> >> >> git tree:       upstream
> >> >> console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
> >> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
> >> >> dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
> >> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >> >> userspace arch: i386
> >> >>
> >> >> Unfortunately, I don't have any reproducer for this crash yet.
> >> >>
> >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> >> Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com
> >> >>
> >> >>
> >> >> ======================================================
> >> >> WARNING: possible circular locking dependency detected
> >> >> 4.17.0-rc3+ #59 Not tainted
> >> >> ------------------------------------------------------
> >> >> syz-executor1/25282 is trying to acquire lock:
> >> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons
> >> >> net/unix/diag.c:82 [inline]
> >> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at:
> >> >> sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
> >> >>
> >> >> but task is already holding lock:
> >> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
> >> >> include/linux/spinlock.h:310 [inline]
> >> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
> >> >> net/unix/diag.c:64 [inline]
> >> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0
> >> >> net/unix/diag.c:144
> >> >>
> >> >> which lock already depends on the new lock.
> >> >
> >> > In the code, we have a comment which explains why it is safe to take this lock
> >> >
> >> > /*
> >> >  * The state lock is outer for the same sk's
> >> >  * queue lock. With the other's queue locked it's
> >> >  * OK to lock the state.
> >> >  */
> >> > unix_state_lock_nested(req);
> >> >
> >> > It is a question how to explain this to lockdep.
> >>
> >> Do I understand it correctly that (&u->lock)->rlock associated with
> >> AF_UNIX is locked under rlock-AF_UNIX, and then rlock-AF_UNIX is
> >> locked under (&u->lock)->rlock associated with AF_NETLINK? If so, I
> >> think we need to split (&u->lock)->rlock by family too, so that we
> >> have u->lock-AF_UNIX and u->lock-AF_NETLINK.
> >
> > I think there is another problem here. lockdep is worried about
> > sk->sk_receive_queue vs unix_sk(s)->lock.
> >
> > sk_diag_dump_icons() takes sk->sk_receive_queue and then
> > unix_sk(s)->lock.
> >
> > unix_dgram_sendmsg takes unix_sk(sk)->lock and then sk->sk_receive_queue.
> >
> > sk_diag_dump_icons() takes locks for two different sockets, but
> > unix_dgram_sendmsg() takes locks for one socket.
> >
> > sk_diag_dump_icons
> >         if (sk->sk_state == TCP_LISTEN) {
> >                 spin_lock(&sk->sk_receive_queue.lock);
> >                 skb_queue_walk(&sk->sk_receive_queue, skb) {
> >                         unix_state_lock_nested(req);
> >                                 spin_lock_nested(&unix_sk(s)->lock,
> >
> >
> > unix_dgram_sendmsg
> >         unix_state_lock(other)
> >                 spin_lock(&unix_sk(s)->lock)
> >         skb_queue_tail(&other->sk_receive_queue, skb);
> >                 spin_lock_irqsave(&list->lock, flags);
> 
> 
> Do you mean the following?
> There is socket 1 with state lock (S1) and queue lock (Q1), and socket
> 2 with state lock (S2) and queue lock (Q2). unix_dgram_sendmsg locks
> S1->Q1, and sk_diag_dump_icons locks Q1->S2.
> If yes, then this looks very much like a deadlock. Consider 2
> unix_dgram_sendmsg calls in 2 different threads that lock S1 and S2
> respectively. Now 2 sk_diag_dump_icons calls in 2 different threads lock
> Q1 and Q2 respectively. Now sk_diag_dump_icons want to lock S's, and
> unix_dgram_sendmsg want to lock Q's. Nobody can proceed.

Q1 and S1 belong to a listening socket, so they can't be taken from
unix_dgram_sendmsg().


* Re: possible deadlock in sk_diag_fill
  2018-05-15  6:18         ` Andrei Vagin
@ 2018-05-15  7:26           ` Dmitry Vyukov
  0 siblings, 0 replies; 7+ messages in thread
From: Dmitry Vyukov @ 2018-05-15  7:26 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: syzbot, avagin, David Miller, LKML, netdev, syzkaller-bugs

On Tue, May 15, 2018 at 8:18 AM, Andrei Vagin <avagin@virtuozzo.com> wrote:
>> >> >> Hello,
>> >> >>
>> >> >> syzbot found the following crash on:
>> >> >>
>> >> >> HEAD commit:    c1c07416cdd4 Merge tag 'kbuild-fixes-v4.17' of git://git.k..
>> >> >> git tree:       upstream
>> >> >> console output: https://syzkaller.appspot.com/x/log.txt?x=12164c97800000
>> >> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
>> >> >> dashboard link: https://syzkaller.appspot.com/bug?extid=c1872be62e587eae9669
>> >> >> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> >> >> userspace arch: i386
>> >> >>
>> >> >> Unfortunately, I don't have any reproducer for this crash yet.
>> >> >>
>> >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> >> >> Reported-by: syzbot+c1872be62e587eae9669@syzkaller.appspotmail.com
>> >> >>
>> >> >>
>> >> >> ======================================================
>> >> >> WARNING: possible circular locking dependency detected
>> >> >> 4.17.0-rc3+ #59 Not tainted
>> >> >> ------------------------------------------------------
>> >> >> syz-executor1/25282 is trying to acquire lock:
>> >> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at: sk_diag_dump_icons
>> >> >> net/unix/diag.c:82 [inline]
>> >> >> 000000004fddf743 (&(&u->lock)->rlock/1){+.+.}, at:
>> >> >> sk_diag_fill.isra.5+0xa43/0x10d0 net/unix/diag.c:144
>> >> >>
>> >> >> but task is already holding lock:
>> >> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: spin_lock
>> >> >> include/linux/spinlock.h:310 [inline]
>> >> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_dump_icons
>> >> >> net/unix/diag.c:64 [inline]
>> >> >> 00000000b6895645 (rlock-AF_UNIX){+.+.}, at: sk_diag_fill.isra.5+0x94e/0x10d0
>> >> >> net/unix/diag.c:144
>> >> >>
>> >> >> which lock already depends on the new lock.
>> >> >
>> >> > In the code, we have a comment which explains why it is safe to take this lock
>> >> >
>> >> > /*
>> >> >  * The state lock is outer for the same sk's
>> >> >  * queue lock. With the other's queue locked it's
>> >> >  * OK to lock the state.
>> >> >  */
>> >> > unix_state_lock_nested(req);
>> >> >
>> >> > It is a question how to explain this to lockdep.
>> >>
>> >> Do I understand it correctly that (&u->lock)->rlock associated with
>> >> AF_UNIX is locked under rlock-AF_UNIX, and then rlock-AF_UNIX is
>> >> locked under (&u->lock)->rlock associated with AF_NETLINK? If so, I
>> >> think we need to split (&u->lock)->rlock by family too, so that we
>> >> have u->lock-AF_UNIX and u->lock-AF_NETLINK.
>> >
>> > I think there is another problem here. lockdep is worried about
>> > sk->sk_receive_queue vs unix_sk(s)->lock.
>> >
>> > sk_diag_dump_icons() takes sk->sk_receive_queue and then
>> > unix_sk(s)->lock.
>> >
>> > unix_dgram_sendmsg takes unix_sk(sk)->lock and then sk->sk_receive_queue.
>> >
>> > sk_diag_dump_icons() takes locks for two different sockets, but
>> > unix_dgram_sendmsg() takes locks for one socket.
>> >
>> > sk_diag_dump_icons
>> >         if (sk->sk_state == TCP_LISTEN) {
>> >                 spin_lock(&sk->sk_receive_queue.lock);
>> >                 skb_queue_walk(&sk->sk_receive_queue, skb) {
>> >                         unix_state_lock_nested(req);
>> >                                 spin_lock_nested(&unix_sk(s)->lock,
>> >
>> >
>> > unix_dgram_sendmsg
>> >         unix_state_lock(other)
>> >                 spin_lock(&unix_sk(s)->lock)
>> >         skb_queue_tail(&other->sk_receive_queue, skb);
>> >                 spin_lock_irqsave(&list->lock, flags);
>>
>>
>> Do you mean the following?
>> There is socket 1 with state lock (S1) and queue lock (Q1), and socket
>> 2 with state lock (S2) and queue lock (Q2). unix_dgram_sendmsg locks
>> S1->Q1, and sk_diag_dump_icons locks Q1->S2.
>> If yes, then this looks very much like a deadlock. Consider 2
>> unix_dgram_sendmsg calls in 2 different threads that lock S1 and S2
>> respectively. Now 2 sk_diag_dump_icons calls in 2 different threads lock
>> Q1 and Q2 respectively. Now sk_diag_dump_icons want to lock S's, and
>> unix_dgram_sendmsg want to lock Q's. Nobody can proceed.
>
> Q1 and S1 belong to a listening socket, so they can't be taken from
> unix_dgram_sendmsg().

Should we then split Q1/S1 for listening and data sockets? I don't
know if lockdep allows changing lock class on the fly, though. Always
wondered if there was a single reason to mix listening and data
sockets into a single thing on API level...
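
To illustrate what such a split would buy, here is a userspace sketch that builds the class-level order graph twice: once with a single class per lock kind, and once with classes split by socket role. It is a model under stated assumptions, not a patch: it assumes the sockets queued on a listener are never listening themselves, it says nothing about whether lockdep can re-key a lock when a socket starts listening, and class_of, has_cycle and order_is_cyclic are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum role { LISTENING, DATA };
enum kind { QUEUE, STATE };
#define MAX_CLASSES 4

/* Map a lock to a class id; without splitting, the role is ignored. */
static int class_of(enum kind k, enum role r, bool split_by_role)
{
	assert(k == QUEUE || k == STATE);
	assert(r == LISTENING || r == DATA);
	return split_by_role ? (int)k * 2 + (int)r : (int)k;
}

/* Warshall transitive closure (modifies edge); cycle if a class reaches itself. */
static bool has_cycle(bool edge[MAX_CLASSES][MAX_CLASSES])
{
	for (int k = 0; k < MAX_CLASSES; k++)
		for (int i = 0; i < MAX_CLASSES; i++)
			for (int j = 0; j < MAX_CLASSES; j++)
				if (edge[i][k] && edge[k][j])
					edge[i][j] = true;
	for (int i = 0; i < MAX_CLASSES; i++)
		if (edge[i][i])
			return true;
	return false;
}

/* Build the two orderings from the thread and test them for a cycle. */
static bool order_is_cyclic(bool split_by_role)
{
	bool edge[MAX_CLASSES][MAX_CLASSES];

	memset(edge, 0, sizeof(edge));
	/* unix_dgram_sendmsg: receiver's state lock, then its queue lock. */
	edge[class_of(STATE, DATA, split_by_role)]
	    [class_of(QUEUE, DATA, split_by_role)] = true;
	/* sk_diag_dump_icons: listener's queue lock, then queued socket's state lock. */
	edge[class_of(QUEUE, LISTENING, split_by_role)]
	    [class_of(STATE, DATA, split_by_role)] = true;
	return has_cycle(edge);
}
```

In this model order_is_cyclic(false) finds the same cycle lockdep reported, while order_is_cyclic(true) does not, because the listener's queue lock no longer shares a class with the data socket's queue lock.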


end of thread, other threads:[~2018-05-15  7:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-05 17:59 possible deadlock in sk_diag_fill syzbot
2018-05-11 18:33 ` Andrei Vagin
2018-05-12  7:46   ` Dmitry Vyukov
2018-05-14 18:00     ` Andrei Vagin
2018-05-15  5:19       ` Dmitry Vyukov
2018-05-15  6:18         ` Andrei Vagin
2018-05-15  7:26           ` Dmitry Vyukov
