* fs, net: deadlock between bind/splice on af_unix
@ 2016-12-08 14:47 Dmitry Vyukov
  2016-12-08 16:30 ` Dmitry Vyukov
       [not found] ` <065031f0-27c5-443d-82f9-2f475fcef8c3@googlegroups.com>
  0 siblings, 2 replies; 22+ messages in thread
From: Dmitry Vyukov @ 2016-12-08 14:47 UTC (permalink / raw)
  To: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, Cong Wang, netdev, Eric Dumazet
  Cc: syzkaller

Hello,

I am getting the following deadlock reports while running syzkaller
fuzzer on 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7).


[ INFO: possible circular locking dependency detected ]
4.9.0-rc8+ #77 Not tainted
-------------------------------------------------------
syz-executor0/3155 is trying to acquire lock:
 (&u->bindlock){+.+.+.}, at: [<ffffffff871bca1a>]
unix_autobind.isra.26+0xca/0x8a0 net/unix/af_unix.c:852
but task is already holding lock:
 (&pipe->mutex/1){+.+.+.}, at: [<     inline     >] pipe_lock_nested
fs/pipe.c:66
 (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a8ea4b>]
pipe_lock+0x5b/0x70 fs/pipe.c:74
which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

       [  202.103497] [<     inline     >] validate_chain
kernel/locking/lockdep.c:2265
       [  202.103497] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
       [  202.103497] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
       [  202.103497] [<     inline     >] __mutex_lock_common
kernel/locking/mutex.c:521
       [  202.103497] [<ffffffff88195bcf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
       [  202.103497] [<     inline     >] pipe_lock_nested fs/pipe.c:66
       [  202.103497] [<ffffffff81a8ea4b>] pipe_lock+0x5b/0x70 fs/pipe.c:74
       [  202.103497] [<ffffffff81b451f7>]
iter_file_splice_write+0x267/0xfa0 fs/splice.c:717
       [  202.103497] [<     inline     >] do_splice_from fs/splice.c:869
       [  202.103497] [<     inline     >] do_splice fs/splice.c:1160
       [  202.103497] [<     inline     >] SYSC_splice fs/splice.c:1410
       [  202.103497] [<ffffffff81b473c7>] SyS_splice+0x7d7/0x16a0
fs/splice.c:1393
       [  202.103497] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

       [  202.103497] [<     inline     >] validate_chain
kernel/locking/lockdep.c:2265
       [  202.103497] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
       [  202.103497] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
       [  202.103497] [<     inline     >]
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:35
       [  202.103497] [<     inline     >] percpu_down_read
include/linux/percpu-rwsem.h:58
       [  202.103497] [<ffffffff81a7bb33>]
__sb_start_write+0x193/0x2a0 fs/super.c:1252
       [  202.103497] [<     inline     >] sb_start_write
include/linux/fs.h:1549
       [  202.103497] [<ffffffff81af9954>] mnt_want_write+0x44/0xb0
fs/namespace.c:389
       [  202.103497] [<ffffffff81ab09f6>] filename_create+0x156/0x620
fs/namei.c:3598
       [  202.103497] [<ffffffff81ab0ef8>] kern_path_create+0x38/0x50
fs/namei.c:3644
       [  202.103497] [<     inline     >] unix_mknod net/unix/af_unix.c:967
       [  202.103497] [<ffffffff871c0e11>] unix_bind+0x4d1/0xe60
net/unix/af_unix.c:1035
       [  202.103497] [<ffffffff86a76b7e>] SYSC_bind+0x20e/0x4c0
net/socket.c:1382
       [  202.103497] [<ffffffff86a7a509>] SyS_bind+0x29/0x30 net/socket.c:1368
       [  202.103497] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

       [  202.103497] [<     inline     >] check_prev_add
kernel/locking/lockdep.c:1828
       [  202.103497] [<ffffffff8156309b>]
check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
       [  202.103497] [<     inline     >] validate_chain
kernel/locking/lockdep.c:2265
       [  202.103497] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
       [  202.103497] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
       [  202.103497] [<     inline     >] __mutex_lock_common
kernel/locking/mutex.c:521
       [  202.103497] [<ffffffff88196b82>]
mutex_lock_interruptible_nested+0x2d2/0x11d0
kernel/locking/mutex.c:650
       [  202.103497] [<ffffffff871bca1a>]
unix_autobind.isra.26+0xca/0x8a0 net/unix/af_unix.c:852
       [  202.103497] [<ffffffff871c76dd>]
unix_dgram_sendmsg+0x105d/0x1730 net/unix/af_unix.c:1667
       [  202.103497] [<ffffffff871c7ea8>]
unix_seqpacket_sendmsg+0xf8/0x170 net/unix/af_unix.c:2071
       [  202.103497] [<     inline     >] sock_sendmsg_nosec net/socket.c:621
       [  202.103497] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
       [  202.103497] [<ffffffff86a7683c>] kernel_sendmsg+0x4c/0x60
net/socket.c:639
       [  202.103497] [<ffffffff86a8101d>]
sock_no_sendpage+0x20d/0x310 net/core/sock.c:2321
       [  202.103497] [<ffffffff86a74c95>] kernel_sendpage+0x95/0xf0
net/socket.c:3289
       [  202.103497] [<ffffffff86a74d92>] sock_sendpage+0xa2/0xd0
net/socket.c:775
       [  202.103497] [<ffffffff81b3ee1e>]
pipe_to_sendpage+0x2ae/0x390 fs/splice.c:469
       [  202.103497] [<     inline     >] splice_from_pipe_feed fs/splice.c:520
       [  202.103497] [<ffffffff81b42f3f>]
__splice_from_pipe+0x31f/0x750 fs/splice.c:644
       [  202.103497] [<ffffffff81b4665c>]
splice_from_pipe+0x1dc/0x300 fs/splice.c:679
       [  202.103497] [<ffffffff81b467c5>]
generic_splice_sendpage+0x45/0x60 fs/splice.c:850
       [  202.103497] [<     inline     >] do_splice_from fs/splice.c:869
       [  202.103497] [<     inline     >] do_splice fs/splice.c:1160
       [  202.103497] [<     inline     >] SYSC_splice fs/splice.c:1410
       [  202.103497] [<ffffffff81b473c7>] SyS_splice+0x7d7/0x16a0
fs/splice.c:1393
       [  202.103497] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

other info that might help us debug this:

Chain exists of:
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&pipe->mutex/1);
                               lock(sb_writers#5);
                               lock(&pipe->mutex/1);
  lock(&u->bindlock);

 *** DEADLOCK ***

1 lock held by syz-executor0/3155:
 #0:  (&pipe->mutex/1){+.+.+.}, at: [<     inline     >]
pipe_lock_nested fs/pipe.c:66
 #0:  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a8ea4b>]
pipe_lock+0x5b/0x70 fs/pipe.c:74

stack backtrace:
CPU: 3 PID: 3155 Comm: syz-executor0 Not tainted 4.9.0-rc8+ #77
Hardware name: Google Google/Google, BIOS Google 01/01/2011
 ffff88004b1fe288 ffffffff834c44f9 ffffffff00000003 1ffff1000963fbe4
 ffffed000963fbdc 0000000041b58ab3 ffffffff895816f0 ffffffff834c420b
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff834c44f9>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
 [<ffffffff81560cb0>] print_circular_bug+0x310/0x3c0
kernel/locking/lockdep.c:1202
 [<     inline     >] check_prev_add kernel/locking/lockdep.c:1828
 [<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
 [<     inline     >] validate_chain kernel/locking/lockdep.c:2265
 [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
 [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
 [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
 [<ffffffff88196b82>] mutex_lock_interruptible_nested+0x2d2/0x11d0
kernel/locking/mutex.c:650
 [<ffffffff871bca1a>] unix_autobind.isra.26+0xca/0x8a0 net/unix/af_unix.c:852
 [<ffffffff871c76dd>] unix_dgram_sendmsg+0x105d/0x1730 net/unix/af_unix.c:1667
 [<ffffffff871c7ea8>] unix_seqpacket_sendmsg+0xf8/0x170 net/unix/af_unix.c:2071
 [<     inline     >] sock_sendmsg_nosec net/socket.c:621
 [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
 [<ffffffff86a7683c>] kernel_sendmsg+0x4c/0x60 net/socket.c:639
 [<ffffffff86a8101d>] sock_no_sendpage+0x20d/0x310 net/core/sock.c:2321
 [<ffffffff86a74c95>] kernel_sendpage+0x95/0xf0 net/socket.c:3289
 [<ffffffff86a74d92>] sock_sendpage+0xa2/0xd0 net/socket.c:775
 [<ffffffff81b3ee1e>] pipe_to_sendpage+0x2ae/0x390 fs/splice.c:469
 [<     inline     >] splice_from_pipe_feed fs/splice.c:520
 [<ffffffff81b42f3f>] __splice_from_pipe+0x31f/0x750 fs/splice.c:644
 [<ffffffff81b4665c>] splice_from_pipe+0x1dc/0x300 fs/splice.c:679
 [<ffffffff81b467c5>] generic_splice_sendpage+0x45/0x60 fs/splice.c:850
 [<     inline     >] do_splice_from fs/splice.c:869
 [<     inline     >] do_splice fs/splice.c:1160
 [<     inline     >] SYSC_splice fs/splice.c:1410
 [<ffffffff81b473c7>] SyS_splice+0x7d7/0x16a0 fs/splice.c:1393
 [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-08 14:47 fs, net: deadlock between bind/splice on af_unix Dmitry Vyukov
@ 2016-12-08 16:30 ` Dmitry Vyukov
  2016-12-09  0:08   ` Cong Wang
       [not found] ` <065031f0-27c5-443d-82f9-2f475fcef8c3@googlegroups.com>
  1 sibling, 1 reply; 22+ messages in thread
From: Dmitry Vyukov @ 2016-12-08 16:30 UTC (permalink / raw)
  To: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, Cong Wang, netdev, Eric Dumazet
  Cc: syzkaller

On Thu, Dec 8, 2016 at 3:47 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> I am getting the following deadlock reports while running syzkaller
> fuzzer on 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7).
>
> [...]


Seems to be the same, but detected in the context of the second thread:

[ INFO: possible circular locking dependency detected ]
4.9.0-rc8+ #77 Not tainted
-------------------------------------------------------
syz-executor3/24365 is trying to acquire lock:
 (&pipe->mutex/1){+.+.+.}, at: [<     inline     >] pipe_lock_nested
fs/pipe.c:66
 (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a8ea4b>]
pipe_lock+0x5b/0x70 fs/pipe.c:74
but task is already holding lock:
 (sb_writers#5){.+.+.+}, at: [<     inline     >] file_start_write
include/linux/fs.h:2592
 (sb_writers#5){.+.+.+}, at: [<     inline     >] do_splice fs/splice.c:1159
 (sb_writers#5){.+.+.+}, at: [<     inline     >] SYSC_splice fs/splice.c:1410
 (sb_writers#5){.+.+.+}, at: [<ffffffff81b47d9f>]
SyS_splice+0x11af/0x16a0 fs/splice.c:1393
which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

       [  131.709013] [<     inline     >] validate_chain
kernel/locking/lockdep.c:2265
       [  131.709013] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
       [  131.709013] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
       [  131.709013] [<     inline     >]
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:35
       [  131.709013] [<     inline     >] percpu_down_read
include/linux/percpu-rwsem.h:58
       [  131.709013] [<ffffffff81a7bb33>]
__sb_start_write+0x193/0x2a0 fs/super.c:1252
       [  131.709013] [<     inline     >] sb_start_write
include/linux/fs.h:1549
       [  131.709013] [<ffffffff81af9954>] mnt_want_write+0x44/0xb0
fs/namespace.c:389
       [  131.709013] [<ffffffff81ab09f6>] filename_create+0x156/0x620
fs/namei.c:3598
       [  131.709013] [<ffffffff81ab0ef8>] kern_path_create+0x38/0x50
fs/namei.c:3644
       [  131.709013] [<     inline     >] unix_mknod net/unix/af_unix.c:967
       [  131.709013] [<ffffffff871c0e11>] unix_bind+0x4d1/0xe60
net/unix/af_unix.c:1035
       [  131.709013] [<ffffffff86a76b7e>] SYSC_bind+0x20e/0x4c0
net/socket.c:1382
       [  131.709013] [<ffffffff86a7a509>] SyS_bind+0x29/0x30 net/socket.c:1368
       [  131.709013] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

       [  131.709013] [<     inline     >] validate_chain
kernel/locking/lockdep.c:2265
       [  131.709013] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
       [  131.709013] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
       [  131.709013] [<     inline     >] __mutex_lock_common
kernel/locking/mutex.c:521
       [  131.709013] [<ffffffff88196b82>]
mutex_lock_interruptible_nested+0x2d2/0x11d0
kernel/locking/mutex.c:650
       [  131.709013] [<ffffffff871bca1a>]
unix_autobind.isra.26+0xca/0x8a0 net/unix/af_unix.c:852
       [  131.709013] [<ffffffff871c76dd>]
unix_dgram_sendmsg+0x105d/0x1730 net/unix/af_unix.c:1667
       [  131.709013] [<ffffffff871c7ea8>]
unix_seqpacket_sendmsg+0xf8/0x170 net/unix/af_unix.c:2071
       [  131.709013] [<     inline     >] sock_sendmsg_nosec net/socket.c:621
       [  131.709013] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
       [  131.709013] [<ffffffff86a7683c>] kernel_sendmsg+0x4c/0x60
net/socket.c:639
       [  131.709013] [<ffffffff86a8101d>]
sock_no_sendpage+0x20d/0x310 net/core/sock.c:2321
       [  131.709013] [<ffffffff86a74c95>] kernel_sendpage+0x95/0xf0
net/socket.c:3289
       [  131.709013] [<ffffffff86a74d92>] sock_sendpage+0xa2/0xd0
net/socket.c:775
       [  131.709013] [<ffffffff81b3ee1e>]
pipe_to_sendpage+0x2ae/0x390 fs/splice.c:469
       [  131.709013] [<     inline     >] splice_from_pipe_feed fs/splice.c:520
       [  131.709013] [<ffffffff81b42f3f>]
__splice_from_pipe+0x31f/0x750 fs/splice.c:644
       [  131.709013] [<ffffffff81b4665c>]
splice_from_pipe+0x1dc/0x300 fs/splice.c:679
       [  131.709013] [<ffffffff81b467c5>]
generic_splice_sendpage+0x45/0x60 fs/splice.c:850
       [  131.709013] [<     inline     >] do_splice_from fs/splice.c:869
       [  131.709013] [<     inline     >] do_splice fs/splice.c:1160
       [  131.709013] [<     inline     >] SYSC_splice fs/splice.c:1410
       [  131.709013] [<ffffffff81b473c7>] SyS_splice+0x7d7/0x16a0
fs/splice.c:1393
       [  131.709013] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

       [  131.709013] [<     inline     >] check_prev_add
kernel/locking/lockdep.c:1828
       [  131.709013] [<ffffffff8156309b>]
check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
       [  131.709013] [<     inline     >] validate_chain
kernel/locking/lockdep.c:2265
       [  131.709013] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
       [  131.709013] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
       [  131.709013] [<     inline     >] __mutex_lock_common
kernel/locking/mutex.c:521
       [  131.709013] [<ffffffff88195bcf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
       [  131.709013] [<     inline     >] pipe_lock_nested fs/pipe.c:66
       [  131.709013] [<ffffffff81a8ea4b>] pipe_lock+0x5b/0x70 fs/pipe.c:74
       [  131.709013] [<ffffffff81b451f7>]
iter_file_splice_write+0x267/0xfa0 fs/splice.c:717
       [  131.709013] [<     inline     >] do_splice_from fs/splice.c:869
       [  131.709013] [<     inline     >] do_splice fs/splice.c:1160
       [  131.709013] [<     inline     >] SYSC_splice fs/splice.c:1410
       [  131.709013] [<ffffffff81b473c7>] SyS_splice+0x7d7/0x16a0
fs/splice.c:1393
       [  131.709013] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

other info that might help us debug this:

Chain exists of:
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sb_writers#5);
                               lock(&u->bindlock);
                               lock(sb_writers#5);
  lock(&pipe->mutex/1);

 *** DEADLOCK ***

1 lock held by syz-executor3/24365:
 #0:  (sb_writers#5){.+.+.+}, at: [<     inline     >]
file_start_write include/linux/fs.h:2592
 #0:  (sb_writers#5){.+.+.+}, at: [<     inline     >] do_splice
fs/splice.c:1159
 #0:  (sb_writers#5){.+.+.+}, at: [<     inline     >] SYSC_splice
fs/splice.c:1410
 #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff81b47d9f>]
SyS_splice+0x11af/0x16a0 fs/splice.c:1393

stack backtrace:
CPU: 2 PID: 24365 Comm: syz-executor3 Not tainted 4.9.0-rc8+ #77
Hardware name: Google Google/Google, BIOS Google 01/01/2011
 ffff8800597b6af8 ffffffff834c44f9 ffffffff00000002 1ffff1000b2f6cf2
 ffffed000b2f6cea 0000000041b58ab3 ffffffff895816f0 ffffffff834c420b
 0000000041b58ab3 ffffffff894dbca8 ffffffff8155c780 ffff8800597b6878
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff834c44f9>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
 [<ffffffff81560cb0>] print_circular_bug+0x310/0x3c0
kernel/locking/lockdep.c:1202
 [<     inline     >] check_prev_add kernel/locking/lockdep.c:1828
 [<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
 [<     inline     >] validate_chain kernel/locking/lockdep.c:2265
 [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
 [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
 [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
 [<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
 [<     inline     >] pipe_lock_nested fs/pipe.c:66
 [<ffffffff81a8ea4b>] pipe_lock+0x5b/0x70 fs/pipe.c:74
 [<ffffffff81b451f7>] iter_file_splice_write+0x267/0xfa0 fs/splice.c:717
 [<     inline     >] do_splice_from fs/splice.c:869
 [<     inline     >] do_splice fs/splice.c:1160
 [<     inline     >] SYSC_splice fs/splice.c:1410
 [<ffffffff81b473c7>] SyS_splice+0x7d7/0x16a0 fs/splice.c:1393
 [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-08 16:30 ` Dmitry Vyukov
@ 2016-12-09  0:08   ` Cong Wang
  2016-12-09  1:32     ` Al Viro
  0 siblings, 1 reply; 22+ messages in thread
From: Cong Wang @ 2016-12-09  0:08 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

On Thu, Dec 8, 2016 at 8:30 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Chain exists of:
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(sb_writers#5);
>                                lock(&u->bindlock);
>                                lock(sb_writers#5);
>   lock(&pipe->mutex/1);

This looks like a false positive; it probably just needs lockdep_set_class()
to set keys for pipe->mutex and unix->bindlock.


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-09  0:08   ` Cong Wang
@ 2016-12-09  1:32     ` Al Viro
  2016-12-09  6:32       ` Cong Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Al Viro @ 2016-12-09  1:32 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Thu, Dec 08, 2016 at 04:08:27PM -0800, Cong Wang wrote:
> On Thu, Dec 8, 2016 at 8:30 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > Chain exists of:
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(sb_writers#5);
> >                                lock(&u->bindlock);
> >                                lock(sb_writers#5);
> >   lock(&pipe->mutex/1);
> 
> This looks like a false positive; it probably just needs lockdep_set_class()
> to set keys for pipe->mutex and unix->bindlock.

I'm afraid that it's not a false positive at all.

Preparations:
	* create an AF_UNIX socket.
	* set SOCK_PASSCRED on it.
	* create a pipe.

Child 1: splice from pipe to socket; locks pipe and proceeds down towards
unix_dgram_sendmsg().

Child 2: splice from pipe to /mnt/foo/bar; requests write access to /mnt
and blocks on attempt to lock the pipe already locked by (1).

Child 3: freeze /mnt; blocks until (2) is done

Child 4: bind() the socket to /mnt/barf; grabs ->bindlock on the socket and
proceeds to create /mnt/barf, which blocks due to the fairness of the freezer
(no extra write accesses to something that is in the process of being frozen).

_Now_ (1) gets around to unix_dgram_sendmsg().  We still have NULL u->addr,
since bind() has not gotten through yet.  We also have SOCK_PASSCRED set,
so we attempt autobind; it blocks on the ->bindlock, which won't be
released until bind() is done (at which point we'll see non-NULL u->addr
and bugger off from autobind), but bind() won't succeed until /mnt
goes through the freeze-thaw cycle, which won't happen until (2) finishes,
which won't happen until (1) unlocks the pipe.  Deadlock.
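
In userspace terms it is roughly the sketch below (untested; /mnt is assumed
to be a separate, freezable filesystem, FIFREEZE needs CAP_SYS_ADMIN, and
actually wedging depends on child 1 winning the narrow race between taking
the pipe lock and reaching unix_autobind()):

#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/fs.h>		/* FIFREEZE */
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
	int sv[2], p[2], one = 1;
	struct sockaddr_un sun = { .sun_family = AF_UNIX };

	/* connected SOCK_SEQPACKET pair with SOCK_PASSCRED, as in the traces */
	socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv);
	setsockopt(sv[0], SOL_SOCKET, SO_PASSCRED, &one, sizeof(one));
	pipe(p);
	write(p[1], "xx", 2);

	if (fork() == 0) {	/* child 1: pipe -> socket; takes pipe->mutex,
				 * then wants ->bindlock in unix_autobind() */
		splice(p[0], NULL, sv[0], NULL, 1, 0);
		_exit(0);
	}
	if (fork() == 0) {	/* child 2: pipe -> file on /mnt; takes
				 * sb_writers, then waits for the pipe lock */
		int fd = open("/mnt/foo", O_WRONLY | O_CREAT, 0600);
		splice(p[0], NULL, fd, NULL, 1, 0);
		_exit(0);
	}
	if (fork() == 0) {	/* child 3: freeze /mnt; waits for child 2 */
		int fd = open("/mnt", O_RDONLY);
		ioctl(fd, FIFREEZE, 0);
		_exit(0);
	}
	/* child 4 (here the parent): bind() grabs ->bindlock, then blocks
	 * in unix_mknod() because the freeze is already in flight */
	strcpy(sun.sun_path, "/mnt/barf");
	bind(sv[0], (struct sockaddr *)&sun, sizeof(sun));
	return 0;
}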

Granted, ->bindlock is taken interruptibly, so it's not that much of
a problem (you can kill the damn thing), but you would need to intervene
and kill it.

Why do we do autobind there, anyway, and why is it conditional on
SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
to sending stuff without autobind ever done - just use socketpair()
to create that sucker and we won't be going through the connect()
at all.
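
For the record, something like this (untested) moves data over a SOCK_STREAM
pair with SO_PASSCRED set without bind(), connect() or autobind ever being
involved:

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	int sv[2], one = 1;
	char buf[4];

	/* the pair is born connected; no address is ever assigned */
	socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
	setsockopt(sv[0], SOL_SOCKET, SO_PASSCRED, &one, sizeof(one));
	write(sv[0], "hi", 2);
	read(sv[1], buf, sizeof(buf));
	printf("got %.2s from an unbound, never-autobound peer\n", buf);
	return 0;
}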


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-09  1:32     ` Al Viro
@ 2016-12-09  6:32       ` Cong Wang
  2016-12-09  6:41         ` Al Viro
  0 siblings, 1 reply; 22+ messages in thread
From: Cong Wang @ 2016-12-09  6:32 UTC (permalink / raw)
  To: Al Viro
  Cc: Dmitry Vyukov, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Thu, Dec 8, 2016 at 5:32 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Dec 08, 2016 at 04:08:27PM -0800, Cong Wang wrote:
>> On Thu, Dec 8, 2016 at 8:30 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > Chain exists of:
>> >  Possible unsafe locking scenario:
>> >
>> >        CPU0                    CPU1
>> >        ----                    ----
>> >   lock(sb_writers#5);
>> >                                lock(&u->bindlock);
>> >                                lock(sb_writers#5);
>> >   lock(&pipe->mutex/1);
>>
>> This looks like a false positive; it probably just needs lockdep_set_class()
>> to set keys for pipe->mutex and unix->bindlock.
>
> I'm afraid that it's not a false positive at all.

Right, I was totally misled by the scenario output of lockdep; the stack
traces actually are much more reasonable.

The deadlock scenario is actually easy; compared with the netlink one,
which has 4 locks involved, it is:

unix_bind() path:
u->bindlock ==> sb_writers

do_splice() path:
sb_writers ==> pipe->mutex ==> u->bindlock

 *** DEADLOCK ***

>
> Why do we do autobind there, anyway, and why is it conditional on
> SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
> to sending stuff without autobind ever done - just use socketpair()
> to create that sucker and we won't be going through the connect()
> at all.

In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
not SOCK_STREAM.

I guess some lock, perhaps the u->bindlock could be dropped before
acquiring the next one (sb_writer), but I need to double check.


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-09  6:32       ` Cong Wang
@ 2016-12-09  6:41         ` Al Viro
  2017-01-16  9:32           ` Dmitry Vyukov
  2017-01-17  8:07           ` Eric W. Biederman
  0 siblings, 2 replies; 22+ messages in thread
From: Al Viro @ 2016-12-09  6:41 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:

> > Why do we do autobind there, anyway, and why is it conditional on
> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
> > to sending stuff without autobind ever done - just use socketpair()
> > to create that sucker and we won't be going through the connect()
> > at all.
> 
> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
> not SOCK_STREAM.

Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
on sendmsg and why doesn't the same need affect the SOCK_STREAM case?

> I guess some lock, perhaps the u->bindlock could be dropped before
> acquiring the next one (sb_writer), but I need to double check.

Bad idea, IMO - do you *want* autobind being able to come through while
bind(2) is busy with mknod?


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-09  6:41         ` Al Viro
@ 2017-01-16  9:32           ` Dmitry Vyukov
  2017-01-17 21:21             ` Cong Wang
  2017-01-17  8:07           ` Eric W. Biederman
  1 sibling, 1 reply; 22+ messages in thread
From: Dmitry Vyukov @ 2017-01-16  9:32 UTC (permalink / raw)
  To: Al Viro
  Cc: Cong Wang, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
>
>> > Why do we do autobind there, anyway, and why is it conditional on
>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>> > to sending stuff without autobind ever done - just use socketpair()
>> > to create that sucker and we won't be going through the connect()
>> > at all.
>>
>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>> not SOCK_STREAM.
>
> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>
>> I guess some lock, perhaps the u->bindlock could be dropped before
>> acquiring the next one (sb_writer), but I need to double check.
>
> Bad idea, IMO - do you *want* autobind being able to come through while
> bind(2) is busy with mknod?


Ping. This is still happening on HEAD.


[ INFO: possible circular locking dependency detected ]
4.9.0 #1 Not tainted
-------------------------------------------------------
syz-executor6/25491 is trying to acquire lock:
 (&u->bindlock){+.+.+.}, at: [<ffffffff83962315>]
unix_autobind.isra.28+0xc5/0x880 net/unix/af_unix.c:852
but task is already holding lock:
 (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a45ac6>] pipe_lock_nested
fs/pipe.c:66 [inline]
 (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a45ac6>]
pipe_lock+0x56/0x70 fs/pipe.c:74
which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

       [  836.500536] [<ffffffff8156f989>] validate_chain
kernel/locking/lockdep.c:2265 [inline]
       [  836.500536] [<ffffffff8156f989>]
__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
       [  836.508456] [<ffffffff81571b11>] lock_acquire+0x2a1/0x630
kernel/locking/lockdep.c:3753
       [  836.516117] [<ffffffff8435f9be>] __mutex_lock_common
kernel/locking/mutex.c:521 [inline]
       [  836.516117] [<ffffffff8435f9be>]
mutex_lock_nested+0x24e/0xff0 kernel/locking/mutex.c:621
       [  836.524139] [<ffffffff81a45ac6>] pipe_lock_nested
fs/pipe.c:66 [inline]
       [  836.524139] [<ffffffff81a45ac6>] pipe_lock+0x56/0x70 fs/pipe.c:74
       [  836.531287] [<ffffffff81af63d2>]
iter_file_splice_write+0x262/0xf80 fs/splice.c:717
       [  836.539720] [<ffffffff81af84e0>] do_splice_from
fs/splice.c:869 [inline]
       [  836.539720] [<ffffffff81af84e0>] do_splice fs/splice.c:1160 [inline]
       [  836.539720] [<ffffffff81af84e0>] SYSC_splice fs/splice.c:1410 [inline]
       [  836.539720] [<ffffffff81af84e0>] SyS_splice+0x7c0/0x1690
fs/splice.c:1393
       [  836.547273] [<ffffffff84370981>] entry_SYSCALL_64_fastpath+0x1f/0xc2

       [  836.560730] [<ffffffff8156f989>] validate_chain
kernel/locking/lockdep.c:2265 [inline]
       [  836.560730] [<ffffffff8156f989>]
__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
       [  836.568655] [<ffffffff81571b11>] lock_acquire+0x2a1/0x630
kernel/locking/lockdep.c:3753
       [  836.576230] [<ffffffff81a326ca>]
percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:35
[inline]
       [  836.576230] [<ffffffff81a326ca>] percpu_down_read
include/linux/percpu-rwsem.h:58 [inline]
       [  836.576230] [<ffffffff81a326ca>]
__sb_start_write+0x19a/0x2b0 fs/super.c:1252
       [  836.584168] [<ffffffff81ab1edf>] sb_start_write
include/linux/fs.h:1554 [inline]
       [  836.584168] [<ffffffff81ab1edf>] mnt_want_write+0x3f/0xb0
fs/namespace.c:389
       [  836.591744] [<ffffffff81a67581>] filename_create+0x151/0x610
fs/namei.c:3598
       [  836.599574] [<ffffffff81a67a73>] kern_path_create+0x33/0x40
fs/namei.c:3644
       [  836.607328] [<ffffffff83966683>] unix_mknod
net/unix/af_unix.c:967 [inline]
       [  836.607328] [<ffffffff83966683>] unix_bind+0x4c3/0xe00
net/unix/af_unix.c:1035
       [  836.614634] [<ffffffff834f047e>] SYSC_bind+0x20e/0x4a0
net/socket.c:1382
       [  836.621950] [<ffffffff834f3d84>] SyS_bind+0x24/0x30 net/socket.c:1368
       [  836.629015] [<ffffffff84370981>] entry_SYSCALL_64_fastpath+0x1f/0xc2

       [  836.642405] [<ffffffff815694cd>] check_prev_add
kernel/locking/lockdep.c:1828 [inline]
       [  836.642405] [<ffffffff815694cd>]
check_prevs_add+0xa8d/0x1c00 kernel/locking/lockdep.c:1938
       [  836.650348] [<ffffffff8156f989>] validate_chain
kernel/locking/lockdep.c:2265 [inline]
       [  836.650348] [<ffffffff8156f989>]
__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
       [  836.658315] [<ffffffff81571b11>] lock_acquire+0x2a1/0x630
kernel/locking/lockdep.c:3753
       [  836.665928] [<ffffffff84361ce1>] __mutex_lock_common
kernel/locking/mutex.c:521 [inline]
       [  836.665928] [<ffffffff84361ce1>]
mutex_lock_interruptible_nested+0x2e1/0x12a0
kernel/locking/mutex.c:650
       [  836.675287] [<ffffffff83962315>]
unix_autobind.isra.28+0xc5/0x880 net/unix/af_unix.c:852
       [  836.683571] [<ffffffff8396cdfc>]
unix_dgram_sendmsg+0x104c/0x1720 net/unix/af_unix.c:1667
       [  836.691870] [<ffffffff8396d5c3>]
unix_seqpacket_sendmsg+0xf3/0x160 net/unix/af_unix.c:2071
       [  836.700261] [<ffffffff834efaaa>] sock_sendmsg_nosec
net/socket.c:621 [inline]
       [  836.700261] [<ffffffff834efaaa>] sock_sendmsg+0xca/0x110
net/socket.c:631
       [  836.707758] [<ffffffff834f0137>] kernel_sendmsg+0x47/0x60
net/socket.c:639
       [  836.715327] [<ffffffff834faca6>]
sock_no_sendpage+0x216/0x300 net/core/sock.c:2321
       [  836.723278] [<ffffffff834ee5e0>] kernel_sendpage+0x90/0xe0
net/socket.c:3289
       [  836.730944] [<ffffffff834ee6bc>] sock_sendpage+0x8c/0xc0
net/socket.c:775
       [  836.738421] [<ffffffff81af011d>]
pipe_to_sendpage+0x29d/0x3e0 fs/splice.c:469
       [  836.746374] [<ffffffff81af4168>] splice_from_pipe_feed
fs/splice.c:520 [inline]
       [  836.746374] [<ffffffff81af4168>]
__splice_from_pipe+0x328/0x760 fs/splice.c:644
       [  836.754487] [<ffffffff81af77a7>]
splice_from_pipe+0x1d7/0x2f0 fs/splice.c:679
       [  836.762451] [<ffffffff81af7900>]
generic_splice_sendpage+0x40/0x50 fs/splice.c:850
       [  836.770826] [<ffffffff81af84e0>] do_splice_from
fs/splice.c:869 [inline]
       [  836.770826] [<ffffffff81af84e0>] do_splice fs/splice.c:1160 [inline]
       [  836.770826] [<ffffffff81af84e0>] SYSC_splice fs/splice.c:1410 [inline]
       [  836.770826] [<ffffffff81af84e0>] SyS_splice+0x7c0/0x1690
fs/splice.c:1393
       [  836.778307] [<ffffffff84370981>] entry_SYSCALL_64_fastpath+0x1f/0xc2

other info that might help us debug this:

Chain exists of:
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&pipe->mutex/1);
                               lock(sb_writers#5);
                               lock(&pipe->mutex/1);
  lock(&u->bindlock);

 *** DEADLOCK ***

1 lock held by syz-executor6/25491:
 #0:  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a45ac6>]
pipe_lock_nested fs/pipe.c:66 [inline]
 #0:  (&pipe->mutex/1){+.+.+.}, at: [<ffffffff81a45ac6>]
pipe_lock+0x56/0x70 fs/pipe.c:74

stack backtrace:
CPU: 0 PID: 25491 Comm: syz-executor6 Not tainted 4.9.0 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
 ffff8801cacc6248 ffffffff8234654f ffffffff00000000 1ffff10039598bdc
 ffffed0039598bd4 0000000041b58ab3 ffffffff84b37a60 ffffffff82346261
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff8234654f>] __dump_stack lib/dump_stack.c:15 [inline]
 [<ffffffff8234654f>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 [<ffffffff81567147>] print_circular_bug+0x307/0x3b0
kernel/locking/lockdep.c:1202
 [<ffffffff815694cd>] check_prev_add kernel/locking/lockdep.c:1828 [inline]
 [<ffffffff815694cd>] check_prevs_add+0xa8d/0x1c00 kernel/locking/lockdep.c:1938
 [<ffffffff8156f989>] validate_chain kernel/locking/lockdep.c:2265 [inline]
 [<ffffffff8156f989>] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
 [<ffffffff81571b11>] lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
 [<ffffffff84361ce1>] __mutex_lock_common kernel/locking/mutex.c:521 [inline]
 [<ffffffff84361ce1>] mutex_lock_interruptible_nested+0x2e1/0x12a0
kernel/locking/mutex.c:650
 [<ffffffff83962315>] unix_autobind.isra.28+0xc5/0x880 net/unix/af_unix.c:852
 [<ffffffff8396cdfc>] unix_dgram_sendmsg+0x104c/0x1720 net/unix/af_unix.c:1667
 [<ffffffff8396d5c3>] unix_seqpacket_sendmsg+0xf3/0x160 net/unix/af_unix.c:2071
 [<ffffffff834efaaa>] sock_sendmsg_nosec net/socket.c:621 [inline]
 [<ffffffff834efaaa>] sock_sendmsg+0xca/0x110 net/socket.c:631
 [<ffffffff834f0137>] kernel_sendmsg+0x47/0x60 net/socket.c:639
 [<ffffffff834faca6>] sock_no_sendpage+0x216/0x300 net/core/sock.c:2321
 [<ffffffff834ee5e0>] kernel_sendpage+0x90/0xe0 net/socket.c:3289
 [<ffffffff834ee6bc>] sock_sendpage+0x8c/0xc0 net/socket.c:775
 [<ffffffff81af011d>] pipe_to_sendpage+0x29d/0x3e0 fs/splice.c:469
 [<ffffffff81af4168>] splice_from_pipe_feed fs/splice.c:520 [inline]
 [<ffffffff81af4168>] __splice_from_pipe+0x328/0x760 fs/splice.c:644
 [<ffffffff81af77a7>] splice_from_pipe+0x1d7/0x2f0 fs/splice.c:679
 [<ffffffff81af7900>] generic_splice_sendpage+0x40/0x50 fs/splice.c:850
 [<ffffffff81af84e0>] do_splice_from fs/splice.c:869 [inline]
 [<ffffffff81af84e0>] do_splice fs/splice.c:1160 [inline]
 [<ffffffff81af84e0>] SYSC_splice fs/splice.c:1410 [inline]
 [<ffffffff81af84e0>] SyS_splice+0x7c0/0x1690 fs/splice.c:1393
 [<ffffffff84370981>] entry_SYSCALL_64_fastpath+0x1f/0xc2
QAT: Invalid ioctl
QAT: Invalid ioctl
QAT: Invalid ioctl
QAT: Invalid ioctl
FAULT_FLAG_ALLOW_RETRY missing 30
CPU: 1 PID: 25716 Comm: syz-executor3 Not tainted 4.9.0 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
 ffff8801b6a274a8 ffffffff8234654f ffffffff00000001 1ffff10036d44e28
 ffffed0036d44e20 0000000041b58ab3 ffffffff84b37a60 ffffffff82346261
 0000000000000000 ffff8801dc122980 ffff8801a36c2800 1ffff10036d44e2a
Call Trace:
 [<ffffffff8234654f>] __dump_stack lib/dump_stack.c:15 [inline]
 [<ffffffff8234654f>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 [<ffffffff81b6325d>] handle_userfault+0x115d/0x1fc0 fs/userfaultfd.c:381
 [<ffffffff8192f792>] do_anonymous_page mm/memory.c:2800 [inline]
 [<ffffffff8192f792>] handle_pte_fault mm/memory.c:3560 [inline]
 [<ffffffff8192f792>] __handle_mm_fault mm/memory.c:3652 [inline]
 [<ffffffff8192f792>] handle_mm_fault+0x24f2/0x2890 mm/memory.c:3689
 [<ffffffff81323df6>] __do_page_fault+0x4f6/0xb60 arch/x86/mm/fault.c:1397
 [<ffffffff813244b4>] do_page_fault+0x54/0x70 arch/x86/mm/fault.c:1460
 [<ffffffff84371d38>] page_fault+0x28/0x30 arch/x86/entry/entry_64.S:1012
 [<ffffffff81a65dfe>] getname_flags+0x10e/0x580 fs/namei.c:148
 [<ffffffff81a66f1d>] user_path_at_empty+0x2d/0x50 fs/namei.c:2556
 [<ffffffff81a385e1>] user_path_at include/linux/namei.h:55 [inline]
 [<ffffffff81a385e1>] vfs_fstatat+0xf1/0x1a0 fs/stat.c:106
 [<ffffffff81a3a12b>] vfs_lstat fs/stat.c:129 [inline]
 [<ffffffff81a3a12b>] SYSC_newlstat+0xab/0x140 fs/stat.c:283
 [<ffffffff81a3a51d>] SyS_newlstat+0x1d/0x30 fs/stat.c:277
 [<ffffffff84370981>] entry_SYSCALL_64_fastpath+0x1f/0xc2
FAULT_FLAG_ALLOW_RETRY missing 30
QAT: Invalid ioctl
QAT: Invalid ioctl
QAT: Invalid ioctl
CPU: 1 PID: 25716 Comm: syz-executor3 Not tainted 4.9.0 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
 ffff8801b6a27360 ffffffff8234654f ffffffff00000001 1ffff10036d44dff
 ffffed0036d44df7 0000000041b58ab3 ffffffff84b37a60 ffffffff82346261
 0000000000000082 ffff8801dc122980 ffff8801da622540 1ffff10036d44e01
Call Trace:
 [<ffffffff8234654f>] __dump_stack lib/dump_stack.c:15 [inline]
 [<ffffffff8234654f>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 [<ffffffff81b6325d>] handle_userfault+0x115d/0x1fc0 fs/userfaultfd.c:381
 [<ffffffff8192f792>] do_anonymous_page mm/memory.c:2800 [inline]
 [<ffffffff8192f792>] handle_pte_fault mm/memory.c:3560 [inline]
 [<ffffffff8192f792>] __handle_mm_fault mm/memory.c:3652 [inline]
 [<ffffffff8192f792>] handle_mm_fault+0x24f2/0x2890 mm/memory.c:3689
 [<ffffffff81323df6>] __do_page_fault+0x4f6/0xb60 arch/x86/mm/fault.c:1397
 [<ffffffff81324611>] trace_do_page_fault+0x141/0x6c0 arch/x86/mm/fault.c:1490
 [<ffffffff84371d08>] trace_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:1012
 [<ffffffff81a65dfe>] getname_flags+0x10e/0x580 fs/namei.c:148
 [<ffffffff81a66f1d>] user_path_at_empty+0x2d/0x50 fs/namei.c:2556
 [<ffffffff81a385e1>] user_path_at include/linux/namei.h:55 [inline]
 [<ffffffff81a385e1>] vfs_fstatat+0xf1/0x1a0 fs/stat.c:106
 [<ffffffff81a3a12b>] vfs_lstat fs/stat.c:129 [inline]
 [<ffffffff81a3a12b>] SYSC_newlstat+0xab/0x140 fs/stat.c:283
 [<ffffffff81a3a51d>] SyS_newlstat+0x1d/0x30 fs/stat.c:277
 [<ffffffff84370981>] entry_SYSCALL_64_fastpath+0x1f/0xc2


* Re: fs, net: deadlock between bind/splice on af_unix
  2016-12-09  6:41         ` Al Viro
  2017-01-16  9:32           ` Dmitry Vyukov
@ 2017-01-17  8:07           ` Eric W. Biederman
  1 sibling, 0 replies; 22+ messages in thread
From: Eric W. Biederman @ 2017-01-17  8:07 UTC (permalink / raw)
  To: Al Viro
  Cc: Cong Wang, Dmitry Vyukov, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

Al Viro <viro@ZenIV.linux.org.uk> writes:

> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
>
>> > Why do we do autobind there, anyway, and why is it conditional on
>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>> > to sending stuff without autobind ever done - just use socketpair()
>> > to create that sucker and we won't be going through the connect()
>> > at all.
>> 
>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>> not SOCK_STREAM.
>
> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?

With respect to the conditionality on SOCK_PASSCRED, those are the Linux
semantics.  Semantically that is the way the code has behaved since
2.1.15, when support for passing credentials was added to the code.
So I presume someone thought it was a good idea to have a name for
a socket that is sending credentials to another socket.  It certainly
seems reasonable at first glance.

With socketpair() being the only path that doesn't enforce this for
SOCK_STREAM with SOCK_PASSCRED, that is either an oversight or a
don't-care, because we already know who is at the other end.

I can imagine two possible fixes:
1) Declare that splice is nonsense in the presence of SOCK_PASSCRED.
2) Someone adds a preparation operation that can be called on
   af_unix sockets that will ensure the autobind happens before
   any problematic locks are taken.

Eric


* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-16  9:32           ` Dmitry Vyukov
@ 2017-01-17 21:21             ` Cong Wang
  2017-01-18  9:17               ` Dmitry Vyukov
  2017-01-26 23:29               ` Mateusz Guzik
  0 siblings, 2 replies; 22+ messages in thread
From: Cong Wang @ 2017-01-17 21:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

[-- Attachment #1: Type: text/plain, Size: 1326 bytes --]

On Mon, Jan 16, 2017 at 1:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
>>
>>> > Why do we do autobind there, anyway, and why is it conditional on
>>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>>> > to sending stuff without autobind ever done - just use socketpair()
>>> > to create that sucker and we won't be going through the connect()
>>> > at all.
>>>
>>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>>> not SOCK_STREAM.
>>
>> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
>> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>>
>>> I guess some lock, perhaps the u->bindlock could be dropped before
>>> acquiring the next one (sb_writer), but I need to double check.
>>
>> Bad idea, IMO - do you *want* autobind being able to come through while
>> bind(2) is busy with mknod?
>
>
> Ping. This is still happening on HEAD.
>

Thanks for your reminder. Would you mind giving the attached patch
(compile-tested only) a try? It takes a different approach to fixing this
deadlock: it moves unix_mknod() out of unix->bindlock. I am not sure
whether there is any unexpected impact from doing it this way.

Thanks.

[-- Attachment #2: unix.diff --]
[-- Type: text/plain, Size: 1678 bytes --]

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 127656e..5d4b4d1 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -995,6 +995,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	unsigned int hash;
 	struct unix_address *addr;
 	struct hlist_head *list;
+	struct path path;
 
 	err = -EINVAL;
 	if (sunaddr->sun_family != AF_UNIX)
@@ -1010,9 +1011,20 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 		goto out;
 	addr_len = err;
 
+	if (sun_path[0]) {
+		umode_t mode = S_IFSOCK |
+		       (SOCK_INODE(sock)->i_mode & ~current_umask());
+		err = unix_mknod(sun_path, mode, &path);
+		if (err) {
+			if (err == -EEXIST)
+				err = -EADDRINUSE;
+			goto out;
+		}
+	}
+
 	err = mutex_lock_interruptible(&u->bindlock);
 	if (err)
-		goto out;
+		goto out_put;
 
 	err = -EINVAL;
 	if (u->addr)
@@ -1029,16 +1041,6 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	atomic_set(&addr->refcnt, 1);
 
 	if (sun_path[0]) {
-		struct path path;
-		umode_t mode = S_IFSOCK |
-		       (SOCK_INODE(sock)->i_mode & ~current_umask());
-		err = unix_mknod(sun_path, mode, &path);
-		if (err) {
-			if (err == -EEXIST)
-				err = -EADDRINUSE;
-			unix_release_addr(addr);
-			goto out_up;
-		}
 		addr->hash = UNIX_HASH_SIZE;
 		hash = d_backing_inode(path.dentry)->i_ino & (UNIX_HASH_SIZE - 1);
 		spin_lock(&unix_table_lock);
@@ -1065,6 +1067,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	spin_unlock(&unix_table_lock);
 out_up:
 	mutex_unlock(&u->bindlock);
+out_put:
+	if (err)
+		path_put(&path);
 out:
 	return err;
 }


* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-17 21:21             ` Cong Wang
@ 2017-01-18  9:17               ` Dmitry Vyukov
  2017-01-20  4:57                 ` Cong Wang
  2017-01-26 23:29               ` Mateusz Guzik
  1 sibling, 1 reply; 22+ messages in thread
From: Dmitry Vyukov @ 2017-01-18  9:17 UTC (permalink / raw)
  To: Cong Wang
  Cc: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

On Tue, Jan 17, 2017 at 10:21 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Mon, Jan 16, 2017 at 1:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
>>>
>>>> > Why do we do autobind there, anyway, and why is it conditional on
>>>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>>>> > to sending stuff without autobind ever done - just use socketpair()
>>>> > to create that sucker and we won't be going through the connect()
>>>> > at all.
>>>>
>>>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>>>> not SOCK_STREAM.
>>>
>>> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
>>> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>>>
>>>> I guess some lock, perhaps the u->bindlock could be dropped before
>>>> acquiring the next one (sb_writer), but I need to double check.
>>>
>>> Bad idea, IMO - do you *want* autobind being able to come through while
>>> bind(2) is busy with mknod?
>>
>>
>> Ping. This is still happening on HEAD.
>>
>
> Thanks for your reminder. Would you mind giving the attached patch
> (compile-tested only) a try? It takes a different approach to fixing this
> deadlock: it moves unix_mknod() out of unix->bindlock. I am not sure
> whether there is any unexpected impact from doing it this way.


I instantly hit:

general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 8930 Comm: syz-executor1 Not tainted 4.10.0-rc4+ #177
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88003c908840 task.stack: ffff88003a9a0000
RIP: 0010:__lock_acquire+0xb3a/0x3430 kernel/locking/lockdep.c:3224
RSP: 0018:ffff88003a9a7218 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000003 RSI: 0000000000000000 RDI: 1ffff10007534e9d
RBP: ffff88003a9a7750 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000018 R11: 0000000000000000 R12: ffff88003c908840
R13: 0000000000000001 R14: ffffffff863504a0 R15: 0000000000000001
FS:  00007f4f8eb5d700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020b1d000 CR3: 000000003bde9000 CR4: 00000000000006f0
Call Trace:
 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
 __raw_spin_lock include/linux/spinlock_api_smp.h:144 [inline]
 _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:302 [inline]
 list_lru_add+0x10b/0x340 mm/list_lru.c:115
 d_lru_add fs/dcache.c:366 [inline]
 dentry_lru_add fs/dcache.c:421 [inline]
 dput.part.27+0x659/0x7c0 fs/dcache.c:784
 dput+0x1f/0x30 fs/dcache.c:753
 path_put+0x31/0x70 fs/namei.c:500
 unix_bind+0x424/0xea0 net/unix/af_unix.c:1072
 SYSC_bind+0x20e/0x4a0 net/socket.c:1413
 SyS_bind+0x24/0x30 net/socket.c:1399
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4454b9
RSP: 002b:00007f4f8eb5cb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000031
RAX: ffffffffffffffda RBX: 000000000000001d RCX: 00000000004454b9
RDX: 0000000000000008 RSI: 000000002002cff8 RDI: 000000000000001d
RBP: 00000000006dd230 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
R13: 00007f4f8f2de9d0 R14: 00007f4f8f2dfc40 R15: 0000000000000000
Code: e9 03 f3 48 ab 48 81 c4 10 05 00 00 44 89 e8 5b 41 5c 41 5d 41
5e 41 5f 5d c3 4c 89 d2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80>
3c 02 00 0f 85 9e 26 00 00 49 81 3a e0 be 6b 85 41 bf 00 00
RIP: __lock_acquire+0xb3a/0x3430 kernel/locking/lockdep.c:3224 RSP:
ffff88003a9a7218
---[ end trace 78951d69744a2fe1 ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled


and:


BUG: KASAN: use-after-free in list_lru_add+0x2fd/0x340
mm/list_lru.c:112 at addr ffff88006b301340
Read of size 8 by task syz-executor0/7116
CPU: 2 PID: 7116 Comm: syz-executor0 Not tainted 4.10.0-rc4+ #177
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:165
 print_address_description mm/kasan/report.c:203 [inline]
 kasan_report_error mm/kasan/report.c:287 [inline]
 kasan_report+0x1b6/0x460 mm/kasan/report.c:307
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:333
 list_lru_add+0x2fd/0x340 mm/list_lru.c:112
 d_lru_add fs/dcache.c:366 [inline]
 dentry_lru_add fs/dcache.c:421 [inline]
 dput.part.27+0x659/0x7c0 fs/dcache.c:784
 dput+0x1f/0x30 fs/dcache.c:753
 path_put+0x31/0x70 fs/namei.c:500
 unix_bind+0x424/0xea0 net/unix/af_unix.c:1072
 SYSC_bind+0x20e/0x4a0 net/socket.c:1413
 SyS_bind+0x24/0x30 net/socket.c:1399
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4454b9
RSP: 002b:00007f1b034ebb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000031
RAX: ffffffffffffffda RBX: 0000000000000016 RCX: 00000000004454b9
RDX: 0000000000000008 RSI: 000000002002eff8 RDI: 0000000000000016
RBP: 00000000006dd230 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
R13: 00007f1b03c6d458 R14: 00007f1b03c6e5e8 R15: 0000000000000000
Object at ffff88006b301300, in cache vm_area_struct size: 192
Allocated:
PID = 1391

[<ffffffff812b2686>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57

[<ffffffff81a0e713>] save_stack+0x43/0xd0 mm/kasan/kasan.c:502

[<ffffffff81a0e9da>] set_track mm/kasan/kasan.c:514 [inline]
[<ffffffff81a0e9da>] kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:605

[<ffffffff81a0efd2>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544

[<ffffffff81a0a5e2>] kmem_cache_alloc+0x102/0x680 mm/slab.c:3563

[<ffffffff8144093b>] dup_mmap kernel/fork.c:609 [inline]
[<ffffffff8144093b>] dup_mm kernel/fork.c:1145 [inline]
[<ffffffff8144093b>] copy_mm kernel/fork.c:1199 [inline]
[<ffffffff8144093b>] copy_process.part.42+0x503b/0x5fd0 kernel/fork.c:1669

[<ffffffff81441e10>] copy_process kernel/fork.c:1494 [inline]
[<ffffffff81441e10>] _do_fork+0x200/0xff0 kernel/fork.c:1950

[<ffffffff81442cd7>] SYSC_clone kernel/fork.c:2060 [inline]
[<ffffffff81442cd7>] SyS_clone+0x37/0x50 kernel/fork.c:2054

[<ffffffff81009798>] do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:280

[<ffffffff841cadc9>] return_from_SYSCALL_64+0x0/0x7a
Freed:
PID = 5275

[<ffffffff812b2686>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57

[<ffffffff81a0e713>] save_stack+0x43/0xd0 mm/kasan/kasan.c:502

[<ffffffff81a0f04f>] set_track mm/kasan/kasan.c:514 [inline]
[<ffffffff81a0f04f>] kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:578

[<ffffffff81a0c3b1>] __cache_free mm/slab.c:3505 [inline]
[<ffffffff81a0c3b1>] kmem_cache_free+0x71/0x240 mm/slab.c:3765

[<ffffffff81976992>] remove_vma+0x162/0x1b0 mm/mmap.c:175

[<ffffffff8197f72f>] exit_mmap+0x2ef/0x490 mm/mmap.c:2952

[<ffffffff814390bb>] __mmput kernel/fork.c:873 [inline]
[<ffffffff814390bb>] mmput+0x22b/0x6e0 kernel/fork.c:895

[<ffffffff81453a3f>] exit_mm kernel/exit.c:521 [inline]
[<ffffffff81453a3f>] do_exit+0x9cf/0x28a0 kernel/exit.c:826

[<ffffffff8145a369>] do_group_exit+0x149/0x420 kernel/exit.c:943

[<ffffffff81489630>] get_signal+0x7e0/0x1820 kernel/signal.c:2313

[<ffffffff8127ca92>] do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807

[<ffffffff81007900>] exit_to_usermode_loop+0x200/0x2a0
arch/x86/entry/common.c:156

[<ffffffff81009413>] prepare_exit_to_usermode
arch/x86/entry/common.c:190 [inline]
[<ffffffff81009413>] syscall_return_slowpath+0x4d3/0x570
arch/x86/entry/common.c:259

[<ffffffff841cada2>] entry_SYSCALL_64_fastpath+0xc0/0xc2
Memory state around the buggy address:
 ffff88006b301200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88006b301280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff88006b301300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff88006b301380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff88006b301400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-18  9:17               ` Dmitry Vyukov
@ 2017-01-20  4:57                 ` Cong Wang
  2017-01-20 22:52                   ` Dmitry Vyukov
  0 siblings, 1 reply; 22+ messages in thread
From: Cong Wang @ 2017-01-20  4:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

[-- Attachment #1: Type: text/plain, Size: 1710 bytes --]

On Wed, Jan 18, 2017 at 1:17 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Jan 17, 2017 at 10:21 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Mon, Jan 16, 2017 at 1:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>>> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
>>>>
>>>>> > Why do we do autobind there, anyway, and why is it conditional on
>>>>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>>>>> > to sending stuff without autobind ever done - just use socketpair()
>>>>> > to create that sucker and we won't be going through the connect()
>>>>> > at all.
>>>>>
>>>>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>>>>> not SOCK_STREAM.
>>>>
>>>> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
>>>> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>>>>
>>>>> I guess some lock, perhaps the u->bindlock could be dropped before
>>>>> acquiring the next one (sb_writer), but I need to double check.
>>>>
>>>> Bad idea, IMO - do you *want* autobind being able to come through while
>>>> bind(2) is busy with mknod?
>>>
>>>
>>> Ping. This is still happening on HEAD.
>>>
>>
>> Thanks for your reminder. Mind to give the attached patch (compile only)
>> a try? I take another approach to fix this deadlock, which moves the
>> unix_mknod() out of unix->bindlock. Not sure if there is any unexpected
>> impact with this way.
>
>
> I instantly hit:
>

Oh, sorry about that; I forgot to initialize the struct path...

Attached is the updated version. I only did a boot test; no crash, at
least. ;)

Thanks!

[-- Attachment #2: unix.diff --]
[-- Type: text/plain, Size: 1695 bytes --]

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 127656e..cef7987 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -995,6 +995,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	unsigned int hash;
 	struct unix_address *addr;
 	struct hlist_head *list;
+	struct path path = { NULL, NULL };
 
 	err = -EINVAL;
 	if (sunaddr->sun_family != AF_UNIX)
@@ -1010,9 +1011,20 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 		goto out;
 	addr_len = err;
 
+	if (sun_path[0]) {
+		umode_t mode = S_IFSOCK |
+		       (SOCK_INODE(sock)->i_mode & ~current_umask());
+		err = unix_mknod(sun_path, mode, &path);
+		if (err) {
+			if (err == -EEXIST)
+				err = -EADDRINUSE;
+			goto out;
+		}
+	}
+
 	err = mutex_lock_interruptible(&u->bindlock);
 	if (err)
-		goto out;
+		goto out_put;
 
 	err = -EINVAL;
 	if (u->addr)
@@ -1029,16 +1041,6 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	atomic_set(&addr->refcnt, 1);
 
 	if (sun_path[0]) {
-		struct path path;
-		umode_t mode = S_IFSOCK |
-		       (SOCK_INODE(sock)->i_mode & ~current_umask());
-		err = unix_mknod(sun_path, mode, &path);
-		if (err) {
-			if (err == -EEXIST)
-				err = -EADDRINUSE;
-			unix_release_addr(addr);
-			goto out_up;
-		}
 		addr->hash = UNIX_HASH_SIZE;
 		hash = d_backing_inode(path.dentry)->i_ino & (UNIX_HASH_SIZE - 1);
 		spin_lock(&unix_table_lock);
@@ -1065,6 +1067,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	spin_unlock(&unix_table_lock);
 out_up:
 	mutex_unlock(&u->bindlock);
+out_put:
+	if (err)
+		path_put(&path);
 out:
 	return err;
 }
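
As a reading aid only (not part of the patch itself), the resulting
order of operations in unix_bind() after the diff above is roughly the
following; this is a simplified outline, not the literal code:

	/* simplified outline of unix_bind() with the patch applied */
	unix_mknod(sun_path, mode, &path);      /* takes sb_writers; bindlock not held */
	mutex_lock_interruptible(&u->bindlock); /* no filesystem locks held here */
	/* ... publish u->addr, hash the socket via path.dentry ... */
	mutex_unlock(&u->bindlock);
	path_put(&path);                        /* only on the error path */

so u->bindlock is no longer held while sb_writers is taken, which is
the lock-order edge lockdep complained about.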

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-20  4:57                 ` Cong Wang
@ 2017-01-20 22:52                   ` Dmitry Vyukov
  2017-01-23 19:00                     ` Cong Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry Vyukov @ 2017-01-20 22:52 UTC (permalink / raw)
  To: Cong Wang
  Cc: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

On Fri, Jan 20, 2017 at 5:57 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>>> > Why do we do autobind there, anyway, and why is it conditional on
>>>>>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>>>>>> > to sending stuff without autobind ever done - just use socketpair()
>>>>>> > to create that sucker and we won't be going through the connect()
>>>>>> > at all.
>>>>>>
>>>>>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>>>>>> not SOCK_STREAM.
>>>>>
>>>>> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
>>>>> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>>>>>
>>>>>> I guess some lock, perhaps the u->bindlock could be dropped before
>>>>>> acquiring the next one (sb_writer), but I need to double check.
>>>>>
>>>>> Bad idea, IMO - do you *want* autobind being able to come through while
>>>>> bind(2) is busy with mknod?
>>>>
>>>>
>>>> Ping. This is still happening on HEAD.
>>>>
>>>
>>> Thanks for your reminder. Mind to give the attached patch (compile only)
>>> a try? I take another approach to fix this deadlock, which moves the
>>> unix_mknod() out of unix->bindlock. Not sure if there is any unexpected
>>> impact with this way.
>>
>>
>> I instantly hit:
>>
>
> Oh, sorry about it, I forgot to initialize struct path...
>
> Attached is the updated version, I just did a boot test, no crash at least. ;)
>
> Thanks!

This works! I did not see the deadlock warning, nor any other related crashes.

Tested-by: Dmitry Vyukov <dvyukov@google.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-20 22:52                   ` Dmitry Vyukov
@ 2017-01-23 19:00                     ` Cong Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Cong Wang @ 2017-01-23 19:00 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Al Viro, linux-fsdevel, LKML, David Miller, Rainer Weikusat,
	Hannes Frederic Sowa, netdev, Eric Dumazet, syzkaller

On Fri, Jan 20, 2017 at 2:52 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> This works! I did not see the deadlock warning, nor any other related crashes.
>
> Tested-by: Dmitry Vyukov <dvyukov@google.com>

Thanks for verifying it. I will send it out formally soon.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-17 21:21             ` Cong Wang
  2017-01-18  9:17               ` Dmitry Vyukov
@ 2017-01-26 23:29               ` Mateusz Guzik
  2017-01-27  5:11                 ` Cong Wang
  1 sibling, 1 reply; 22+ messages in thread
From: Mateusz Guzik @ 2017-01-26 23:29 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Tue, Jan 17, 2017 at 01:21:48PM -0800, Cong Wang wrote:
> On Mon, Jan 16, 2017 at 1:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
> >>
> >>> > Why do we do autobind there, anyway, and why is it conditional on
> >>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
> >>> > to sending stuff without autobind ever done - just use socketpair()
> >>> > to create that sucker and we won't be going through the connect()
> >>> > at all.
> >>>
> >>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
> >>> not SOCK_STREAM.
> >>
> >> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
> >> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
> >>
> >>> I guess some lock, perhaps the u->bindlock could be dropped before
> >>> acquiring the next one (sb_writer), but I need to double check.
> >>
> >> Bad idea, IMO - do you *want* autobind being able to come through while
> >> bind(2) is busy with mknod?
> >
> >
> > Ping. This is still happening on HEAD.
> >
> 
> Thanks for your reminder. Mind to give the attached patch (compile only)
> a try? I take another approach to fix this deadlock, which moves the
> unix_mknod() out of unix->bindlock. Not sure if there is any unexpected
> impact with this way.
> 

I don't think this is the right approach.

Currently the file creation is postponed until unix_bind can no longer
fail otherwise. With it reordered, someone may race you with a
different path, and now you are left with a file to clean up. And it is
quite unclear to me whether you can unlink it.

I don't have a good idea how to fix it. A somewhat typical approach
would introduce an intermediate state ("under construction") and drop
the lock around the call into unix_mknod.

In this particular case, perhaps you could repurpose gc_flags as a
general flags carrier and add a 'binding in progress' flag to test.
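
For illustration only, a rough kernel-style sketch of that idea;
nothing like this exists in af_unix today, and both the UNIX_BINDING
bit and the helper name are made up for the example:

	/* hypothetical bit carved out of unix_sk(sk)->gc_flags,
	 * chosen not to collide with the existing GC bits */
	#define UNIX_BINDING	2

	static int unix_bind_path_sketch(struct sock *sk,
					 const char *sun_path, umode_t mode)
	{
		struct unix_sock *u = unix_sk(sk);
		struct path path;
		int err;

		err = mutex_lock_interruptible(&u->bindlock);
		if (err)
			return err;
		/* refuse if already bound or another bind is in flight */
		if (u->addr || test_and_set_bit(UNIX_BINDING, &u->gc_flags)) {
			mutex_unlock(&u->bindlock);
			return -EINVAL;
		}
		mutex_unlock(&u->bindlock);

		/* filesystem work done without bindlock held, so the
		 * bindlock -> sb_writers ordering never happens */
		err = unix_mknod(sun_path, mode, &path);

		mutex_lock(&u->bindlock);
		if (!err) {
			/* publish u->addr, hash the socket using path, etc. */
		}
		clear_bit(UNIX_BINDING, &u->gc_flags);
		mutex_unlock(&u->bindlock);
		return err;
	}

Concurrent bind(2) callers and autobind would see the bit set and back
off, so at most one socket file is ever created per socket, at the cost
of returning an error instead of waiting.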

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-26 23:29               ` Mateusz Guzik
@ 2017-01-27  5:11                 ` Cong Wang
  2017-01-27  6:41                   ` Mateusz Guzik
  0 siblings, 1 reply; 22+ messages in thread
From: Cong Wang @ 2017-01-27  5:11 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Thu, Jan 26, 2017 at 3:29 PM, Mateusz Guzik <mguzik@redhat.com> wrote:
> On Tue, Jan 17, 2017 at 01:21:48PM -0800, Cong Wang wrote:
>> On Mon, Jan 16, 2017 at 1:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> >> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
>> >>
>> >>> > Why do we do autobind there, anyway, and why is it conditional on
>> >>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>> >>> > to sending stuff without autobind ever done - just use socketpair()
>> >>> > to create that sucker and we won't be going through the connect()
>> >>> > at all.
>> >>>
>> >>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>> >>> not SOCK_STREAM.
>> >>
>> >> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
>> >> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>> >>
>> >>> I guess some lock, perhaps the u->bindlock could be dropped before
>> >>> acquiring the next one (sb_writer), but I need to double check.
>> >>
>> >> Bad idea, IMO - do you *want* autobind being able to come through while
>> >> bind(2) is busy with mknod?
>> >
>> >
>> > Ping. This is still happening on HEAD.
>> >
>>
>> Thanks for your reminder. Mind to give the attached patch (compile only)
>> a try? I take another approach to fix this deadlock, which moves the
>> unix_mknod() out of unix->bindlock. Not sure if there is any unexpected
>> impact with this way.
>>
>
> I don't think this is the right approach.
>
> Currently the file creation is potponed until unix_bind can no longer
> fail otherwise. With it reordered, it may be someone races you with a
> different path and now you are left with a file to clean up. Except it
> is quite unclear for me if you can unlink it.

What races do you mean here? If you mean someone could grab a
reference to that file, that can happen whether or not we hold
bindlock, since the file is visible once created. The filesystem layer
takes care of the file refcount, so all we need to do here is call
path_put() as in my patch. Or if you mean two threads calling
unix_bind() could race without bindlock, only one of them should
succeed; the other one just fails out.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-27  5:11                 ` Cong Wang
@ 2017-01-27  6:41                   ` Mateusz Guzik
  2017-01-31  6:44                     ` Cong Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Mateusz Guzik @ 2017-01-27  6:41 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Thu, Jan 26, 2017 at 09:11:07PM -0800, Cong Wang wrote:
> On Thu, Jan 26, 2017 at 3:29 PM, Mateusz Guzik <mguzik@redhat.com> wrote:
> > On Tue, Jan 17, 2017 at 01:21:48PM -0800, Cong Wang wrote:
> >> On Mon, Jan 16, 2017 at 1:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> >> > On Fri, Dec 9, 2016 at 7:41 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >> >> On Thu, Dec 08, 2016 at 10:32:00PM -0800, Cong Wang wrote:
> >> >>
> >> >>> > Why do we do autobind there, anyway, and why is it conditional on
> >> >>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
> >> >>> > to sending stuff without autobind ever done - just use socketpair()
> >> >>> > to create that sucker and we won't be going through the connect()
> >> >>> > at all.
> >> >>>
> >> >>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
> >> >>> not SOCK_STREAM.
> >> >>
> >> >> Yes, I've noticed.  What I'm asking is what in there needs autobind triggered
> >> >> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
> >> >>
> >> >>> I guess some lock, perhaps the u->bindlock could be dropped before
> >> >>> acquiring the next one (sb_writer), but I need to double check.
> >> >>
> >> >> Bad idea, IMO - do you *want* autobind being able to come through while
> >> >> bind(2) is busy with mknod?
> >> >
> >> >
> >> > Ping. This is still happening on HEAD.
> >> >
> >>
> >> Thanks for your reminder. Mind to give the attached patch (compile only)
> >> a try? I take another approach to fix this deadlock, which moves the
> >> unix_mknod() out of unix->bindlock. Not sure if there is any unexpected
> >> impact with this way.
> >>
> >
> > I don't think this is the right approach.
> >
> > Currently the file creation is potponed until unix_bind can no longer
> > fail otherwise. With it reordered, it may be someone races you with a
> > different path and now you are left with a file to clean up. Except it
> > is quite unclear for me if you can unlink it.
> 
> What races do you mean here? If you mean someone could get a
> refcount of that file, it could happen no matter we have bindlock or not
> since it is visible once created. The filesystem layer should take care of
> the file refcount so all we need to do here is calling path_put() as in my
> patch. Or if you mean two threads calling unix_bind() could race without
> binlock, only one of them should succeed the other one just fails out.

Two threads can race and one fails with EINVAL.

With your patch a new file is created, and it is unclear what to do
with it: leaving it in place sounds like a last resort, and unlinking
it sounds extremely fishy, as it opens you up to games played by the
user.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-27  6:41                   ` Mateusz Guzik
@ 2017-01-31  6:44                     ` Cong Wang
  2017-01-31 18:14                       ` Mateusz Guzik
  0 siblings, 1 reply; 22+ messages in thread
From: Cong Wang @ 2017-01-31  6:44 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Thu, Jan 26, 2017 at 10:41 PM, Mateusz Guzik <mguzik@redhat.com> wrote:
> On Thu, Jan 26, 2017 at 09:11:07PM -0800, Cong Wang wrote:
>> On Thu, Jan 26, 2017 at 3:29 PM, Mateusz Guzik <mguzik@redhat.com> wrote:
>> > Currently the file creation is potponed until unix_bind can no longer
>> > fail otherwise. With it reordered, it may be someone races you with a
>> > different path and now you are left with a file to clean up. Except it
>> > is quite unclear for me if you can unlink it.
>>
>> What races do you mean here? If you mean someone could get a
>> refcount of that file, it could happen no matter we have bindlock or not
>> since it is visible once created. The filesystem layer should take care of
>> the file refcount so all we need to do here is calling path_put() as in my
>> patch. Or if you mean two threads calling unix_bind() could race without
>> binlock, only one of them should succeed the other one just fails out.
>
> Two threads can race and one fails with EINVAL.
>
> With your patch there is a new file created and it is unclear what to
> do with it - leaving it as it is sounds like the last resort and
> unlinking it sounds extremely fishy as it opens you to games played by
> the user.

But the file is created and visible to users even without my patch,
and the file is also put when the unix socket is released. So the only
difference my patch makes is that bindlock is no longer held during
file creation, which does not seem to be the cause of the problem you
are complaining about here.

Mind being more specific?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-31  6:44                     ` Cong Wang
@ 2017-01-31 18:14                       ` Mateusz Guzik
  2017-02-06  7:22                         ` Cong Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Mateusz Guzik @ 2017-01-31 18:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Mon, Jan 30, 2017 at 10:44:03PM -0800, Cong Wang wrote:
> On Thu, Jan 26, 2017 at 10:41 PM, Mateusz Guzik <mguzik@redhat.com> wrote:
> > On Thu, Jan 26, 2017 at 09:11:07PM -0800, Cong Wang wrote:
> >> On Thu, Jan 26, 2017 at 3:29 PM, Mateusz Guzik <mguzik@redhat.com> wrote:
> >> > Currently the file creation is potponed until unix_bind can no longer
> >> > fail otherwise. With it reordered, it may be someone races you with a
> >> > different path and now you are left with a file to clean up. Except it
> >> > is quite unclear for me if you can unlink it.
> >>
> >> What races do you mean here? If you mean someone could get a
> >> refcount of that file, it could happen no matter we have bindlock or not
> >> since it is visible once created. The filesystem layer should take care of
> >> the file refcount so all we need to do here is calling path_put() as in my
> >> patch. Or if you mean two threads calling unix_bind() could race without
> >> binlock, only one of them should succeed the other one just fails out.
> >
> > Two threads can race and one fails with EINVAL.
> >
> > With your patch there is a new file created and it is unclear what to
> > do with it - leaving it as it is sounds like the last resort and
> > unlinking it sounds extremely fishy as it opens you to games played by
> > the user.
> 
> But the file is created and visible to users too even without my patch,
> the file is also put when the unix sock is released. So the only difference
> my patch makes is bindlock is no longer taken during file creation, which
> does not seem to be the cause of the problem you complain here.
> 
> Mind being more specific?

Consider 2 threads which bind the same socket, but with different paths.

Currently exactly one file will get created, the one used to bind.

With your patch both threads can succeed in creating their respective
files, but only one will manage to bind. The other one must error out,
but it has already created a file it is unclear what to do with.
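
To make the scenario concrete, a small user-space illustration of the
race follows; the paths, the socket type and the missing error handling
are only for the sake of the example. With the patch applied, both
/tmp/sock-a and /tmp/sock-b can end up on disk even though only one
bind() succeeds and the other fails with EINVAL:

	#include <pthread.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/socket.h>
	#include <sys/un.h>

	static int fd;	/* one socket shared by both threads */

	static void *binder(void *arg)
	{
		struct sockaddr_un sa = { .sun_family = AF_UNIX };

		strncpy(sa.sun_path, arg, sizeof(sa.sun_path) - 1);
		/* both threads race on the same fd: at most one bind()
		 * can succeed, but each may create its socket file */
		printf("bind(%s) = %d\n", (char *)arg,
		       bind(fd, (struct sockaddr *)&sa, sizeof(sa)));
		return NULL;
	}

	int main(void)
	{
		pthread_t t1, t2;

		fd = socket(AF_UNIX, SOCK_DGRAM, 0);
		pthread_create(&t1, NULL, binder, "/tmp/sock-a");
		pthread_create(&t2, NULL, binder, "/tmp/sock-b");
		pthread_join(t1, NULL);
		pthread_join(t2, NULL);
		return 0;
	}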

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-01-31 18:14                       ` Mateusz Guzik
@ 2017-02-06  7:22                         ` Cong Wang
  2017-02-07 14:20                           ` Mateusz Guzik
  0 siblings, 1 reply; 22+ messages in thread
From: Cong Wang @ 2017-02-06  7:22 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Tue, Jan 31, 2017 at 10:14 AM, Mateusz Guzik <mguzik@redhat.com> wrote:
> On Mon, Jan 30, 2017 at 10:44:03PM -0800, Cong Wang wrote:
>> Mind being more specific?
>
> Consider 2 threads which bind the same socket, but with different paths.
>
> Currently exactly one file will get created, the one used to bind.
>
> With your patch both threads can succeed creating their respective
> files, but only one will manage to bind. The other one must error out,
> but it already created a file it is unclear what to do with.

In this case, it simply puts the path back:

        err = -EINVAL;
        if (u->addr)
                goto out_up;
[...]

out_up:
        mutex_unlock(&u->bindlock);
out_put:
        if (err)
                path_put(&path);
out:
        return err;


Which is what unix_release_sock() does too:

        if (path.dentry)
                path_put(&path);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-02-06  7:22                         ` Cong Wang
@ 2017-02-07 14:20                           ` Mateusz Guzik
  2017-02-10  1:37                             ` Cong Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Mateusz Guzik @ 2017-02-07 14:20 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Sun, Feb 05, 2017 at 11:22:12PM -0800, Cong Wang wrote:
> On Tue, Jan 31, 2017 at 10:14 AM, Mateusz Guzik <mguzik@redhat.com> wrote:
> > On Mon, Jan 30, 2017 at 10:44:03PM -0800, Cong Wang wrote:
> >> Mind being more specific?
> >
> > Consider 2 threads which bind the same socket, but with different paths.
> >
> > Currently exactly one file will get created, the one used to bind.
> >
> > With your patch both threads can succeed creating their respective
> > files, but only one will manage to bind. The other one must error out,
> > but it already created a file it is unclear what to do with.
> 
> In this case, it simply puts the path back:
> 
>         err = -EINVAL;
>         if (u->addr)
>                 goto out_up;
> [...]
> 
> out_up:
>         mutex_unlock(&u->bindlock);
> out_put:
>         if (err)
>                 path_put(&path);
> out:
>         return err;
> 
> 
> Which is what unix_release_sock() does too:
> 
>         if (path.dentry)
>                 path_put(&path);

Yes, but unix_release_sock is expected to leave the file behind.
Note that I'm not claiming there is a leak, but that racing threads
will be able to trigger a condition where you create a file and then
fail to bind it.

What to do with the file now?

Untested, but a likely working solution would rework the code so that
e.g. a flag is set and the lock can be dropped.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
  2017-02-07 14:20                           ` Mateusz Guzik
@ 2017-02-10  1:37                             ` Cong Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Cong Wang @ 2017-02-10  1:37 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Dmitry Vyukov, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa, netdev, Eric Dumazet,
	syzkaller

On Tue, Feb 7, 2017 at 6:20 AM, Mateusz Guzik <mguzik@redhat.com> wrote:
>
> Yes, but unix_release_sock is expected to leave the file behind.
> Note I'm not claiming there is a leak, but that racing threads will be
> able to trigger a condition where you create a file and fail to bind it.
>

Which is expected, right? No one guarantees that a successfully
created file implies a successful bind; the previous code happened to
behave that way, but it is not part of the API AFAIK. Should a sane
user-space application check for the file's existence to tell whether
bind() succeeded, or just check bind()'s return value?

> What to do with the file now?
>

We just do what unix_release_sock() does, so why do you keep
asking the same question?

If you are still concerned about the race with user space, think about
the same race between a successful bind() and close(); nothing here is
new.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fs, net: deadlock between bind/splice on af_unix
       [not found] ` <065031f0-27c5-443d-82f9-2f475fcef8c3@googlegroups.com>
@ 2017-06-23 16:30   ` Cong Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Cong Wang @ 2017-06-23 16:30 UTC (permalink / raw)
  To: kodamagulla.kalyan
  Cc: syzkaller, Al Viro, linux-fsdevel, LKML, David Miller,
	Rainer Weikusat, Hannes Frederic Sowa,
	Linux Kernel Network Developers, Eric Dumazet

Hi,

On Thu, Jun 22, 2017 at 10:49 AM,  <kodamagulla.kalyan@gmail.com> wrote:
> I was getting below crash while running mp4.

Are you sure your 3.14 kernel has my patch from this thread?
Commit 0fb44559ffd67de8517098 was merged in 4.10.

Also, your crash is on the unix_dgram_sendmsg() path, not in
unix_bind().

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-06-23 16:31 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-08 14:47 fs, net: deadlock between bind/splice on af_unix Dmitry Vyukov
2016-12-08 16:30 ` Dmitry Vyukov
2016-12-09  0:08   ` Cong Wang
2016-12-09  1:32     ` Al Viro
2016-12-09  6:32       ` Cong Wang
2016-12-09  6:41         ` Al Viro
2017-01-16  9:32           ` Dmitry Vyukov
2017-01-17 21:21             ` Cong Wang
2017-01-18  9:17               ` Dmitry Vyukov
2017-01-20  4:57                 ` Cong Wang
2017-01-20 22:52                   ` Dmitry Vyukov
2017-01-23 19:00                     ` Cong Wang
2017-01-26 23:29               ` Mateusz Guzik
2017-01-27  5:11                 ` Cong Wang
2017-01-27  6:41                   ` Mateusz Guzik
2017-01-31  6:44                     ` Cong Wang
2017-01-31 18:14                       ` Mateusz Guzik
2017-02-06  7:22                         ` Cong Wang
2017-02-07 14:20                           ` Mateusz Guzik
2017-02-10  1:37                             ` Cong Wang
2017-01-17  8:07           ` Eric W. Biederman
     [not found] ` <065031f0-27c5-443d-82f9-2f475fcef8c3@googlegroups.com>
2017-06-23 16:30   ` Cong Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).