linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* possible deadlock in send_sigio
@ 2020-04-04  5:55 syzbot
  2020-06-11  2:32 ` Waiman Long
  0 siblings, 1 reply; 16+ messages in thread
From: syzbot @ 2020-04-04  5:55 UTC (permalink / raw)
  To: adobriyan, akpm, allison, areber, aubrey.li, avagin, bfields,
	christian, cyphar, ebiederm, gregkh, guro, jlayton, joel,
	keescook, linmiaohe, linux-fsdevel, linux-kernel, mhocko, mingo,
	oleg, peterz, sargun, syzkaller-bugs, tglx, viro

Hello,

syzbot found the following crash on:

HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000

The bug was bisected to:

commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Feb 20 00:22:26 2020 +0000

    proc: Use a list of inodes to flush from proc

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")

========================================================
WARNING: possible irq lock inversion dependency detected
5.6.0-syzkaller #0 Not tainted
--------------------------------------------------------
ksoftirqd/0/9 just changed the state of lock:
ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (&pid->wait_pidfd){+.+.}-{2:2}


and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&pid->wait_pidfd);
                               local_irq_disable();
                               lock(tasklist_lock);
                               lock(&pid->wait_pidfd);
  <Interrupt>
    lock(tasklist_lock);

 *** DEADLOCK ***

8 locks held by ksoftirqd/0/9:
 #0: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: __write_once_size include/linux/compiler.h:226 [inline]
 #0: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: __skb_unlink include/linux/skbuff.h:2078 [inline]
 #0: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: __skb_dequeue include/linux/skbuff.h:2093 [inline]
 #0: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x1ad/0x7a0 net/core/dev.c:6131
 #1: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: __skb_pull include/linux/skbuff.h:2309 [inline]
 #1: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x124/0x360 net/ipv4/ip_input.c:228
 #2: ffff88808e1750e0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x2d09/0x39c0 net/ipv4/tcp_ipv4.c:1997
 #3: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: sock_def_error_report+0x0/0x4d0 include/linux/compiler.h:199
 #4: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: rcu_lock_release include/linux/rcupdate.h:213 [inline]
 #4: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: rcu_read_unlock include/linux/rcupdate.h:655 [inline]
 #4: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: sock_def_error_report+0x1d6/0x4d0 net/core/sock.c:2809
 #5: ffffffff899bb200 (rcu_read_lock){....}-{1:2}, at: kill_fasync+0x3d/0x470 fs/fcntl.c:1021
 #6: ffff8880a41312b8 (&new->fa_lock){.+.?}-{2:2}, at: kill_fasync_rcu fs/fcntl.c:1002 [inline]
 #6: ffff8880a41312b8 (&new->fa_lock){.+.?}-{2:2}, at: kill_fasync fs/fcntl.c:1023 [inline]
 #6: ffff8880a41312b8 (&new->fa_lock){.+.?}-{2:2}, at: kill_fasync+0x162/0x470 fs/fcntl.c:1016
 #7: ffff8880a5d263f8 (&f->f_owner.lock){.+.?}-{2:2}, at: send_sigio+0x24/0x340 fs/fcntl.c:786

the shortest dependencies between 2nd lock and 1st lock:
 -> (&pid->wait_pidfd){+.+.}-{2:2} {
    HARDIRQ-ON-W at:
                      lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                      __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                      _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
                      spin_lock include/linux/spinlock.h:353 [inline]
                      proc_pid_make_inode+0x1f9/0x3c0 fs/proc/base.c:1880
                      proc_pid_instantiate+0x51/0x150 fs/proc/base.c:3285
                      proc_pid_lookup+0x1da/0x340 fs/proc/base.c:3320
                      proc_root_lookup+0x20/0x60 fs/proc/root.c:243
                      __lookup_slow+0x256/0x490 fs/namei.c:1530
                      lookup_slow fs/namei.c:1547 [inline]
                      walk_component+0x418/0x6a0 fs/namei.c:1846
                      link_path_walk.part.0+0x4f1/0xb50 fs/namei.c:2166
                      link_path_walk fs/namei.c:2098 [inline]
                      path_openat+0x25a/0x27b0 fs/namei.c:3342
                      do_filp_open+0x203/0x260 fs/namei.c:3375
                      do_sys_openat2+0x585/0x770 fs/open.c:1148
                      do_sys_open+0xc3/0x140 fs/open.c:1164
                      do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
                      entry_SYSCALL_64_after_hwframe+0x49/0xb3
    SOFTIRQ-ON-W at:
                      lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                      __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                      _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
                      spin_lock include/linux/spinlock.h:353 [inline]
                      proc_pid_make_inode+0x1f9/0x3c0 fs/proc/base.c:1880
                      proc_pid_instantiate+0x51/0x150 fs/proc/base.c:3285
                      proc_pid_lookup+0x1da/0x340 fs/proc/base.c:3320
                      proc_root_lookup+0x20/0x60 fs/proc/root.c:243
                      __lookup_slow+0x256/0x490 fs/namei.c:1530
                      lookup_slow fs/namei.c:1547 [inline]
                      walk_component+0x418/0x6a0 fs/namei.c:1846
                      link_path_walk.part.0+0x4f1/0xb50 fs/namei.c:2166
                      link_path_walk fs/namei.c:2098 [inline]
                      path_openat+0x25a/0x27b0 fs/namei.c:3342
                      do_filp_open+0x203/0x260 fs/namei.c:3375
                      do_sys_openat2+0x585/0x770 fs/open.c:1148
                      do_sys_open+0xc3/0x140 fs/open.c:1164
                      do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
                      entry_SYSCALL_64_after_hwframe+0x49/0xb3
    INITIAL USE at:
                     lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                     __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
                     _raw_spin_lock_irqsave+0x8c/0xbf kernel/locking/spinlock.c:159
                     __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122
                     do_notify_pidfd kernel/signal.c:1900 [inline]
                     do_notify_parent+0x19e/0xe60 kernel/signal.c:1927
                     exit_notify kernel/exit.c:660 [inline]
                     do_exit+0x238f/0x2dd0 kernel/exit.c:816
                     call_usermodehelper_exec_async+0x507/0x710 kernel/umh.c:125
                     ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
  }
  ... key      at: [<ffffffff8bbaf680>] __key.53746+0x0/0x40
  ... acquired at:
   __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
   _raw_spin_lock_irqsave+0x8c/0xbf kernel/locking/spinlock.c:159
   __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122
   do_notify_pidfd kernel/signal.c:1900 [inline]
   do_notify_parent+0x19e/0xe60 kernel/signal.c:1927
   exit_notify kernel/exit.c:660 [inline]
   do_exit+0x238f/0x2dd0 kernel/exit.c:816
   call_usermodehelper_exec_async+0x507/0x710 kernel/umh.c:125
   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

-> (tasklist_lock){.+.?}-{2:2} {
   HARDIRQ-ON-R at:
                    lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                    __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
                    _raw_read_lock+0x2d/0x40 kernel/locking/spinlock.c:223
                    do_wait+0x3b9/0xa00 kernel/exit.c:1436
                    kernel_wait4+0x14c/0x260 kernel/exit.c:1611
                    call_usermodehelper_exec_sync kernel/umh.c:150 [inline]
                    call_usermodehelper_exec_work+0x172/0x260 kernel/umh.c:187
                    process_one_work+0x965/0x16a0 kernel/workqueue.c:2266
                    worker_thread+0x96/0xe20 kernel/workqueue.c:2412
                    kthread+0x388/0x470 kernel/kthread.c:268
                    ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
   IN-SOFTIRQ-R at:
                    lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                    __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
                    _raw_read_lock+0x2d/0x40 kernel/locking/spinlock.c:223
                    send_sigio+0xa9/0x340 fs/fcntl.c:800
                    kill_fasync_rcu fs/fcntl.c:1009 [inline]
                    kill_fasync fs/fcntl.c:1023 [inline]
                    kill_fasync+0x21c/0x470 fs/fcntl.c:1016
                    sock_wake_async+0xd2/0x160 net/socket.c:1337
                    sk_wake_async include/net/sock.h:2259 [inline]
                    sk_wake_async include/net/sock.h:2255 [inline]
                    sock_def_error_report+0x2d7/0x4d0 net/core/sock.c:2808
                    tcp_reset net/ipv4/tcp_input.c:4138 [inline]
                    tcp_reset+0x195/0x4e0 net/ipv4/tcp_input.c:4114
                    tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5937 [inline]
                    tcp_rcv_state_process+0x2ead/0x4c80 net/ipv4/tcp_input.c:6204
                    tcp_v4_do_rcv+0x34c/0x8b0 net/ipv4/tcp_ipv4.c:1643
                    tcp_v4_rcv+0x2f60/0x39c0 net/ipv4/tcp_ipv4.c:2003
                    ip_protocol_deliver_rcu+0x57/0x880 net/ipv4/ip_input.c:204
                    ip_local_deliver_finish+0x220/0x360 net/ipv4/ip_input.c:231
                    NF_HOOK include/linux/netfilter.h:307 [inline]
                    NF_HOOK include/linux/netfilter.h:301 [inline]
                    ip_local_deliver+0x1c8/0x4e0 net/ipv4/ip_input.c:252
                    dst_input include/net/dst.h:441 [inline]
                    ip_rcv_finish+0x1da/0x2f0 net/ipv4/ip_input.c:428
                    NF_HOOK include/linux/netfilter.h:307 [inline]
                    NF_HOOK include/linux/netfilter.h:301 [inline]
                    ip_rcv+0xd0/0x3c0 net/ipv4/ip_input.c:539
                    __netif_receive_skb_one_core+0xf5/0x160 net/core/dev.c:5187
                    __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5301
                    process_backlog+0x21e/0x7a0 net/core/dev.c:6133
                    napi_poll net/core/dev.c:6571 [inline]
                    net_rx_action+0x4c2/0x1070 net/core/dev.c:6639
                    __do_softirq+0x26c/0x9f7 kernel/softirq.c:292
                    run_ksoftirqd kernel/softirq.c:604 [inline]
                    run_ksoftirqd+0x89/0x100 kernel/softirq.c:596
                    smpboot_thread_fn+0x653/0x9e0 kernel/smpboot.c:165
                    kthread+0x388/0x470 kernel/kthread.c:268
                    ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
   SOFTIRQ-ON-R at:
                    lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                    __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
                    _raw_read_lock+0x2d/0x40 kernel/locking/spinlock.c:223
                    do_wait+0x3b9/0xa00 kernel/exit.c:1436
                    kernel_wait4+0x14c/0x260 kernel/exit.c:1611
                    call_usermodehelper_exec_sync kernel/umh.c:150 [inline]
                    call_usermodehelper_exec_work+0x172/0x260 kernel/umh.c:187
                    process_one_work+0x965/0x16a0 kernel/workqueue.c:2266
                    worker_thread+0x96/0xe20 kernel/workqueue.c:2412
                    kthread+0x388/0x470 kernel/kthread.c:268
                    ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
   INITIAL USE at:
                   lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
                   __raw_write_lock_irq include/linux/rwlock_api_smp.h:196 [inline]
                   _raw_write_lock_irq+0x5b/0x80 kernel/locking/spinlock.c:311
                   copy_process+0x3430/0x72c0 kernel/fork.c:2205
                   _do_fork+0x12d/0x1010 kernel/fork.c:2432
                   kernel_thread+0xb1/0xf0 kernel/fork.c:2519
                   rest_init+0x23/0x365 init/main.c:626
                   start_kernel+0x867/0x8a1 init/main.c:998
                   secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242
 }
 ... key      at: [<ffffffff898090d8>] tasklist_lock+0x18/0x40
 ... acquired at:
   mark_lock_irq kernel/locking/lockdep.c:3585 [inline]
   mark_lock+0x624/0xf10 kernel/locking/lockdep.c:3935
   mark_usage kernel/locking/lockdep.c:3826 [inline]
   __lock_acquire+0x1ed9/0x4e00 kernel/locking/lockdep.c:4298
   lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
   __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
   _raw_read_lock+0x2d/0x40 kernel/locking/spinlock.c:223
   send_sigio+0xa9/0x340 fs/fcntl.c:800
   kill_fasync_rcu fs/fcntl.c:1009 [inline]
   kill_fasync fs/fcntl.c:1023 [inline]
   kill_fasync+0x21c/0x470 fs/fcntl.c:1016
   sock_wake_async+0xd2/0x160 net/socket.c:1337
   sk_wake_async include/net/sock.h:2259 [inline]
   sk_wake_async include/net/sock.h:2255 [inline]
   sock_def_error_report+0x2d7/0x4d0 net/core/sock.c:2808
   tcp_reset net/ipv4/tcp_input.c:4138 [inline]
   tcp_reset+0x195/0x4e0 net/ipv4/tcp_input.c:4114
   tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5937 [inline]
   tcp_rcv_state_process+0x2ead/0x4c80 net/ipv4/tcp_input.c:6204
   tcp_v4_do_rcv+0x34c/0x8b0 net/ipv4/tcp_ipv4.c:1643
   tcp_v4_rcv+0x2f60/0x39c0 net/ipv4/tcp_ipv4.c:2003
   ip_protocol_deliver_rcu+0x57/0x880 net/ipv4/ip_input.c:204
   ip_local_deliver_finish+0x220/0x360 net/ipv4/ip_input.c:231
   NF_HOOK include/linux/netfilter.h:307 [inline]
   NF_HOOK include/linux/netfilter.h:301 [inline]
   ip_local_deliver+0x1c8/0x4e0 net/ipv4/ip_input.c:252
   dst_input include/net/dst.h:441 [inline]
   ip_rcv_finish+0x1da/0x2f0 net/ipv4/ip_input.c:428
   NF_HOOK include/linux/netfilter.h:307 [inline]
   NF_HOOK include/linux/netfilter.h:301 [inline]
   ip_rcv+0xd0/0x3c0 net/ipv4/ip_input.c:539
   __netif_receive_skb_one_core+0xf5/0x160 net/core/dev.c:5187
   __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5301
   process_backlog+0x21e/0x7a0 net/core/dev.c:6133
   napi_poll net/core/dev.c:6571 [inline]
   net_rx_action+0x4c2/0x1070 net/core/dev.c:6639
   __do_softirq+0x26c/0x9f7 kernel/softirq.c:292
   run_ksoftirqd kernel/softirq.c:604 [inline]
   run_ksoftirqd+0x89/0x100 kernel/softirq.c:596
   smpboot_thread_fn+0x653/0x9e0 kernel/smpboot.c:165
   kthread+0x388/0x470 kernel/kthread.c:268
   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352


stack backtrace:
CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.6.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 print_irq_inversion_bug kernel/locking/lockdep.c:3448 [inline]
 check_usage_forwards.cold+0x20/0x29 kernel/locking/lockdep.c:3472
 mark_lock_irq kernel/locking/lockdep.c:3585 [inline]
 mark_lock+0x624/0xf10 kernel/locking/lockdep.c:3935
 mark_usage kernel/locking/lockdep.c:3826 [inline]
 __lock_acquire+0x1ed9/0x4e00 kernel/locking/lockdep.c:4298
 lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4923
 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
 _raw_read_lock+0x2d/0x40 kernel/locking/spinlock.c:223
 send_sigio+0xa9/0x340 fs/fcntl.c:800
 kill_fasync_rcu fs/fcntl.c:1009 [inline]
 kill_fasync fs/fcntl.c:1023 [inline]
 kill_fasync+0x21c/0x470 fs/fcntl.c:1016
 sock_wake_async+0xd2/0x160 net/socket.c:1337
 sk_wake_async include/net/sock.h:2259 [inline]
 sk_wake_async include/net/sock.h:2255 [inline]
 sock_def_error_report+0x2d7/0x4d0 net/core/sock.c:2808
 tcp_reset net/ipv4/tcp_input.c:4138 [inline]
 tcp_reset+0x195/0x4e0 net/ipv4/tcp_input.c:4114
 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5937 [inline]
 tcp_rcv_state_process+0x2ead/0x4c80 net/ipv4/tcp_input.c:6204
 tcp_v4_do_rcv+0x34c/0x8b0 net/ipv4/tcp_ipv4.c:1643
 tcp_v4_rcv+0x2f60/0x39c0 net/ipv4/tcp_ipv4.c:2003
 ip_protocol_deliver_rcu+0x57/0x880 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x220/0x360 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_local_deliver+0x1c8/0x4e0 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:441 [inline]
 ip_rcv_finish+0x1da/0x2f0 net/ipv4/ip_input.c:428
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_rcv+0xd0/0x3c0 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0xf5/0x160 net/core/dev.c:5187
 __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5301
 process_backlog+0x21e/0x7a0 net/core/dev.c:6133
 napi_poll net/core/dev.c:6571 [inline]
 net_rx_action+0x4c2/0x1070 net/core/dev.c:6639
 __do_softirq+0x26c/0x9f7 kernel/softirq.c:292
 run_ksoftirqd kernel/softirq.c:604 [inline]
 run_ksoftirqd+0x89/0x100 kernel/softirq.c:596
 smpboot_thread_fn+0x653/0x9e0 kernel/smpboot.c:165
 kthread+0x388/0x470 kernel/kthread.c:268
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-04-04  5:55 possible deadlock in send_sigio syzbot
@ 2020-06-11  2:32 ` Waiman Long
  2020-06-11  7:43   ` Dmitry Vyukov
  2020-06-11 16:07   ` Eric W. Biederman
  0 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2020-06-11  2:32 UTC (permalink / raw)
  To: syzbot, adobriyan, akpm, allison, areber, aubrey.li, avagin,
	bfields, christian, cyphar, ebiederm, gregkh, guro, jlayton,
	joel, keescook, linmiaohe, linux-fsdevel, linux-kernel, mhocko,
	mingo, oleg, peterz, sargun, syzkaller-bugs, tglx, viro

On 4/4/20 1:55 AM, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
> dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000
>
> The bug was bisected to:
>
> commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Thu Feb 20 00:22:26 2020 +0000
>
>      proc: Use a list of inodes to flush from proc
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
> Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
>
> ========================================================
> WARNING: possible irq lock inversion dependency detected
> 5.6.0-syzkaller #0 Not tainted
> --------------------------------------------------------
> ksoftirqd/0/9 just changed the state of lock:
> ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
> but this lock took another, SOFTIRQ-unsafe lock in the past:
>   (&pid->wait_pidfd){+.+.}-{2:2}
>
>
> and interrupts could create inverse lock ordering between them.
>
>
> other info that might help us debug this:
>   Possible interrupt unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(&pid->wait_pidfd);
>                                 local_irq_disable();
>                                 lock(tasklist_lock);
>                                 lock(&pid->wait_pidfd);
>    <Interrupt>
>      lock(tasklist_lock);
>
>   *** DEADLOCK ***

That is a false positive. The qrwlock has the special property that it 
becomes unfair (for read lock) at interrupt context. So unless it is 
taking a write lock in the interrupt context, it won't go into deadlock. 
The current lockdep code does not capture the full semantics of qrwlock 
leading to this false positive.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11  2:32 ` Waiman Long
@ 2020-06-11  7:43   ` Dmitry Vyukov
  2020-06-11 13:51     ` Waiman Long
  2020-06-11 16:07   ` Eric W. Biederman
  1 sibling, 1 reply; 16+ messages in thread
From: Dmitry Vyukov @ 2020-06-11  7:43 UTC (permalink / raw)
  To: Waiman Long
  Cc: syzbot, Alexey Dobriyan, Andrew Morton, allison, areber,
	aubrey.li, Andrei Vagin, Bruce Fields, Christian Brauner, cyphar,
	Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, Peter Zijlstra, sargun,
	syzkaller-bugs, Thomas Gleixner, Al Viro

On Thu, Jun 11, 2020 at 4:33 AM Waiman Long <longman@redhat.com> wrote:
>
> On 4/4/20 1:55 AM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
> > dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000
> >
> > The bug was bisected to:
> >
> > commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
> > Author: Eric W. Biederman <ebiederm@xmission.com>
> > Date:   Thu Feb 20 00:22:26 2020 +0000
> >
> >      proc: Use a list of inodes to flush from proc
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
> > Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
> >
> > ========================================================
> > WARNING: possible irq lock inversion dependency detected
> > 5.6.0-syzkaller #0 Not tainted
> > --------------------------------------------------------
> > ksoftirqd/0/9 just changed the state of lock:
> > ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
> > but this lock took another, SOFTIRQ-unsafe lock in the past:
> >   (&pid->wait_pidfd){+.+.}-{2:2}
> >
> >
> > and interrupts could create inverse lock ordering between them.
> >
> >
> > other info that might help us debug this:
> >   Possible interrupt unsafe locking scenario:
> >
> >         CPU0                    CPU1
> >         ----                    ----
> >    lock(&pid->wait_pidfd);
> >                                 local_irq_disable();
> >                                 lock(tasklist_lock);
> >                                 lock(&pid->wait_pidfd);
> >    <Interrupt>
> >      lock(tasklist_lock);
> >
> >   *** DEADLOCK ***
>
> That is a false positive. The qrwlock has the special property that it
> becomes unfair (for read lock) at interrupt context. So unless it is
> taking a write lock in the interrupt context, it won't go into deadlock.
> The current lockdep code does not capture the full semantics of qrwlock
> leading to this false positive.

Hi Longman

Thanks for looking into this.
Now the question is: how should we change lockdep annotations to fix this bug?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11  7:43   ` Dmitry Vyukov
@ 2020-06-11 13:51     ` Waiman Long
  2020-06-11 14:22       ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Waiman Long @ 2020-06-11 13:51 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Alexey Dobriyan, Andrew Morton, allison, areber,
	aubrey.li, Andrei Vagin, Bruce Fields, Christian Brauner, cyphar,
	Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, Peter Zijlstra, sargun,
	syzkaller-bugs, Thomas Gleixner, Al Viro

On 6/11/20 3:43 AM, Dmitry Vyukov wrote:
> On Thu, Jun 11, 2020 at 4:33 AM Waiman Long <longman@redhat.com> wrote:
>> On 4/4/20 1:55 AM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
>>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000
>>>
>>> The bug was bisected to:
>>>
>>> commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
>>> Author: Eric W. Biederman <ebiederm@xmission.com>
>>> Date:   Thu Feb 20 00:22:26 2020 +0000
>>>
>>>       proc: Use a list of inodes to flush from proc
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
>>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
>>> Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
>>>
>>> ========================================================
>>> WARNING: possible irq lock inversion dependency detected
>>> 5.6.0-syzkaller #0 Not tainted
>>> --------------------------------------------------------
>>> ksoftirqd/0/9 just changed the state of lock:
>>> ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
>>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>>>    (&pid->wait_pidfd){+.+.}-{2:2}
>>>
>>>
>>> and interrupts could create inverse lock ordering between them.
>>>
>>>
>>> other info that might help us debug this:
>>>    Possible interrupt unsafe locking scenario:
>>>
>>>          CPU0                    CPU1
>>>          ----                    ----
>>>     lock(&pid->wait_pidfd);
>>>                                  local_irq_disable();
>>>                                  lock(tasklist_lock);
>>>                                  lock(&pid->wait_pidfd);
>>>     <Interrupt>
>>>       lock(tasklist_lock);
>>>
>>>    *** DEADLOCK ***
>> That is a false positive. The qrwlock has the special property that it
>> becomes unfair (for read lock) at interrupt context. So unless it is
>> taking a write lock in the interrupt context, it won't go into deadlock.
>> The current lockdep code does not capture the full semantics of qrwlock
>> leading to this false positive.
> Hi Longman
>
> Thanks for looking into this.
> Now the question is: how should we change lockdep annotations to fix this bug?

There was an old lockdep patch that I think may address the issue, but 
was not merged at the time. I will need to dig it out and see if it can 
be adapted to work in the current kernel. It may take some time.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11 13:51     ` Waiman Long
@ 2020-06-11 14:22       ` Peter Zijlstra
  2020-06-11 16:09         ` Waiman Long
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2020-06-11 14:22 UTC (permalink / raw)
  To: Waiman Long
  Cc: Dmitry Vyukov, syzbot, Alexey Dobriyan, Andrew Morton, allison,
	areber, aubrey.li, Andrei Vagin, Bruce Fields, Christian Brauner,
	cyphar, Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, sargun, syzkaller-bugs,
	Thomas Gleixner, Al Viro, Boqun Feng

On Thu, Jun 11, 2020 at 09:51:29AM -0400, Waiman Long wrote:

> There was an old lockdep patch that I think may address the issue, but was
> not merged at the time. I will need to dig it out and see if it can be
> adapted to work in the current kernel. It may take some time.

Boqun was working on that; I can't remember what happened, but ISTR it
was shaping up nice.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11  2:32 ` Waiman Long
  2020-06-11  7:43   ` Dmitry Vyukov
@ 2020-06-11 16:07   ` Eric W. Biederman
  1 sibling, 0 replies; 16+ messages in thread
From: Eric W. Biederman @ 2020-06-11 16:07 UTC (permalink / raw)
  To: Waiman Long
  Cc: syzbot, adobriyan, akpm, allison, areber, aubrey.li, avagin,
	bfields, christian, cyphar, gregkh, guro, jlayton, joel,
	keescook, linmiaohe, linux-fsdevel, linux-kernel, mhocko, mingo,
	oleg, peterz, sargun, syzkaller-bugs, tglx, viro

Waiman Long <longman@redhat.com> writes:

> On 4/4/20 1:55 AM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000
>>
>> The bug was bisected to:
>>
>> commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
>> Author: Eric W. Biederman <ebiederm@xmission.com>
>> Date:   Thu Feb 20 00:22:26 2020 +0000
>>
>>      proc: Use a list of inodes to flush from proc
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
>> Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
>>
>> ========================================================
>> WARNING: possible irq lock inversion dependency detected
>> 5.6.0-syzkaller #0 Not tainted
>> --------------------------------------------------------
>> ksoftirqd/0/9 just changed the state of lock:
>> ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>>   (&pid->wait_pidfd){+.+.}-{2:2}
>>
>>
>> and interrupts could create inverse lock ordering between them.
>>
>>
>> other info that might help us debug this:
>>   Possible interrupt unsafe locking scenario:
>>
>>         CPU0                    CPU1
>>         ----                    ----
>>    lock(&pid->wait_pidfd);
>>                                 local_irq_disable();
>>                                 lock(tasklist_lock);
>>                                 lock(&pid->wait_pidfd);
>>    <Interrupt>
>>      lock(tasklist_lock);
>>
>>   *** DEADLOCK ***
>
> That is a false positive. The qrwlock has the special property that it becomes
> unfair (for read lock) at interrupt context. So unless it is taking a write lock
> in the interrupt context, it won't go into deadlock. The current lockdep code
> does not capture the full semantics of qrwlock leading to this false positive.
>

Whatever it was it was fixed with:
63f818f46af9 ("proc: Use a dedicated lock in struct pid")

It is classic lock inversion caused by not disabling irqs.

Unless I am completely mistaken any non-irq code path that does:
	write_lock_irq(&tasklist_lock);
        spin_lock(&pid->lock);

Is susceptible to deadlock with:
	spin_lock(&pid->lock);
        <Interrupt>
        read_lock(&task_list_lock);

Because it remains a lock inversion even with only a read lock taken in
irq context in irq context.

Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11 14:22       ` Peter Zijlstra
@ 2020-06-11 16:09         ` Waiman Long
  2020-06-11 23:55           ` Boqun Feng
  0 siblings, 1 reply; 16+ messages in thread
From: Waiman Long @ 2020-06-11 16:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Vyukov, syzbot, Alexey Dobriyan, Andrew Morton, allison,
	areber, aubrey.li, Andrei Vagin, Bruce Fields, Christian Brauner,
	cyphar, Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, sargun, syzkaller-bugs,
	Thomas Gleixner, Al Viro, Boqun Feng

On 6/11/20 10:22 AM, Peter Zijlstra wrote:
> On Thu, Jun 11, 2020 at 09:51:29AM -0400, Waiman Long wrote:
>
>> There was an old lockdep patch that I think may address the issue, but was
>> not merged at the time. I will need to dig it out and see if it can be
>> adapted to work in the current kernel. It may take some time.
> Boqun was working on that; I can't remember what happened, but ISTR it
> was shaping up nice.
>
Yes, I am talking about Boqun's patch. However, I think he had moved to 
another company and so may not be able to actively work on that again.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11 16:09         ` Waiman Long
@ 2020-06-11 23:55           ` Boqun Feng
  2020-06-12  1:55             ` Waiman Long
  2020-06-12  7:01             ` Boqun Feng
  0 siblings, 2 replies; 16+ messages in thread
From: Boqun Feng @ 2020-06-11 23:55 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Alexey Dobriyan,
	Andrew Morton, allison, areber, aubrey.li, Andrei Vagin,
	Bruce Fields, Christian Brauner, cyphar, Eric W. Biederman,
	Greg Kroah-Hartman, guro, Jeff Layton, Joel Fernandes, Kees Cook,
	linmiaohe, linux-fsdevel, LKML, Michal Hocko, Ingo Molnar,
	Oleg Nesterov, sargun, syzkaller-bugs, Thomas Gleixner, Al Viro

Hi Peter and Waiman,

On Thu, Jun 11, 2020 at 12:09:59PM -0400, Waiman Long wrote:
> On 6/11/20 10:22 AM, Peter Zijlstra wrote:
> > On Thu, Jun 11, 2020 at 09:51:29AM -0400, Waiman Long wrote:
> > 
> > > There was an old lockdep patch that I think may address the issue, but was
> > > not merged at the time. I will need to dig it out and see if it can be
> > > adapted to work in the current kernel. It may take some time.
> > Boqun was working on that; I can't remember what happened, but ISTR it
> > was shaping up nice.
> > 
> Yes, I am talking about Boqun's patch. However, I think he had moved to
> another company and so may not be able to actively work on that again.
> 

I think you are talking about the rescursive read deadlock detection
patchset:

	https://lore.kernel.org/lkml/20180411135110.9217-1-boqun.feng@gmail.com/

Let me have a good and send a new version based on today's master of tip
tree.

Regards,
Boqun

> Cheers,
> Longman
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11 23:55           ` Boqun Feng
@ 2020-06-12  1:55             ` Waiman Long
  2020-06-12  7:01             ` Boqun Feng
  1 sibling, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-06-12  1:55 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Alexey Dobriyan,
	Andrew Morton, allison, areber, aubrey.li, Andrei Vagin,
	Bruce Fields, Christian Brauner, cyphar, Eric W. Biederman,
	Greg Kroah-Hartman, guro, Jeff Layton, Joel Fernandes, Kees Cook,
	linmiaohe, linux-fsdevel, LKML, Michal Hocko, Ingo Molnar,
	Oleg Nesterov, sargun, syzkaller-bugs, Thomas Gleixner, Al Viro

On 6/11/20 7:55 PM, Boqun Feng wrote:
> Hi Peter and Waiman,
>
> On Thu, Jun 11, 2020 at 12:09:59PM -0400, Waiman Long wrote:
>> On 6/11/20 10:22 AM, Peter Zijlstra wrote:
>>> On Thu, Jun 11, 2020 at 09:51:29AM -0400, Waiman Long wrote:
>>>
>>>> There was an old lockdep patch that I think may address the issue, but was
>>>> not merged at the time. I will need to dig it out and see if it can be
>>>> adapted to work in the current kernel. It may take some time.
>>> Boqun was working on that; I can't remember what happened, but ISTR it
>>> was shaping up nice.
>>>
>> Yes, I am talking about Boqun's patch. However, I think he had moved to
>> another company and so may not be able to actively work on that again.
>>
> I think you are talking about the rescursive read deadlock detection
> patchset:
>
> 	https://lore.kernel.org/lkml/20180411135110.9217-1-boqun.feng@gmail.com/
>
> Let me have a good and send a new version based on today's master of tip
> tree.
>
> Regards,
> Boqun

Good to hear back from you. I thought you may not able to work on it again.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-11 23:55           ` Boqun Feng
  2020-06-12  1:55             ` Waiman Long
@ 2020-06-12  7:01             ` Boqun Feng
  2020-06-15 16:37               ` Waiman Long
  2020-06-15 16:49               ` Matthew Wilcox
  1 sibling, 2 replies; 16+ messages in thread
From: Boqun Feng @ 2020-06-12  7:01 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Alexey Dobriyan,
	Andrew Morton, allison, areber, aubrey.li, Andrei Vagin,
	Bruce Fields, Christian Brauner, cyphar, Eric W. Biederman,
	Greg Kroah-Hartman, guro, Jeff Layton, Joel Fernandes, Kees Cook,
	linmiaohe, linux-fsdevel, LKML, Michal Hocko, Ingo Molnar,
	Oleg Nesterov, sargun, syzkaller-bugs, Thomas Gleixner, Al Viro

On Fri, Jun 12, 2020 at 07:55:26AM +0800, Boqun Feng wrote:
> Hi Peter and Waiman,
> 
> On Thu, Jun 11, 2020 at 12:09:59PM -0400, Waiman Long wrote:
> > On 6/11/20 10:22 AM, Peter Zijlstra wrote:
> > > On Thu, Jun 11, 2020 at 09:51:29AM -0400, Waiman Long wrote:
> > > 
> > > > There was an old lockdep patch that I think may address the issue, but was
> > > > not merged at the time. I will need to dig it out and see if it can be
> > > > adapted to work in the current kernel. It may take some time.
> > > Boqun was working on that; I can't remember what happened, but ISTR it
> > > was shaping up nice.
> > > 
> > Yes, I am talking about Boqun's patch. However, I think he had moved to
> > another company and so may not be able to actively work on that again.
> > 
> 
> I think you are talking about the rescursive read deadlock detection
> patchset:
> 
> 	https://lore.kernel.org/lkml/20180411135110.9217-1-boqun.feng@gmail.com/
> 
> Let me have a good and send a new version based on today's master of tip
> tree.
> 

FWIW, with the following patch, I think we can avoid to the false
positives. But solely with this patch, we don't have the ability to
detect deadlocks with recursive locks..

I've managed to rebase my patchset, but need some time to tweak it to
work properly, in the meantime, Dmitry, could you give this a try?

Regards,
Boqun

------------->8
Subject: [PATCH] locking: More accurate annotations for read_lock()

On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
read lock, actually it's only recursive if in_interrupt() is true. So
change the annotation accordingly to catch more deadlocks.

Note we used to treat read_lock() as pure recursive read locks in
lib/locking-seftest.c, and this is useful, especially for the lockdep
development selftest, so we keep this via a variable to force switching
lock annotation for read_lock().

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/lockdep.h | 35 ++++++++++++++++++++++++++++++++++-
 lib/locking-selftest.c  | 11 +++++++++++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 8fce5c98a4b0..50aedbba0812 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -43,6 +43,7 @@ enum lockdep_wait_type {
 #include <linux/list.h>
 #include <linux/debug_locks.h>
 #include <linux/stacktrace.h>
+#include <linux/preempt.h>
 
 /*
  * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
@@ -640,6 +641,31 @@ static inline void print_irqtrace_events(struct task_struct *curr)
 }
 #endif
 
+/* Variable used to make lockdep treat read_lock() as recursive in selftests */
+#ifdef CONFIG_DEBUG_LOCKING_API_SELFTESTS
+extern unsigned int force_read_lock_recursive;
+#else /* CONFIG_DEBUG_LOCKING_API_SELFTESTS */
+#define force_read_lock_recursive 0
+#endif /* CONFIG_DEBUG_LOCKING_API_SELFTESTS */
+
+#ifdef CONFIG_LOCKDEP
+/*
+ * read_lock() is recursive if:
+ * 1. We force lockdep think this way in selftests or
+ * 2. The implementation is not queued read/write lock or
+ * 3. The locker is at an in_interrupt() context.
+ */
+static inline bool read_lock_is_recursive(void)
+{
+	return force_read_lock_recursive ||
+	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
+	       in_interrupt();
+}
+#else /* CONFIG_LOCKDEP */
+/* If !LOCKDEP, the value is meaningless */
+#define read_lock_is_recursive() 0
+#endif
+
 /*
  * For trivial one-depth nesting of a lock-class, the following
  * global define can be used. (Subsystems with multiple levels
@@ -661,7 +687,14 @@ static inline void print_irqtrace_events(struct task_struct *curr)
 #define spin_release(l, i)			lock_release(l, i)
 
 #define rwlock_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
-#define rwlock_acquire_read(l, s, t, i)		lock_acquire_shared_recursive(l, s, t, NULL, i)
+#define rwlock_acquire_read(l, s, t, i)					\
+do {									\
+	if (read_lock_is_recursive())					\
+		lock_acquire_shared_recursive(l, s, t, NULL, i);	\
+	else								\
+		lock_acquire_shared(l, s, t, NULL, i);			\
+} while (0)
+
 #define rwlock_release(l, i)			lock_release(l, i)
 
 #define seqcount_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 14f44f59e733..caadc4dd3368 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -28,6 +28,7 @@
  * Change this to 1 if you want to see the failure printouts:
  */
 static unsigned int debug_locks_verbose;
+unsigned int force_read_lock_recursive;
 
 static DEFINE_WD_CLASS(ww_lockdep);
 
@@ -1978,6 +1979,11 @@ void locking_selftest(void)
 		return;
 	}
 
+	/*
+	 * treats read_lock() as recursive read locks for testing purpose
+	 */
+	force_read_lock_recursive = 1;
+
 	/*
 	 * Run the testsuite:
 	 */
@@ -2073,6 +2079,11 @@ void locking_selftest(void)
 
 	ww_tests();
 
+	force_read_lock_recursive = 0;
+	/*
+	 * queued_read_lock() specific test cases can be put here
+	 */
+
 	if (unexpected_testcase_failures) {
 		printk("-----------------------------------------------------------------\n");
 		debug_locks = 0;
-- 
2.26.2



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-12  7:01             ` Boqun Feng
@ 2020-06-15 16:37               ` Waiman Long
  2020-06-15 16:49               ` Matthew Wilcox
  1 sibling, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-06-15 16:37 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Alexey Dobriyan,
	Andrew Morton, allison, areber, aubrey.li, Andrei Vagin,
	Bruce Fields, Christian Brauner, cyphar, Eric W. Biederman,
	Greg Kroah-Hartman, guro, Jeff Layton, Joel Fernandes, Kees Cook,
	linmiaohe, linux-fsdevel, LKML, Michal Hocko, Ingo Molnar,
	Oleg Nesterov, sargun, syzkaller-bugs, Thomas Gleixner, Al Viro

On 6/12/20 3:01 AM, Boqun Feng wrote:
> On Fri, Jun 12, 2020 at 07:55:26AM +0800, Boqun Feng wrote:
>> Hi Peter and Waiman,
>>
>> On Thu, Jun 11, 2020 at 12:09:59PM -0400, Waiman Long wrote:
>>> On 6/11/20 10:22 AM, Peter Zijlstra wrote:
>>>> On Thu, Jun 11, 2020 at 09:51:29AM -0400, Waiman Long wrote:
>>>>
>>>>> There was an old lockdep patch that I think may address the issue, but was
>>>>> not merged at the time. I will need to dig it out and see if it can be
>>>>> adapted to work in the current kernel. It may take some time.
>>>> Boqun was working on that; I can't remember what happened, but ISTR it
>>>> was shaping up nice.
>>>>
>>> Yes, I am talking about Boqun's patch. However, I think he had moved to
>>> another company and so may not be able to actively work on that again.
>>>
>> I think you are talking about the rescursive read deadlock detection
>> patchset:
>>
>> 	https://lore.kernel.org/lkml/20180411135110.9217-1-boqun.feng@gmail.com/
>>
>> Let me have a good and send a new version based on today's master of tip
>> tree.
>>
> FWIW, with the following patch, I think we can avoid to the false
> positives. But solely with this patch, we don't have the ability to
> detect deadlocks with recursive locks..
>
> I've managed to rebase my patchset, but need some time to tweak it to
> work properly, in the meantime, Dmitry, could you give this a try?
>
> Regards,
> Boqun
>
> ------------->8
> Subject: [PATCH] locking: More accurate annotations for read_lock()
>
> On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
> read lock, actually it's only recursive if in_interrupt() is true. So
> change the annotation accordingly to catch more deadlocks.
>
> Note we used to treat read_lock() as pure recursive read locks in
> lib/locking-seftest.c, and this is useful, especially for the lockdep
> development selftest, so we keep this via a variable to force switching
> lock annotation for read_lock().
>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>   include/linux/lockdep.h | 35 ++++++++++++++++++++++++++++++++++-
>   lib/locking-selftest.c  | 11 +++++++++++
>   2 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index 8fce5c98a4b0..50aedbba0812 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -43,6 +43,7 @@ enum lockdep_wait_type {
>   #include <linux/list.h>
>   #include <linux/debug_locks.h>
>   #include <linux/stacktrace.h>
> +#include <linux/preempt.h>
>   
>   /*
>    * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
> @@ -640,6 +641,31 @@ static inline void print_irqtrace_events(struct task_struct *curr)
>   }
>   #endif
>   
> +/* Variable used to make lockdep treat read_lock() as recursive in selftests */
> +#ifdef CONFIG_DEBUG_LOCKING_API_SELFTESTS
> +extern unsigned int force_read_lock_recursive;
> +#else /* CONFIG_DEBUG_LOCKING_API_SELFTESTS */
> +#define force_read_lock_recursive 0
> +#endif /* CONFIG_DEBUG_LOCKING_API_SELFTESTS */
> +
> +#ifdef CONFIG_LOCKDEP
> +/*
> + * read_lock() is recursive if:
> + * 1. We force lockdep think this way in selftests or
> + * 2. The implementation is not queued read/write lock or
> + * 3. The locker is at an in_interrupt() context.
> + */
> +static inline bool read_lock_is_recursive(void)
> +{
> +	return force_read_lock_recursive ||
> +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
> +	       in_interrupt();
> +}
> +#else /* CONFIG_LOCKDEP */
> +/* If !LOCKDEP, the value is meaningless */
> +#define read_lock_is_recursive() 0
> +#endif
> +
>   /*
>    * For trivial one-depth nesting of a lock-class, the following
>    * global define can be used. (Subsystems with multiple levels
> @@ -661,7 +687,14 @@ static inline void print_irqtrace_events(struct task_struct *curr)
>   #define spin_release(l, i)			lock_release(l, i)
>   
>   #define rwlock_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
> -#define rwlock_acquire_read(l, s, t, i)		lock_acquire_shared_recursive(l, s, t, NULL, i)
> +#define rwlock_acquire_read(l, s, t, i)					\
> +do {									\
> +	if (read_lock_is_recursive())					\
> +		lock_acquire_shared_recursive(l, s, t, NULL, i);	\
> +	else								\
> +		lock_acquire_shared(l, s, t, NULL, i);			\
> +} while (0)
> +
>   #define rwlock_release(l, i)			lock_release(l, i)
>   
>   #define seqcount_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
> diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
> index 14f44f59e733..caadc4dd3368 100644
> --- a/lib/locking-selftest.c
> +++ b/lib/locking-selftest.c
> @@ -28,6 +28,7 @@
>    * Change this to 1 if you want to see the failure printouts:
>    */
>   static unsigned int debug_locks_verbose;
> +unsigned int force_read_lock_recursive;
>   
>   static DEFINE_WD_CLASS(ww_lockdep);
>   
> @@ -1978,6 +1979,11 @@ void locking_selftest(void)
>   		return;
>   	}
>   
> +	/*
> +	 * treats read_lock() as recursive read locks for testing purpose
> +	 */
> +	force_read_lock_recursive = 1;
> +
>   	/*
>   	 * Run the testsuite:
>   	 */
> @@ -2073,6 +2079,11 @@ void locking_selftest(void)
>   
>   	ww_tests();
>   
> +	force_read_lock_recursive = 0;
> +	/*
> +	 * queued_read_lock() specific test cases can be put here
> +	 */
> +
>   	if (unexpected_testcase_failures) {
>   		printk("-----------------------------------------------------------------\n");
>   		debug_locks = 0;

Your patch looks good to me.

Reviewed-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-12  7:01             ` Boqun Feng
  2020-06-15 16:37               ` Waiman Long
@ 2020-06-15 16:49               ` Matthew Wilcox
  2020-06-15 17:13                 ` Waiman Long
  1 sibling, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2020-06-15 16:49 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Waiman Long, Peter Zijlstra, Dmitry Vyukov, syzbot,
	Alexey Dobriyan, Andrew Morton, allison, areber, aubrey.li,
	Andrei Vagin, Bruce Fields, Christian Brauner, cyphar,
	Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, sargun, syzkaller-bugs,
	Thomas Gleixner, Al Viro

On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
> On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
> read lock, actually it's only recursive if in_interrupt() is true. So
> change the annotation accordingly to catch more deadlocks.

[...]

> +#ifdef CONFIG_LOCKDEP
> +/*
> + * read_lock() is recursive if:
> + * 1. We force lockdep think this way in selftests or
> + * 2. The implementation is not queued read/write lock or
> + * 3. The locker is at an in_interrupt() context.
> + */
> +static inline bool read_lock_is_recursive(void)
> +{
> +	return force_read_lock_recursive ||
> +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
> +	       in_interrupt();
> +}

I'm a bit uncomfortable with having the _lockdep_ definition of whether
a read lock is recursive depend on what the _implementation_ is.
The locking semantics should be the same, no matter which architecture
you're running on.  If we rely on read locks being recursive in common
code then we have a locking bug on architectures which don't use queued
rwlocks.

I don't know whether we should just tell the people who aren't using
queued rwlocks that they have a new requirement or whether we should
say that read locks are never recursive, but having this inconsistency
is not a good idea!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-15 16:49               ` Matthew Wilcox
@ 2020-06-15 17:13                 ` Waiman Long
  2020-06-15 20:40                   ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Waiman Long @ 2020-06-15 17:13 UTC (permalink / raw)
  To: Matthew Wilcox, Boqun Feng
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Alexey Dobriyan,
	Andrew Morton, allison, areber, aubrey.li, Andrei Vagin,
	Bruce Fields, Christian Brauner, cyphar, Eric W. Biederman,
	Greg Kroah-Hartman, guro, Jeff Layton, Joel Fernandes, Kees Cook,
	linmiaohe, linux-fsdevel, LKML, Michal Hocko, Ingo Molnar,
	Oleg Nesterov, sargun, syzkaller-bugs, Thomas Gleixner, Al Viro

On 6/15/20 12:49 PM, Matthew Wilcox wrote:
> On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
>> On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
>> read lock, actually it's only recursive if in_interrupt() is true. So
>> change the annotation accordingly to catch more deadlocks.
> [...]
>
>> +#ifdef CONFIG_LOCKDEP
>> +/*
>> + * read_lock() is recursive if:
>> + * 1. We force lockdep think this way in selftests or
>> + * 2. The implementation is not queued read/write lock or
>> + * 3. The locker is at an in_interrupt() context.
>> + */
>> +static inline bool read_lock_is_recursive(void)
>> +{
>> +	return force_read_lock_recursive ||
>> +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
>> +	       in_interrupt();
>> +}
> I'm a bit uncomfortable with having the _lockdep_ definition of whether
> a read lock is recursive depend on what the _implementation_ is.
> The locking semantics should be the same, no matter which architecture
> you're running on.  If we rely on read locks being recursive in common
> code then we have a locking bug on architectures which don't use queued
> rwlocks.
>
> I don't know whether we should just tell the people who aren't using
> queued rwlocks that they have a new requirement or whether we should
> say that read locks are never recursive, but having this inconsistency
> is not a good idea!

Actually, qrwlock is more restrictive. It is possible that systems with 
qrwlock may hit deadlock which doesn't happens in other systems that use 
recursive rwlock. However, the current lockdep code doesn't detect those 
cases.

Changing lockdep to only use qrwlock semantics can be problematic as the 
code hunk in locking selftest is due to the fact that it assumes 
recursive lock. So we need to change that. Anyway, this patch can allow 
us to see if current qrwlock semantics may have potential deadlock 
problem in the current code. I actually have bug report about deadlock 
due to qrwlock semantics in RHEL7. So I would certainly like to see if 
the current upstream code may have also this kind of problem.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-15 17:13                 ` Waiman Long
@ 2020-06-15 20:40                   ` Matthew Wilcox
  2020-06-16  0:13                     ` Boqun Feng
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2020-06-15 20:40 UTC (permalink / raw)
  To: Waiman Long
  Cc: Boqun Feng, Peter Zijlstra, Dmitry Vyukov, syzbot,
	Alexey Dobriyan, Andrew Morton, allison, areber, aubrey.li,
	Andrei Vagin, Bruce Fields, Christian Brauner, cyphar,
	Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, sargun, syzkaller-bugs,
	Thomas Gleixner, Al Viro

On Mon, Jun 15, 2020 at 01:13:51PM -0400, Waiman Long wrote:
> On 6/15/20 12:49 PM, Matthew Wilcox wrote:
> > On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
> > > On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
> > > read lock, actually it's only recursive if in_interrupt() is true. So
> > > change the annotation accordingly to catch more deadlocks.
> > [...]
> > 
> > > +#ifdef CONFIG_LOCKDEP
> > > +/*
> > > + * read_lock() is recursive if:
> > > + * 1. We force lockdep think this way in selftests or
> > > + * 2. The implementation is not queued read/write lock or
> > > + * 3. The locker is at an in_interrupt() context.
> > > + */
> > > +static inline bool read_lock_is_recursive(void)
> > > +{
> > > +	return force_read_lock_recursive ||
> > > +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
> > > +	       in_interrupt();
> > > +}
> > I'm a bit uncomfortable with having the _lockdep_ definition of whether
> > a read lock is recursive depend on what the _implementation_ is.
> > The locking semantics should be the same, no matter which architecture
> > you're running on.  If we rely on read locks being recursive in common
> > code then we have a locking bug on architectures which don't use queued
> > rwlocks.
> > 
> > I don't know whether we should just tell the people who aren't using
> > queued rwlocks that they have a new requirement or whether we should
> > say that read locks are never recursive, but having this inconsistency
> > is not a good idea!
> 
> Actually, qrwlock is more restrictive. It is possible that systems with
> qrwlock may hit deadlock which doesn't happens in other systems that use
> recursive rwlock. However, the current lockdep code doesn't detect those
> cases.

Oops.  I misread.  Still, my point stands; we should have the same
definition of how you're allowed to use locks from the lockdep point of
view, even if the underlying implementation won't deadlock on a particular
usage model.


So I'd be happy with:

+	return lockdep_pretend_in_interrupt || in_interrupt();

to allow the test-suite to test that it works as expected, without
actually disabling interrupts while the testsuite runs.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-15 20:40                   ` Matthew Wilcox
@ 2020-06-16  0:13                     ` Boqun Feng
  2020-06-16  0:31                       ` Waiman Long
  0 siblings, 1 reply; 16+ messages in thread
From: Boqun Feng @ 2020-06-16  0:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Waiman Long, Peter Zijlstra, Dmitry Vyukov, syzbot,
	Alexey Dobriyan, Andrew Morton, allison, areber, aubrey.li,
	Andrei Vagin, Bruce Fields, Christian Brauner, cyphar,
	Eric W. Biederman, Greg Kroah-Hartman, guro, Jeff Layton,
	Joel Fernandes, Kees Cook, linmiaohe, linux-fsdevel, LKML,
	Michal Hocko, Ingo Molnar, Oleg Nesterov, sargun, syzkaller-bugs,
	Thomas Gleixner, Al Viro

[-- Attachment #1: Type: text/plain, Size: 3152 bytes --]

Hi Matthew,

On Mon, Jun 15, 2020 at 01:40:46PM -0700, Matthew Wilcox wrote:
> On Mon, Jun 15, 2020 at 01:13:51PM -0400, Waiman Long wrote:
> > On 6/15/20 12:49 PM, Matthew Wilcox wrote:
> > > On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
> > > > On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
> > > > read lock, actually it's only recursive if in_interrupt() is true. So
> > > > change the annotation accordingly to catch more deadlocks.
> > > [...]
> > > 
> > > > +#ifdef CONFIG_LOCKDEP
> > > > +/*
> > > > + * read_lock() is recursive if:
> > > > + * 1. We force lockdep think this way in selftests or
> > > > + * 2. The implementation is not queued read/write lock or
> > > > + * 3. The locker is at an in_interrupt() context.
> > > > + */
> > > > +static inline bool read_lock_is_recursive(void)
> > > > +{
> > > > +	return force_read_lock_recursive ||
> > > > +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
> > > > +	       in_interrupt();
> > > > +}
> > > I'm a bit uncomfortable with having the _lockdep_ definition of whether
> > > a read lock is recursive depend on what the _implementation_ is.
> > > The locking semantics should be the same, no matter which architecture
> > > you're running on.  If we rely on read locks being recursive in common
> > > code then we have a locking bug on architectures which don't use queued
> > > rwlocks.
> > > 
> > > I don't know whether we should just tell the people who aren't using
> > > queued rwlocks that they have a new requirement or whether we should
> > > say that read locks are never recursive, but having this inconsistency
> > > is not a good idea!
> > 
> > Actually, qrwlock is more restrictive. It is possible that systems with
> > qrwlock may hit deadlock which doesn't happens in other systems that use
> > recursive rwlock. However, the current lockdep code doesn't detect those
> > cases.
> 
> Oops.  I misread.  Still, my point stands; we should have the same
> definition of how you're allowed to use locks from the lockdep point of
> view, even if the underlying implementation won't deadlock on a particular
> usage model.
> 

I understand your point, but such a change will require us to notify
almost every developer using rwlocks and help them to get their code
right, and that requires time and work, while currently I want to focus
on the correctness of the detection, and without that being merged, we
don't have a way to detect those problems. So I think it's better that
we have the detection reviewed and tested for a while (given that x86
uses qrwlock, so it will get a lot chances for testing), after that we
we have the confidence (and the tool) to educate people the "new"
semantics of rwlock. So I'd like to keep this patch as it is for now.

> 
> So I'd be happy with:
> 
> +	return lockdep_pretend_in_interrupt || in_interrupt();
> 
> to allow the test-suite to test that it works as expected, without
> actually disabling interrupts while the testsuite runs.

I've used 'force_read_lock_recursive' for this purpose in this patch.

Regards,
Boqun

> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: possible deadlock in send_sigio
  2020-06-16  0:13                     ` Boqun Feng
@ 2020-06-16  0:31                       ` Waiman Long
  0 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-06-16  0:31 UTC (permalink / raw)
  To: Boqun Feng, Matthew Wilcox
  Cc: Peter Zijlstra, Dmitry Vyukov, syzbot, Alexey Dobriyan,
	Andrew Morton, allison, areber, aubrey.li, Andrei Vagin,
	Bruce Fields, Christian Brauner, cyphar, Eric W. Biederman,
	Greg Kroah-Hartman, guro, Jeff Layton, Joel Fernandes, Kees Cook,
	linmiaohe, linux-fsdevel, LKML, Michal Hocko, Ingo Molnar,
	Oleg Nesterov, sargun, syzkaller-bugs, Thomas Gleixner, Al Viro

On 6/15/20 8:13 PM, Boqun Feng wrote:
> Hi Matthew,
>
> On Mon, Jun 15, 2020 at 01:40:46PM -0700, Matthew Wilcox wrote:
>> On Mon, Jun 15, 2020 at 01:13:51PM -0400, Waiman Long wrote:
>>> On 6/15/20 12:49 PM, Matthew Wilcox wrote:
>>>> On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
>>>>> On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
>>>>> read lock, actually it's only recursive if in_interrupt() is true. So
>>>>> change the annotation accordingly to catch more deadlocks.
>>>> [...]
>>>>
>>>>> +#ifdef CONFIG_LOCKDEP
>>>>> +/*
>>>>> + * read_lock() is recursive if:
>>>>> + * 1. We force lockdep think this way in selftests or
>>>>> + * 2. The implementation is not queued read/write lock or
>>>>> + * 3. The locker is at an in_interrupt() context.
>>>>> + */
>>>>> +static inline bool read_lock_is_recursive(void)
>>>>> +{
>>>>> +	return force_read_lock_recursive ||
>>>>> +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
>>>>> +	       in_interrupt();
>>>>> +}
>>>> I'm a bit uncomfortable with having the _lockdep_ definition of whether
>>>> a read lock is recursive depend on what the _implementation_ is.
>>>> The locking semantics should be the same, no matter which architecture
>>>> you're running on.  If we rely on read locks being recursive in common
>>>> code then we have a locking bug on architectures which don't use queued
>>>> rwlocks.
>>>>
>>>> I don't know whether we should just tell the people who aren't using
>>>> queued rwlocks that they have a new requirement or whether we should
>>>> say that read locks are never recursive, but having this inconsistency
>>>> is not a good idea!
>>> Actually, qrwlock is more restrictive. It is possible that systems with
>>> qrwlock may hit deadlock which doesn't happens in other systems that use
>>> recursive rwlock. However, the current lockdep code doesn't detect those
>>> cases.
>> Oops.  I misread.  Still, my point stands; we should have the same
>> definition of how you're allowed to use locks from the lockdep point of
>> view, even if the underlying implementation won't deadlock on a particular
>> usage model.
>>
> I understand your point, but such a change will require us to notify
> almost every developer using rwlocks and help them to get their code
> right, and that requires time and work, while currently I want to focus
> on the correctness of the detection, and without that being merged, we
> don't have a way to detect those problems. So I think it's better that
> we have the detection reviewed and tested for a while (given that x86
> uses qrwlock, so it will get a lot chances for testing), after that we
> we have the confidence (and the tool) to educate people the "new"
> semantics of rwlock. So I'd like to keep this patch as it is for now.

I agreed. We may have architectures that use recursive rwlocks and 
depend on that behavior in the arch specific code. So there can be false 
positive lockdep warnings when we force rwlock to use qrwlock behavior 
for all archs.

With the current patch, we can check the x86 and generic code to make 
sure that they don't have problem first. When other architectures decide 
to use the generic qrwlock later, they can find out if there are some 
inherent problems in their arch specific code.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-06-16  0:31 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-04  5:55 possible deadlock in send_sigio syzbot
2020-06-11  2:32 ` Waiman Long
2020-06-11  7:43   ` Dmitry Vyukov
2020-06-11 13:51     ` Waiman Long
2020-06-11 14:22       ` Peter Zijlstra
2020-06-11 16:09         ` Waiman Long
2020-06-11 23:55           ` Boqun Feng
2020-06-12  1:55             ` Waiman Long
2020-06-12  7:01             ` Boqun Feng
2020-06-15 16:37               ` Waiman Long
2020-06-15 16:49               ` Matthew Wilcox
2020-06-15 17:13                 ` Waiman Long
2020-06-15 20:40                   ` Matthew Wilcox
2020-06-16  0:13                     ` Boqun Feng
2020-06-16  0:31                       ` Waiman Long
2020-06-11 16:07   ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).