* possible deadlock in bpf_lru_push_free @ 2020-02-16 10:57 syzbot 2020-02-16 12:17 ` syzbot 2020-02-19 0:37 ` syzbot 0 siblings, 2 replies; 7+ messages in thread From: syzbot @ 2020-02-16 10:57 UTC (permalink / raw) To: andriin, ast, bpf, daniel, kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs Hello, syzbot found the following crash on: HEAD commit: 2019fc96 Merge git://git.kernel.org/pub/scm/linux/kernel/g.. git tree: net console output: https://syzkaller.appspot.com/x/log.txt?x=13aa0229e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=735296e4dd620b10 dashboard link: https://syzkaller.appspot.com/bug?extid=122b5421d14e68f29cd1 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com ====================================================== WARNING: possible circular locking dependency detected 5.6.0-rc1-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.3/16820 is trying to acquire lock: ffffe8ffffccc040 (&loc_l->lock){....}, at: bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] ffffe8ffffccc040 (&loc_l->lock){....}, at: bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 but task is already holding lock: ffff88808ceda560 (&htab->buckets[i].lock#2){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&htab->buckets[i].lock#2){....}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline] __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&l->lock){....}: __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 htab_lru_map_update_elem+0x65b/0xba0 kernel/bpf/hashtab.c:950 bpf_map_update_value.isra.0+0x61b/0x8e0 kernel/bpf/syscall.c:206 map_update_elem kernel/bpf/syscall.c:1089 [inline] __do_sys_bpf+0x3163/0x41e0 kernel/bpf/syscall.c:3384 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&loc_l->lock){....}: check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: &loc_l->lock --> &l->lock --> &htab->buckets[i].lock#2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&htab->buckets[i].lock#2); lock(&l->lock); lock(&htab->buckets[i].lock#2); lock(&loc_l->lock); *** DEADLOCK *** 2 locks held by syz-executor.3/16820: #0: ffffffff89bac240 (rcu_read_lock){....}, at: __htab_map_lookup_and_delete_batch+0x54b/0x1540 kernel/bpf/hashtab.c:1308 #1: ffff88808ceda560 (&htab->buckets[i].lock#2){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 stack backtrace: CPU: 0 PID: 16820 Comm: syz-executor.3 Not tainted 5.6.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1684 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808 check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45c6c9 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007efeedbcdc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 00007efeedbce6d4 RCX: 000000000045c6c9 RDX: 0000000000000038 RSI: 00000000200001c0 RDI: 0000000000000019 RBP: 000000000076bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000060 R14: 00000000004c2e9b R15: 000000000076bf2c --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: possible deadlock in bpf_lru_push_free 2020-02-16 10:57 possible deadlock in bpf_lru_push_free syzbot @ 2020-02-16 12:17 ` syzbot 2020-02-19 0:37 ` syzbot 1 sibling, 0 replies; 7+ messages in thread From: syzbot @ 2020-02-16 12:17 UTC (permalink / raw) To: andriin, ast, bpf, daniel, kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs syzbot has found a reproducer for the following crash on: HEAD commit: 2019fc96 Merge git://git.kernel.org/pub/scm/linux/kernel/g.. git tree: net console output: https://syzkaller.appspot.com/x/log.txt?x=1358bb11e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=735296e4dd620b10 dashboard link: https://syzkaller.appspot.com/bug?extid=122b5421d14e68f29cd1 compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14b67d6ee00000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com ====================================================== WARNING: possible circular locking dependency detected 5.6.0-rc1-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.4/13544 is trying to acquire lock: ffffe8ffffcba0b8 (&loc_l->lock){....}, at: bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] ffffe8ffffcba0b8 (&loc_l->lock){....}, at: bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 but task is already holding lock: ffff888094985960 (&htab->buckets[i].lock){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&htab->buckets[i].lock){....}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline] __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&l->lock){....}: __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&loc_l->lock){....}: check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: &loc_l->lock --> &l->lock --> &htab->buckets[i].lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&htab->buckets[i].lock); lock(&l->lock); lock(&htab->buckets[i].lock); lock(&loc_l->lock); *** DEADLOCK *** 2 locks held by syz-executor.4/13544: #0: ffffffff89bac240 (rcu_read_lock){....}, at: __htab_map_lookup_and_delete_batch+0x54b/0x1540 kernel/bpf/hashtab.c:1308 #1: ffff888094985960 (&htab->buckets[i].lock){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 stack backtrace: CPU: 0 PID: 13544 Comm: syz-executor.4 Not tainted 5.6.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1684 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808 check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45c6c9 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007efc51c07c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 00007efc51c086d4 RCX: 000000000045c6c9 RDX: 0000000000000038 RSI: 00000000200001c0 RDI: 0000000000000019 RBP: 000000000076c118 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000060 R14: 00000000004c2e9b R15: 000000000076c124 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: possible deadlock in bpf_lru_push_free 2020-02-16 10:57 possible deadlock in bpf_lru_push_free syzbot 2020-02-16 12:17 ` syzbot @ 2020-02-19 0:37 ` syzbot 1 sibling, 0 replies; 7+ messages in thread From: syzbot @ 2020-02-19 0:37 UTC (permalink / raw) To: andriin, ast, bpf, daniel, hdanton, kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs syzbot has found a reproducer for the following crash on: HEAD commit: e20d3a05 bpf, offload: Replace bitwise AND by logical AND .. git tree: bpf console output: https://syzkaller.appspot.com/x/log.txt?x=1491c481e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=e7d35de59001df38 dashboard link: https://syzkaller.appspot.com/bug?extid=122b5421d14e68f29cd1 compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16ca6c45e00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=134bae29e00000 Bisection is inconclusive: the first bad commit could be any of: 36a375c6 mailmap: add entry for Tiezhu Yang 95c472ff Documentation/ko_KR/howto: Update a broken link ff1e81a7 Documentation: build warnings related to missing blank lines after explicit markups has been fixed 5549c202 Documentation/ko_KR/howto: Update broken web addresses 599e6f8d Documentation: changes.rst: update several outdated project URLs 4bfdebd6 docs/locking: Fix outdated section names d1c9038a Allow git builds of Sphinx 41dcd67e Merge tag 'docs-5.6-2' of git://git.lwn.net/linux bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=17c6c36ee00000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com IPVS: ftp: loaded support on port[0] = 21 ====================================================== WARNING: possible circular locking dependency detected 5.5.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor198/9748 is trying to acquire lock: ffffe8ffffc49158 (&l->lock){....}, at: bpf_lru_list_push_free kernel/bpf/bpf_lru_list.c:313 [inline] ffffe8ffffc49158 (&l->lock){....}, at: bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:532 [inline] ffffe8ffffc49158 (&l->lock){....}, at: bpf_lru_push_free+0xe5/0x5b0 kernel/bpf/bpf_lru_list.c:555 but task is already holding lock: ffff88809f6c3b60 (&htab->buckets[i].lock){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&htab->buckets[i].lock){....}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline] __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 bpf_percpu_lru_pop_free kernel/bpf/bpf_lru_list.c:416 [inline] bpf_lru_pop_free+0xa9f/0x1670 kernel/bpf/bpf_lru_list.c:497 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 map_update_elem kernel/bpf/syscall.c:1089 [inline] __do_sys_bpf+0x3163/0x41e0 kernel/bpf/syscall.c:3384 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&l->lock){....}: check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_lru_list_push_free kernel/bpf/bpf_lru_list.c:313 [inline] bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:532 [inline] bpf_lru_push_free+0xe5/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_percpu_map_lookup_and_delete_batch+0x37/0x40 kernel/bpf/hashtab.c:1474 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&htab->buckets[i].lock); lock(&l->lock); lock(&htab->buckets[i].lock); lock(&l->lock); *** DEADLOCK *** 2 locks held by syz-executor198/9748: #0: ffffffff89bac200 (rcu_read_lock){....}, at: __htab_map_lookup_and_delete_batch+0x54b/0x1540 kernel/bpf/hashtab.c:1308 #1: ffff88809f6c3b60 (&htab->buckets[i].lock){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 stack backtrace: CPU: 0 PID: 9748 Comm: syz-executor198 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1684 check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808 check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_lru_list_push_free kernel/bpf/bpf_lru_list.c:313 [inline] bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:532 [inline] bpf_lru_push_free+0xe5/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_percpu_map_lookup_and_delete_batch+0x37/0x40 kernel/bpf/hashtab.c:1474 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440c09 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff14512e08 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 00000000004a2390 RCX: 0000000000440c09 RDX: 0000000000000038 RSI: 0000000020000100 RDI: 0000000000000019 RBP: 00000000006cb018 R08: 0000000120080522 R09: 0000000120080522 R10: 0000000120080522 R11: 0000000000000246 R12: 0000000000402110 R13: 00000000004021a0 R14: 0000000000000000 R15: 0000000000000000 ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20200217052336.5556-1-hdanton@sina.com>]
* Re: possible deadlock in bpf_lru_push_free [not found] <20200217052336.5556-1-hdanton@sina.com> @ 2020-02-18 17:44 ` Yonghong Song 2020-02-18 23:55 ` Yonghong Song [not found] ` <20200219021542.3304-1-hdanton@sina.com> 0 siblings, 2 replies; 7+ messages in thread From: Yonghong Song @ 2020-02-18 17:44 UTC (permalink / raw) To: Hillf Danton, syzbot Cc: andriin, ast, bpf, daniel, kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs On 2/16/20 9:23 PM, Hillf Danton wrote: > > On Sun, 16 Feb 2020 04:17:09 -0800 >> syzbot has found a reproducer for the following crash on: >> >> HEAD commit: 2019fc96 Merge git://git.kernel.org/pub/scm/linux/kernel/g.. >> git tree: net >> console output: https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_log.txt-3Fx-3D1358bb11e00000&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=zrgWcBnddWkMWG2zm-9nC8EwvHMsuqw_-EEXwl23XLg&e= >> kernel config: https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_.config-3Fx-3D735296e4dd620b10&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=kbT6Yw89JDoIWSQtlLJ7sjyNoP2Ulud27GNorna1zQk&e= >> dashboard link: https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_bug-3Fextid-3D122b5421d14e68f29cd1&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=U3pdUmrcroaeNsJ9DgFbTlvftQUCUcJ1CW_0NxS8yGA&e= >> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >> syz repro: https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.syz-3Fx-3D14b67d6ee00000&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=TuSfjosRFQW3ArpQwikTtx-dgLLBSMgJfVKtUltqQBM&e= >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit: >> Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com >> >> ====================================================== >> WARNING: possible circular locking dependency detected >> 5.6.0-rc1-syzkaller #0 Not tainted >> ------------------------------------------------------ >> syz-executor.4/13544 is trying to acquire lock: >> ffffe8ffffcba0b8 (&loc_l->lock){....}, at: bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] >> ffffe8ffffcba0b8 (&loc_l->lock){....}, at: bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 >> >> but task is already holding lock: >> ffff888094985960 (&htab->buckets[i].lock){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 >> >> which lock already depends on the new lock. >> >> >> the existing dependency chain (in reverse order) is: >> >> -> #2 (&htab->buckets[i].lock){....}: >> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] >> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 >> htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 >> __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline] >> __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 >> bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline] >> bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] >> bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 >> prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 >> __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 >> bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 >> bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 >> generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 >> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >> __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 >> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >> entry_SYSCALL_64_after_hwframe+0x49/0xbe >> >> -> #1 (&l->lock){....}: >> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] >> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 >> bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 [inline] >> bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] >> bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499 >> prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 >> __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 >> bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 >> bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 >> generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 >> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >> __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 >> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >> entry_SYSCALL_64_after_hwframe+0x49/0xbe >> >> -> #0 (&loc_l->lock){....}: >> check_prev_add kernel/locking/lockdep.c:2475 [inline] >> check_prevs_add kernel/locking/lockdep.c:2580 [inline] >> validate_chain kernel/locking/lockdep.c:2970 [inline] >> __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 >> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 >> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] >> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 >> bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] >> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 >> __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 >> htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 >> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >> __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 >> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >> entry_SYSCALL_64_after_hwframe+0x49/0xbe >> >> other info that might help us debug this: >> >> Chain exists of: >> &loc_l->lock --> &l->lock --> &htab->buckets[i].lock >> >> Possible unsafe locking scenario: >> >> CPU0 CPU1 >> ---- ---- >> lock(&htab->buckets[i].lock); >> lock(&l->lock); >> lock(&htab->buckets[i].lock); >> lock(&loc_l->lock); >> >> *** DEADLOCK *** >> >> 2 locks held by syz-executor.4/13544: >> #0: ffffffff89bac240 (rcu_read_lock){....}, at: __htab_map_lookup_and_delete_batch+0x54b/0x1540 kernel/bpf/hashtab.c:1308 >> #1: ffff888094985960 (&htab->buckets[i].lock){....}, at: __htab_map_lookup_and_delete_batch+0x617/0x1540 kernel/bpf/hashtab.c:1322 >> >> stack backtrace: >> CPU: 0 PID: 13544 Comm: syz-executor.4 Not tainted 5.6.0-rc1-syzkaller #0 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >> Call Trace: >> __dump_stack lib/dump_stack.c:77 [inline] >> dump_stack+0x197/0x210 lib/dump_stack.c:118 >> print_circular_bug.isra.0.cold+0x163/0x172 kernel/locking/lockdep.c:1684 >> check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808 >> check_prev_add kernel/locking/lockdep.c:2475 [inline] >> check_prevs_add kernel/locking/lockdep.c:2580 [inline] >> validate_chain kernel/locking/lockdep.c:2970 [inline] >> __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 >> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 >> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] >> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 >> bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] >> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 >> __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 >> htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 >> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >> __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 >> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >> entry_SYSCALL_64_after_hwframe+0x49/0xbe > > Reclaim hash table elememt outside bucket lock. Thanks for the following patch. Yes, we do have an potential issue with the above deadlock if LRU hash map is not preallocated. I am not a RCU expert, but maybe you could you help clarify one thing below? > > --- a/kernel/bpf/hashtab.c > +++ b/kernel/bpf/hashtab.c > @@ -1259,6 +1259,7 @@ __htab_map_lookup_and_delete_batch(struc > u64 elem_map_flags, map_flags; > struct hlist_nulls_head *head; > struct hlist_nulls_node *n; > + struct hlist_nulls_node *node_to_free = NULL; > unsigned long flags; > struct htab_elem *l; > struct bucket *b; > @@ -1370,9 +1371,10 @@ again_nocopy: > } > if (do_delete) { > hlist_nulls_del_rcu(&l->hash_node); > - if (is_lru_map) > - bpf_lru_push_free(&htab->lru, &l->lru_node); > - else > + if (is_lru_map) { > + l->hash_node.next = node_to_free; > + node_to_free = &l->hash_node; Here, we change "next" pointer. How does this may impact the existing parallel map lookup which does not need to take bucket pointer? > + } else > free_htab_elem(htab, l); > } > dst_key += key_size; > @@ -1380,6 +1382,12 @@ again_nocopy: > } > > raw_spin_unlock_irqrestore(&b->lock, flags); > + > + while (node_to_free) { > + l = container_of(node_to_free, struct htab_elem, hash_node); > + node_to_free = node_to_free->next; > + bpf_lru_push_free(&htab->lru, &l->lru_node); > + } > /* If we are not copying data, we can go to next bucket and avoid > * unlocking the rcu. > */ > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: possible deadlock in bpf_lru_push_free 2020-02-18 17:44 ` Yonghong Song @ 2020-02-18 23:55 ` Yonghong Song 2020-02-19 4:55 ` Brian Vazquez [not found] ` <20200219021542.3304-1-hdanton@sina.com> 1 sibling, 1 reply; 7+ messages in thread From: Yonghong Song @ 2020-02-18 23:55 UTC (permalink / raw) To: Hillf Danton, syzbot Cc: andriin, ast, bpf, daniel, kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs On 2/18/20 9:44 AM, Yonghong Song wrote: > > > On 2/16/20 9:23 PM, Hillf Danton wrote: >> >> On Sun, 16 Feb 2020 04:17:09 -0800 >>> syzbot has found a reproducer for the following crash on: >>> >>> HEAD commit: 2019fc96 Merge >>> git://git.kernel.org/pub/scm/linux/kernel/g.. >>> git tree: net >>> console output: >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_log.txt-3Fx-3D1358bb11e00000&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=zrgWcBnddWkMWG2zm-9nC8EwvHMsuqw_-EEXwl23XLg&e= >>> >>> kernel config: >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_.config-3Fx-3D735296e4dd620b10&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=kbT6Yw89JDoIWSQtlLJ7sjyNoP2Ulud27GNorna1zQk&e= >>> >>> dashboard link: >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_bug-3Fextid-3D122b5421d14e68f29cd1&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=U3pdUmrcroaeNsJ9DgFbTlvftQUCUcJ1CW_0NxS8yGA&e= >>> >>> compiler: gcc (GCC) 9.0.0 20181231 (experimental) >>> syz repro: >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.syz-3Fx-3D14b67d6ee00000&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=TuSfjosRFQW3ArpQwikTtx-dgLLBSMgJfVKtUltqQBM&e= >>> >>> >>> IMPORTANT: if you fix the bug, please add the following tag to the >>> commit: >>> Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com >>> >>> ====================================================== >>> WARNING: possible circular locking dependency detected >>> 5.6.0-rc1-syzkaller #0 Not tainted >>> ------------------------------------------------------ >>> syz-executor.4/13544 is trying to acquire lock: >>> ffffe8ffffcba0b8 (&loc_l->lock){....}, at: bpf_common_lru_push_free >>> kernel/bpf/bpf_lru_list.c:516 [inline] >>> ffffe8ffffcba0b8 (&loc_l->lock){....}, at: >>> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 >>> >>> but task is already holding lock: >>> ffff888094985960 (&htab->buckets[i].lock){....}, at: >>> __htab_map_lookup_and_delete_batch+0x617/0x1540 >>> kernel/bpf/hashtab.c:1322 >>> >>> which lock already depends on the new lock. >>> >>> >>> the existing dependency chain (in reverse order) is: >>> >>> -> #2 (&htab->buckets[i].lock){....}: >>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 >>> [inline] >>> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 >>> htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 >>> __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 >>> [inline] >>> __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 >>> bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 >>> [inline] >>> bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] >>> bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 >>> prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 >>> __htab_lru_percpu_map_update_elem+0x67e/0xa90 >>> kernel/bpf/hashtab.c:1069 >>> bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 >>> bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 >>> generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >>> __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe >>> >>> -> #1 (&l->lock){....}: >>> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] >>> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 >>> bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 >>> [inline] >>> bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] >>> bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499 >>> prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 >>> __htab_lru_percpu_map_update_elem+0x67e/0xa90 >>> kernel/bpf/hashtab.c:1069 >>> bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 >>> bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 >>> generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >>> __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe >>> >>> -> #0 (&loc_l->lock){....}: >>> check_prev_add kernel/locking/lockdep.c:2475 [inline] >>> check_prevs_add kernel/locking/lockdep.c:2580 [inline] >>> validate_chain kernel/locking/lockdep.c:2970 [inline] >>> __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 >>> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 >>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 >>> [inline] >>> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 >>> bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] >>> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 >>> __htab_map_lookup_and_delete_batch+0x8d4/0x1540 >>> kernel/bpf/hashtab.c:1374 >>> htab_lru_map_lookup_and_delete_batch+0x34/0x40 >>> kernel/bpf/hashtab.c:1491 >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >>> __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe >>> >>> other info that might help us debug this: >>> >>> Chain exists of: >>> &loc_l->lock --> &l->lock --> &htab->buckets[i].lock >>> >>> Possible unsafe locking scenario: >>> >>> CPU0 CPU1 >>> ---- ---- >>> lock(&htab->buckets[i].lock); >>> lock(&l->lock); >>> lock(&htab->buckets[i].lock); >>> lock(&loc_l->lock); >>> >>> *** DEADLOCK *** >>> >>> 2 locks held by syz-executor.4/13544: >>> #0: ffffffff89bac240 (rcu_read_lock){....}, at: >>> __htab_map_lookup_and_delete_batch+0x54b/0x1540 >>> kernel/bpf/hashtab.c:1308 >>> #1: ffff888094985960 (&htab->buckets[i].lock){....}, at: >>> __htab_map_lookup_and_delete_batch+0x617/0x1540 >>> kernel/bpf/hashtab.c:1322 >>> >>> stack backtrace: >>> CPU: 0 PID: 13544 Comm: syz-executor.4 Not tainted >>> 5.6.0-rc1-syzkaller #0 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, >>> BIOS Google 01/01/2011 >>> Call Trace: >>> __dump_stack lib/dump_stack.c:77 [inline] >>> dump_stack+0x197/0x210 lib/dump_stack.c:118 >>> print_circular_bug.isra.0.cold+0x163/0x172 >>> kernel/locking/lockdep.c:1684 >>> check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808 >>> check_prev_add kernel/locking/lockdep.c:2475 [inline] >>> check_prevs_add kernel/locking/lockdep.c:2580 [inline] >>> validate_chain kernel/locking/lockdep.c:2970 [inline] >>> __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 >>> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 >>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] >>> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 >>> bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] >>> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 >>> __htab_map_lookup_and_delete_batch+0x8d4/0x1540 >>> kernel/bpf/hashtab.c:1374 >>> htab_lru_map_lookup_and_delete_batch+0x34/0x40 >>> kernel/bpf/hashtab.c:1491 >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 >>> __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe >> >> Reclaim hash table elememt outside bucket lock. > > Thanks for the following patch. Yes, we do have an potential issue > with the above deadlock if LRU hash map is not preallocated. > > I am not a RCU expert, but maybe you could you help clarify > one thing below? > >> >> --- a/kernel/bpf/hashtab.c >> +++ b/kernel/bpf/hashtab.c >> @@ -1259,6 +1259,7 @@ __htab_map_lookup_and_delete_batch(struc >> u64 elem_map_flags, map_flags; >> struct hlist_nulls_head *head; >> struct hlist_nulls_node *n; >> + struct hlist_nulls_node *node_to_free = NULL; >> unsigned long flags; >> struct htab_elem *l; >> struct bucket *b; >> @@ -1370,9 +1371,10 @@ again_nocopy: >> } >> if (do_delete) { >> hlist_nulls_del_rcu(&l->hash_node); >> - if (is_lru_map) >> - bpf_lru_push_free(&htab->lru, &l->lru_node); >> - else >> + if (is_lru_map) { >> + l->hash_node.next = node_to_free; >> + node_to_free = &l->hash_node; > > Here, we change "next" pointer. How does this may impact the existing > parallel map lookup which does not need to take bucket pointer? Thanks for Martin for explanation! I think changing l->hash_node.next is unsafe here as another thread may execute on a different cpu and traverse the same list. It will see hash_node.next = NULL and it is unexpected. How about the following patch? diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 2d182c4ee9d9..246ef0f2e985 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -56,6 +56,7 @@ struct htab_elem { union { struct bpf_htab *htab; struct pcpu_freelist_node fnode; + struct htab_elem *link; }; }; }; @@ -1256,6 +1257,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, void __user *ukeys = u64_to_user_ptr(attr->batch.keys); void *ubatch = u64_to_user_ptr(attr->batch.in_batch); u32 batch, max_count, size, bucket_size; + struct htab_elem *node_to_free = NULL; u64 elem_map_flags, map_flags; struct hlist_nulls_head *head; struct hlist_nulls_node *n; @@ -1370,9 +1372,14 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, } if (do_delete) { hlist_nulls_del_rcu(&l->hash_node); - if (is_lru_map) - bpf_lru_push_free(&htab->lru, &l->lru_node); - else + if (is_lru_map) { + /* l->hnode overlaps with * l->hash_node.pprev + * in memory. l->hash_node.pprev has been + * poisoned and nobody should access it. + */ + l->link = node_to_free; + node_to_free = l; + } else free_htab_elem(htab, l); } dst_key += key_size; @@ -1380,6 +1387,13 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, } raw_spin_unlock_irqrestore(&b->lock, flags); + + while (node_to_free) { + l = node_to_free; + node_to_free = node_to_free->link; + bpf_lru_push_free(&htab->lru, &l->lru_node); + } + /* If we are not copying data, we can go to next bucket and avoid * unlocking the rcu. */ ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: possible deadlock in bpf_lru_push_free 2020-02-18 23:55 ` Yonghong Song @ 2020-02-19 4:55 ` Brian Vazquez 0 siblings, 0 replies; 7+ messages in thread From: Brian Vazquez @ 2020-02-19 4:55 UTC (permalink / raw) To: Yonghong Song Cc: Hillf Danton, syzbot, andriin, Alexei Starovoitov, bpf, Daniel Borkmann, kafai, LKML, netdev, songliubraving, syzkaller-bugs On Tue, Feb 18, 2020 at 3:56 PM Yonghong Song <yhs@fb.com> wrote: > > > > On 2/18/20 9:44 AM, Yonghong Song wrote: > > > > > > On 2/16/20 9:23 PM, Hillf Danton wrote: > >> > >> On Sun, 16 Feb 2020 04:17:09 -0800 > >>> syzbot has found a reproducer for the following crash on: > >>> > >>> HEAD commit: 2019fc96 Merge > >>> git://git.kernel.org/pub/scm/linux/kernel/g.. > >>> git tree: net > >>> console output: > >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_log.txt-3Fx-3D1358bb11e00000&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=zrgWcBnddWkMWG2zm-9nC8EwvHMsuqw_-EEXwl23XLg&e= > >>> > >>> kernel config: > >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_.config-3Fx-3D735296e4dd620b10&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=kbT6Yw89JDoIWSQtlLJ7sjyNoP2Ulud27GNorna1zQk&e= > >>> > >>> dashboard link: > >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_bug-3Fextid-3D122b5421d14e68f29cd1&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=U3pdUmrcroaeNsJ9DgFbTlvftQUCUcJ1CW_0NxS8yGA&e= > >>> > >>> compiler: gcc (GCC) 9.0.0 20181231 (experimental) > >>> syz repro: > >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.syz-3Fx-3D14b67d6ee00000&d=DwIDAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=npe_gMkFnfxt6F5dGLs6zsNHWkYM30LkMFOk1_ZR1w8&s=TuSfjosRFQW3ArpQwikTtx-dgLLBSMgJfVKtUltqQBM&e= > >>> > >>> > >>> IMPORTANT: if you fix the bug, please add the following tag to the > >>> commit: > >>> Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com > >>> > >>> ====================================================== > >>> WARNING: possible circular locking dependency detected > >>> 5.6.0-rc1-syzkaller #0 Not tainted > >>> ------------------------------------------------------ > >>> syz-executor.4/13544 is trying to acquire lock: > >>> ffffe8ffffcba0b8 (&loc_l->lock){....}, at: bpf_common_lru_push_free > >>> kernel/bpf/bpf_lru_list.c:516 [inline] > >>> ffffe8ffffcba0b8 (&loc_l->lock){....}, at: > >>> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 > >>> > >>> but task is already holding lock: > >>> ffff888094985960 (&htab->buckets[i].lock){....}, at: > >>> __htab_map_lookup_and_delete_batch+0x617/0x1540 > >>> kernel/bpf/hashtab.c:1322 > >>> > >>> which lock already depends on the new lock. > >>> > >>> > >>> the existing dependency chain (in reverse order) is: > >>> > >>> -> #2 (&htab->buckets[i].lock){....}: > >>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 > >>> [inline] > >>> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 > >>> htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 > >>> __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 > >>> [inline] > >>> __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 > >>> bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 > >>> [inline] > >>> bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] > >>> bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 > >>> prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 > >>> __htab_lru_percpu_map_update_elem+0x67e/0xa90 > >>> kernel/bpf/hashtab.c:1069 > >>> bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 > >>> bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 > >>> generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 > >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 > >>> __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 > >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] > >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 > >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 > >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe > >>> > >>> -> #1 (&l->lock){....}: > >>> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] > >>> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 > >>> bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:325 > >>> [inline] > >>> bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] > >>> bpf_lru_pop_free+0x67f/0x1670 kernel/bpf/bpf_lru_list.c:499 > >>> prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 > >>> __htab_lru_percpu_map_update_elem+0x67e/0xa90 > >>> kernel/bpf/hashtab.c:1069 > >>> bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 > >>> bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 > >>> generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 > >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 > >>> __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 > >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] > >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 > >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 > >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe > >>> > >>> -> #0 (&loc_l->lock){....}: > >>> check_prev_add kernel/locking/lockdep.c:2475 [inline] > >>> check_prevs_add kernel/locking/lockdep.c:2580 [inline] > >>> validate_chain kernel/locking/lockdep.c:2970 [inline] > >>> __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 > >>> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 > >>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 > >>> [inline] > >>> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 > >>> bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] > >>> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 > >>> __htab_map_lookup_and_delete_batch+0x8d4/0x1540 > >>> kernel/bpf/hashtab.c:1374 > >>> htab_lru_map_lookup_and_delete_batch+0x34/0x40 > >>> kernel/bpf/hashtab.c:1491 > >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 > >>> __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 > >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] > >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 > >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 > >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe > >>> > >>> other info that might help us debug this: > >>> > >>> Chain exists of: > >>> &loc_l->lock --> &l->lock --> &htab->buckets[i].lock > >>> > >>> Possible unsafe locking scenario: > >>> > >>> CPU0 CPU1 > >>> ---- ---- > >>> lock(&htab->buckets[i].lock); > >>> lock(&l->lock); > >>> lock(&htab->buckets[i].lock); > >>> lock(&loc_l->lock); > >>> > >>> *** DEADLOCK *** > >>> > >>> 2 locks held by syz-executor.4/13544: > >>> #0: ffffffff89bac240 (rcu_read_lock){....}, at: > >>> __htab_map_lookup_and_delete_batch+0x54b/0x1540 > >>> kernel/bpf/hashtab.c:1308 > >>> #1: ffff888094985960 (&htab->buckets[i].lock){....}, at: > >>> __htab_map_lookup_and_delete_batch+0x617/0x1540 > >>> kernel/bpf/hashtab.c:1322 > >>> > >>> stack backtrace: > >>> CPU: 0 PID: 13544 Comm: syz-executor.4 Not tainted > >>> 5.6.0-rc1-syzkaller #0 > >>> Hardware name: Google Google Compute Engine/Google Compute Engine, > >>> BIOS Google 01/01/2011 > >>> Call Trace: > >>> __dump_stack lib/dump_stack.c:77 [inline] > >>> dump_stack+0x197/0x210 lib/dump_stack.c:118 > >>> print_circular_bug.isra.0.cold+0x163/0x172 > >>> kernel/locking/lockdep.c:1684 > >>> check_noncircular+0x32e/0x3e0 kernel/locking/lockdep.c:1808 > >>> check_prev_add kernel/locking/lockdep.c:2475 [inline] > >>> check_prevs_add kernel/locking/lockdep.c:2580 [inline] > >>> validate_chain kernel/locking/lockdep.c:2970 [inline] > >>> __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 > >>> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 > >>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] > >>> _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 > >>> bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] > >>> bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 > >>> __htab_map_lookup_and_delete_batch+0x8d4/0x1540 > >>> kernel/bpf/hashtab.c:1374 > >>> htab_lru_map_lookup_and_delete_batch+0x34/0x40 > >>> kernel/bpf/hashtab.c:1491 > >>> bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 > >>> __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 > >>> __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] > >>> __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 > >>> do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 > >>> entry_SYSCALL_64_after_hwframe+0x49/0xbe > >> > >> Reclaim hash table elememt outside bucket lock. > > > > Thanks for the following patch. Yes, we do have an potential issue > > with the above deadlock if LRU hash map is not preallocated. > > > > I am not a RCU expert, but maybe you could you help clarify > > one thing below? > > > >> > >> --- a/kernel/bpf/hashtab.c > >> +++ b/kernel/bpf/hashtab.c > >> @@ -1259,6 +1259,7 @@ __htab_map_lookup_and_delete_batch(struc > >> u64 elem_map_flags, map_flags; > >> struct hlist_nulls_head *head; > >> struct hlist_nulls_node *n; > >> + struct hlist_nulls_node *node_to_free = NULL; > >> unsigned long flags; > >> struct htab_elem *l; > >> struct bucket *b; > >> @@ -1370,9 +1371,10 @@ again_nocopy: > >> } > >> if (do_delete) { > >> hlist_nulls_del_rcu(&l->hash_node); > >> - if (is_lru_map) > >> - bpf_lru_push_free(&htab->lru, &l->lru_node); > >> - else > >> + if (is_lru_map) { > >> + l->hash_node.next = node_to_free; > >> + node_to_free = &l->hash_node; > > > > Here, we change "next" pointer. How does this may impact the existing > > parallel map lookup which does not need to take bucket pointer? > > Thanks for Martin for explanation! I think changing l->hash_node.next is > unsafe here as another thread may execute on a different cpu and > traverse the same list. It will see hash_node.next = NULL and it is > unexpected. > > How about the following patch? I think I'm missing some emails here, but overall the patch looks good to me. > > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c > index 2d182c4ee9d9..246ef0f2e985 100644 > --- a/kernel/bpf/hashtab.c > +++ b/kernel/bpf/hashtab.c > @@ -56,6 +56,7 @@ struct htab_elem { > union { > struct bpf_htab *htab; > struct pcpu_freelist_node fnode; > + struct htab_elem *link; > }; > }; > }; > @@ -1256,6 +1257,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map > *map, > void __user *ukeys = u64_to_user_ptr(attr->batch.keys); > void *ubatch = u64_to_user_ptr(attr->batch.in_batch); > u32 batch, max_count, size, bucket_size; > + struct htab_elem *node_to_free = NULL; > u64 elem_map_flags, map_flags; > struct hlist_nulls_head *head; > struct hlist_nulls_node *n; > @@ -1370,9 +1372,14 @@ __htab_map_lookup_and_delete_batch(struct bpf_map > *map, > } > if (do_delete) { > hlist_nulls_del_rcu(&l->hash_node); > - if (is_lru_map) > - bpf_lru_push_free(&htab->lru, &l->lru_node); > - else > + if (is_lru_map) { > + /* l->hnode overlaps with * > l->hash_node.pprev > + * in memory. l->hash_node.pprev has been > + * poisoned and nobody should access it. > + */ > + l->link = node_to_free; > + node_to_free = l; > + } else > free_htab_elem(htab, l); > } > dst_key += key_size; > @@ -1380,6 +1387,13 @@ __htab_map_lookup_and_delete_batch(struct bpf_map > *map, > } > > raw_spin_unlock_irqrestore(&b->lock, flags); > + > + while (node_to_free) { > + l = node_to_free; > + node_to_free = node_to_free->link; > + bpf_lru_push_free(&htab->lru, &l->lru_node); > + } > + > /* If we are not copying data, we can go to next bucket and avoid > * unlocking the rcu. > */ > > ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20200219021542.3304-1-hdanton@sina.com>]
* Re: possible deadlock in bpf_lru_push_free [not found] ` <20200219021542.3304-1-hdanton@sina.com> @ 2020-02-19 4:03 ` Yonghong Song 0 siblings, 0 replies; 7+ messages in thread From: Yonghong Song @ 2020-02-19 4:03 UTC (permalink / raw) To: Hillf Danton Cc: syzbot, andriin, ast, bpf, daniel, kafai, linux-kernel, netdev, songliubraving, syzkaller-bugs On 2/18/20 6:15 PM, Hillf Danton wrote: > > Hey > > On Tue, 18 Feb 2020 15:55:02 -0800 Yonghong Song wrote: >> >> Thanks for Martin for explanation! I think changing l->hash_node.next is >> unsafe here as another thread may execute on a different cpu and >> traverse the same list. It will see hash_node.next = NULL and it is > > Good catch. > >> unexpected. >> >> How about the following patch? >> > Looks nicer, thanks :P > >> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c >> index 2d182c4ee9d9..246ef0f2e985 100644 >> --- a/kernel/bpf/hashtab.c >> +++ b/kernel/bpf/hashtab.c >> @@ -56,6 +56,7 @@ struct htab_elem { >> union { >> struct bpf_htab *htab; >> struct pcpu_freelist_node fnode; >> + struct htab_elem *link; >> }; >> }; >> }; >> @@ -1256,6 +1257,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, >> void __user *ukeys = u64_to_user_ptr(attr->batch.keys); >> void *ubatch = u64_to_user_ptr(attr->batch.in_batch); >> u32 batch, max_count, size, bucket_size; >> + struct htab_elem *node_to_free = NULL; >> u64 elem_map_flags, map_flags; >> struct hlist_nulls_head *head; >> struct hlist_nulls_node *n; >> @@ -1370,9 +1372,14 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map, >> } >> if (do_delete) { >> hlist_nulls_del_rcu(&l->hash_node); >> - if (is_lru_map) >> - bpf_lru_push_free(&htab->lru, &l->lru_node); >> - else >> + if (is_lru_map) { >> + /* l->hnode overlaps with *l->hash_node.pprev > > nit: looks like you mean l->link Yes, my previous attempt uses "hnode" and later changed to "link" but forget to change the comments. Will post a patch soon. > >> + * in memory. l->hash_node.pprev has been >> + * poisoned and nobody should access it. >> + */ >> + l->link = node_to_free; >> + node_to_free = l; >> + } else >> free_htab_elem(htab, l); >> } >> dst_key += key_size; > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-02-19 4:55 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-02-16 10:57 possible deadlock in bpf_lru_push_free syzbot 2020-02-16 12:17 ` syzbot 2020-02-19 0:37 ` syzbot [not found] <20200217052336.5556-1-hdanton@sina.com> 2020-02-18 17:44 ` Yonghong Song 2020-02-18 23:55 ` Yonghong Song 2020-02-19 4:55 ` Brian Vazquez [not found] ` <20200219021542.3304-1-hdanton@sina.com> 2020-02-19 4:03 ` Yonghong Song
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).