bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* corrupted pvqspinlock in htab_map_update_elem
@ 2021-01-31  8:42 Dmitry Vyukov
  2021-02-01  9:50 ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2021-01-31  8:42 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, andrii, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, kpsingh, netdev, bpf,
	LKML, Peter Zijlstra, Ingo Molnar

Hi,

I am testing the following the program:
https://gist.github.com/dvyukov/e5c0a8ef220ef856363c1080b0936a9e
on the latest upstream 6642d600b541b81931fb1ab0c041b0d68f77be7e and
getting the following crash. Config is:
https://gist.github.com/dvyukov/16d9905e5ef35e44285451f1d330ddbc

The program updates a bpf map from a program called on hw breakpoint
hit. Not sure if it's a bpf issue or a perf issue. This time it is not
a fuzzer workload, I am trying to do something useful :)

------------[ cut here ]------------
pvqspinlock: lock 0xffffffff8f371d80 has corrupted value 0x0!
WARNING: CPU: 3 PID: 8771 at kernel/locking/qspinlock_paravirt.h:498
__pv_queued_spin_unlock_slowpath+0x22e/0x2b0
kernel/locking/qspinlock_paravirt.h:498
Modules linked in:
CPU: 3 PID: 8771 Comm: a.out Not tainted 5.11.0-rc5+ #71
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
RIP: 0010:__pv_queued_spin_unlock_slowpath+0x22e/0x2b0
kernel/locking/qspinlock_paravirt.h:498
Code: ea 03 0f b6 14 02 4c 89 e8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2
75 62 41 8b 55 00 4c 89 ee 48 c7 c7 20 6b 4c 89 e8 72 d3 5f 07 <0f> 0b
e9 6cc
RSP: 0018:fffffe00000c17b0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffffffff8f3b5660 RCX: 0000000000000000
RDX: ffff8880150222c0 RSI: ffffffff815b624d RDI: fffffbc0000182e8
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
R10: ffffffff817de94f R11: 0000000000000000 R12: ffff8880150222c0
R13: ffffffff8f371d80 R14: ffff8880181fead8 R15: 0000000000000000
FS:  00007fa5b51f0700(0000) GS:ffff88802cf80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002286908 CR3: 0000000015b24000 CR4: 0000000000750ee0
DR0: 00000000004cb3d4 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <#DB>
 __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
 .slowpath+0x9/0xe
 pv_queued_spin_unlock arch/x86/include/asm/paravirt.h:559 [inline]
 queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
 lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
 debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
 print_usage_bug kernel/locking/lockdep.c:3710 [inline]
 verify_lock_unused kernel/locking/lockdep.c:5374 [inline]
 lock_acquire kernel/locking/lockdep.c:5433 [inline]
 lock_acquire+0x471/0x720 kernel/locking/lockdep.c:5407
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:159
 htab_lock_bucket kernel/bpf/hashtab.c:175 [inline]
 htab_map_update_elem+0x1f0/0x790 kernel/bpf/hashtab.c:1023
 bpf_prog_60236c52b8017ad1+0x8e/0xab4
 bpf_dispatcher_nop_func include/linux/bpf.h:651 [inline]
 bpf_overflow_handler+0x192/0x5b0 kernel/events/core.c:9755
 __perf_event_overflow+0x13c/0x370 kernel/events/core.c:8979
 perf_swevent_overflow kernel/events/core.c:9055 [inline]
 perf_swevent_event+0x347/0x550 kernel/events/core.c:9083
 perf_bp_event+0x1a2/0x1c0 kernel/events/core.c:9932
 hw_breakpoint_handler arch/x86/kernel/hw_breakpoint.c:535 [inline]
 hw_breakpoint_exceptions_notify+0x18a/0x3b0 arch/x86/kernel/hw_breakpoint.c:567
 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
 atomic_notifier_call_chain+0x8d/0x170 kernel/notifier.c:217
 notify_die+0xda/0x170 kernel/notifier.c:548
 notify_debug+0x20/0x30 arch/x86/kernel/traps.c:842
 exc_debug_kernel arch/x86/kernel/traps.c:902 [inline]
 exc_debug+0x103/0x140 arch/x86/kernel/traps.c:998
 asm_exc_debug+0x19/0x30 arch/x86/include/asm/idtentry.h:598
RIP: 0010:copy_user_generic_unrolled+0xa2/0xc0 arch/x86/lib/copy_user_64.S:102
Code: ff c9 75 b6 89 d1 83 e2 07 c1 e9 03 74 12 4c 8b 06 4c 89 07 48
8d 76 08 48 8d 7f 08 ff c9 75 ee 21 d2 74 10 89 d1 8a 06 88 07 <48> ff
c6 484
RSP: 0018:ffffc90000d67af0 EFLAGS: 00040202
RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffff88801341d001 RDI: 00000000004cb3d4
RBP: 00000000004cb3d4 R08: 0000000000000000 R09: ffff88801341d001
R10: ffffed1002683a00 R11: 0000000000000000 R12: ffff88801341d001
R13: 00000000004cb3d5 R14: 0000000000000000 R15: 0000000000000001
 </#DB>
 copy_user_generic arch/x86/include/asm/uaccess_64.h:37 [inline]
 raw_copy_to_user arch/x86/include/asm/uaccess_64.h:58 [inline]
 copyout.part.0+0xe4/0x110 lib/iov_iter.c:148
 copyout lib/iov_iter.c:182 [inline]
 copy_page_to_iter_iovec lib/iov_iter.c:219 [inline]
 copy_page_to_iter+0x416/0xed0 lib/iov_iter.c:926
 pipe_read+0x4a4/0x13a0 fs/pipe.c:290
 call_read_iter include/linux/fs.h:1895 [inline]
 new_sync_read+0x5b7/0x6e0 fs/read_write.c:415
 vfs_read+0x35c/0x570 fs/read_write.c:496
 ksys_read+0x1ee/0x250 fs/read_write.c:634
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x406c1c
Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 f9 fc ff ff
48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d
00 f08
RSP: 002b:00007fa5b51f02d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00000000004cb3d4 RCX: 0000000000406c1c
RDX: 0000000000000001 RSI: 00000000004cb3d4 RDI: 0000000000000004
RBP: 00007fa5b51f030c R08: 0000000000000000 R09: 0000000000000000
R10: 00007fa5b51f0700 R11: 0000000000000246 R12: 00000000004cb3d0
R13: cccccccccccccccd R14: 00007fa5b51f0400 R15: 0000000000802000

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: corrupted pvqspinlock in htab_map_update_elem
  2021-01-31  8:42 corrupted pvqspinlock in htab_map_update_elem Dmitry Vyukov
@ 2021-02-01  9:50 ` Peter Zijlstra
  2021-02-01 10:06   ` Dmitry Vyukov
  2021-02-01 11:23   ` Peter Zijlstra
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2021-02-01  9:50 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexei Starovoitov, Daniel Borkmann, andrii, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, kpsingh, netdev, bpf,
	LKML, Ingo Molnar

On Sun, Jan 31, 2021 at 09:42:53AM +0100, Dmitry Vyukov wrote:
> Hi,
> 
> I am testing the following the program:
> https://gist.github.com/dvyukov/e5c0a8ef220ef856363c1080b0936a9e
> on the latest upstream 6642d600b541b81931fb1ab0c041b0d68f77be7e and
> getting the following crash. Config is:
> https://gist.github.com/dvyukov/16d9905e5ef35e44285451f1d330ddbc
> 
> The program updates a bpf map from a program called on hw breakpoint
> hit. Not sure if it's a bpf issue or a perf issue. This time it is not
> a fuzzer workload, I am trying to do something useful :)

Something useful and BPF don't go together as far as I'm concerned.

> ------------[ cut here ]------------
> pvqspinlock: lock 0xffffffff8f371d80 has corrupted value 0x0!
> WARNING: CPU: 3 PID: 8771 at kernel/locking/qspinlock_paravirt.h:498
> __pv_queued_spin_unlock_slowpath+0x22e/0x2b0
> kernel/locking/qspinlock_paravirt.h:498
> Modules linked in:
> CPU: 3 PID: 8771 Comm: a.out Not tainted 5.11.0-rc5+ #71
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
> RIP: 0010:__pv_queued_spin_unlock_slowpath+0x22e/0x2b0
> kernel/locking/qspinlock_paravirt.h:498
> Code: ea 03 0f b6 14 02 4c 89 e8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2
> 75 62 41 8b 55 00 4c 89 ee 48 c7 c7 20 6b 4c 89 e8 72 d3 5f 07 <0f> 0b
> e9 6cc
> RSP: 0018:fffffe00000c17b0 EFLAGS: 00010086
> RAX: 0000000000000000 RBX: ffffffff8f3b5660 RCX: 0000000000000000
> RDX: ffff8880150222c0 RSI: ffffffff815b624d RDI: fffffbc0000182e8
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> R10: ffffffff817de94f R11: 0000000000000000 R12: ffff8880150222c0
> R13: ffffffff8f371d80 R14: ffff8880181fead8 R15: 0000000000000000
> FS:  00007fa5b51f0700(0000) GS:ffff88802cf80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000002286908 CR3: 0000000015b24000 CR4: 0000000000750ee0
> DR0: 00000000004cb3d4 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
>  <#DB>
>  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
>  .slowpath+0x9/0xe
>  pv_queued_spin_unlock arch/x86/include/asm/paravirt.h:559 [inline]
>  queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
>  lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
>  debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
>  print_usage_bug kernel/locking/lockdep.c:3710 [inline]

Ha, I think you hit a bug in lockdep. But it was about to tell you you
can't go take locks from NMI context that are also used outside of it.

>  verify_lock_unused kernel/locking/lockdep.c:5374 [inline]
>  lock_acquire kernel/locking/lockdep.c:5433 [inline]
>  lock_acquire+0x471/0x720 kernel/locking/lockdep.c:5407
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>  _raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:159
>  htab_lock_bucket kernel/bpf/hashtab.c:175 [inline]
>  htab_map_update_elem+0x1f0/0x790 kernel/bpf/hashtab.c:1023
>  bpf_prog_60236c52b8017ad1+0x8e/0xab4
>  bpf_dispatcher_nop_func include/linux/bpf.h:651 [inline]
>  bpf_overflow_handler+0x192/0x5b0 kernel/events/core.c:9755
>  __perf_event_overflow+0x13c/0x370 kernel/events/core.c:8979
>  perf_swevent_overflow kernel/events/core.c:9055 [inline]
>  perf_swevent_event+0x347/0x550 kernel/events/core.c:9083
>  perf_bp_event+0x1a2/0x1c0 kernel/events/core.c:9932
>  hw_breakpoint_handler arch/x86/kernel/hw_breakpoint.c:535 [inline]
>  hw_breakpoint_exceptions_notify+0x18a/0x3b0 arch/x86/kernel/hw_breakpoint.c:567
>  notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
>  atomic_notifier_call_chain+0x8d/0x170 kernel/notifier.c:217
>  notify_die+0xda/0x170 kernel/notifier.c:548
>  notify_debug+0x20/0x30 arch/x86/kernel/traps.c:842
>  exc_debug_kernel arch/x86/kernel/traps.c:902 [inline]
>  exc_debug+0x103/0x140 arch/x86/kernel/traps.c:998
>  asm_exc_debug+0x19/0x30 arch/x86/include/asm/idtentry.h:598

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: corrupted pvqspinlock in htab_map_update_elem
  2021-02-01  9:50 ` Peter Zijlstra
@ 2021-02-01 10:06   ` Dmitry Vyukov
  2021-02-01 11:23   ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Dmitry Vyukov @ 2021-02-01 10:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexei Starovoitov, Daniel Borkmann, andrii, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, kpsingh, netdev, bpf,
	LKML, Ingo Molnar

On Mon, Feb 1, 2021 at 10:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sun, Jan 31, 2021 at 09:42:53AM +0100, Dmitry Vyukov wrote:
> > Hi,
> >
> > I am testing the following the program:
> > https://gist.github.com/dvyukov/e5c0a8ef220ef856363c1080b0936a9e
> > on the latest upstream 6642d600b541b81931fb1ab0c041b0d68f77be7e and
> > getting the following crash. Config is:
> > https://gist.github.com/dvyukov/16d9905e5ef35e44285451f1d330ddbc
> >
> > The program updates a bpf map from a program called on hw breakpoint
> > hit. Not sure if it's a bpf issue or a perf issue. This time it is not
> > a fuzzer workload, I am trying to do something useful :)
>
> Something useful and BPF don't go together as far as I'm concerned.
>
> > ------------[ cut here ]------------
> > pvqspinlock: lock 0xffffffff8f371d80 has corrupted value 0x0!
> > WARNING: CPU: 3 PID: 8771 at kernel/locking/qspinlock_paravirt.h:498
> > __pv_queued_spin_unlock_slowpath+0x22e/0x2b0
> > kernel/locking/qspinlock_paravirt.h:498
> > Modules linked in:
> > CPU: 3 PID: 8771 Comm: a.out Not tainted 5.11.0-rc5+ #71
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
> > RIP: 0010:__pv_queued_spin_unlock_slowpath+0x22e/0x2b0
> > kernel/locking/qspinlock_paravirt.h:498
> > Code: ea 03 0f b6 14 02 4c 89 e8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2
> > 75 62 41 8b 55 00 4c 89 ee 48 c7 c7 20 6b 4c 89 e8 72 d3 5f 07 <0f> 0b
> > e9 6cc
> > RSP: 0018:fffffe00000c17b0 EFLAGS: 00010086
> > RAX: 0000000000000000 RBX: ffffffff8f3b5660 RCX: 0000000000000000
> > RDX: ffff8880150222c0 RSI: ffffffff815b624d RDI: fffffbc0000182e8
> > RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> > R10: ffffffff817de94f R11: 0000000000000000 R12: ffff8880150222c0
> > R13: ffffffff8f371d80 R14: ffff8880181fead8 R15: 0000000000000000
> > FS:  00007fa5b51f0700(0000) GS:ffff88802cf80000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000002286908 CR3: 0000000015b24000 CR4: 0000000000750ee0
> > DR0: 00000000004cb3d4 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > PKRU: 55555554
> > Call Trace:
> >  <#DB>
> >  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
> >  .slowpath+0x9/0xe
> >  pv_queued_spin_unlock arch/x86/include/asm/paravirt.h:559 [inline]
> >  queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
> >  lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
> >  debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
> >  print_usage_bug kernel/locking/lockdep.c:3710 [inline]
>
> Ha, I think you hit a bug in lockdep. But it was about to tell you you
> can't go take locks from NMI context that are also used outside of it.

Mkay. Perf calls a BPF program from NMI context. Should that program
type be significantly restricted? But even if maps can't be used, is
there anything useful a program invoked from such context can do at
all?

> >  verify_lock_unused kernel/locking/lockdep.c:5374 [inline]
> >  lock_acquire kernel/locking/lockdep.c:5433 [inline]
> >  lock_acquire+0x471/0x720 kernel/locking/lockdep.c:5407
> >  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> >  _raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:159
> >  htab_lock_bucket kernel/bpf/hashtab.c:175 [inline]
> >  htab_map_update_elem+0x1f0/0x790 kernel/bpf/hashtab.c:1023
> >  bpf_prog_60236c52b8017ad1+0x8e/0xab4
> >  bpf_dispatcher_nop_func include/linux/bpf.h:651 [inline]
> >  bpf_overflow_handler+0x192/0x5b0 kernel/events/core.c:9755
> >  __perf_event_overflow+0x13c/0x370 kernel/events/core.c:8979
> >  perf_swevent_overflow kernel/events/core.c:9055 [inline]
> >  perf_swevent_event+0x347/0x550 kernel/events/core.c:9083
> >  perf_bp_event+0x1a2/0x1c0 kernel/events/core.c:9932
> >  hw_breakpoint_handler arch/x86/kernel/hw_breakpoint.c:535 [inline]
> >  hw_breakpoint_exceptions_notify+0x18a/0x3b0 arch/x86/kernel/hw_breakpoint.c:567
> >  notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
> >  atomic_notifier_call_chain+0x8d/0x170 kernel/notifier.c:217
> >  notify_die+0xda/0x170 kernel/notifier.c:548
> >  notify_debug+0x20/0x30 arch/x86/kernel/traps.c:842
> >  exc_debug_kernel arch/x86/kernel/traps.c:902 [inline]
> >  exc_debug+0x103/0x140 arch/x86/kernel/traps.c:998
> >  asm_exc_debug+0x19/0x30 arch/x86/include/asm/idtentry.h:598

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: corrupted pvqspinlock in htab_map_update_elem
  2021-02-01  9:50 ` Peter Zijlstra
  2021-02-01 10:06   ` Dmitry Vyukov
@ 2021-02-01 11:23   ` Peter Zijlstra
  2021-02-01 17:53     ` Waiman Long
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2021-02-01 11:23 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexei Starovoitov, Daniel Borkmann, andrii, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, kpsingh, netdev, bpf,
	LKML, Ingo Molnar

On Mon, Feb 01, 2021 at 10:50:58AM +0100, Peter Zijlstra wrote:

> >  queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
> >  lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
> >  debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
> >  print_usage_bug kernel/locking/lockdep.c:3710 [inline]
> 
> Ha, I think you hit a bug in lockdep.

Something like so I suppose.

---
Subject: locking/lockdep: Avoid unmatched unlock
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Feb 1 11:55:38 CET 2021

Commit f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI"
inversions") overlooked that print_usage_bug() releases the graph_lock
and called it without the graph lock held.

Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/locking/lockdep.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3773,7 +3773,7 @@ static void
 print_usage_bug(struct task_struct *curr, struct held_lock *this,
 		enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit)
 {
-	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
+	if (!debug_locks_off() || debug_locks_silent)
 		return;
 
 	pr_warn("\n");
@@ -3814,6 +3814,7 @@ valid_state(struct task_struct *curr, st
 	    enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit)
 {
 	if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit))) {
+		graph_unlock()
 		print_usage_bug(curr, this, bad_bit, new_bit);
 		return 0;
 	}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: corrupted pvqspinlock in htab_map_update_elem
  2021-02-01 11:23   ` Peter Zijlstra
@ 2021-02-01 17:53     ` Waiman Long
  2021-02-01 18:09       ` Dmitry Vyukov
  0 siblings, 1 reply; 7+ messages in thread
From: Waiman Long @ 2021-02-01 17:53 UTC (permalink / raw)
  To: Peter Zijlstra, Dmitry Vyukov
  Cc: Alexei Starovoitov, Daniel Borkmann, andrii, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, kpsingh, netdev, bpf,
	LKML, Ingo Molnar

On 2/1/21 6:23 AM, Peter Zijlstra wrote:
> On Mon, Feb 01, 2021 at 10:50:58AM +0100, Peter Zijlstra wrote:
>
>>>   queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
>>>   lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
>>>   debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
>>>   print_usage_bug kernel/locking/lockdep.c:3710 [inline]
>> Ha, I think you hit a bug in lockdep.
> Something like so I suppose.
>
> ---
> Subject: locking/lockdep: Avoid unmatched unlock
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Mon Feb 1 11:55:38 CET 2021
>
> Commit f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI"
> inversions") overlooked that print_usage_bug() releases the graph_lock
> and called it without the graph lock held.
>
> Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   kernel/locking/lockdep.c |    3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -3773,7 +3773,7 @@ static void
>   print_usage_bug(struct task_struct *curr, struct held_lock *this,
>   		enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit)
>   {
> -	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
> +	if (!debug_locks_off() || debug_locks_silent)
>   		return;
>   
>   	pr_warn("\n");
> @@ -3814,6 +3814,7 @@ valid_state(struct task_struct *curr, st
>   	    enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit)
>   {
>   	if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit))) {
> +		graph_unlock()
>   		print_usage_bug(curr, this, bad_bit, new_bit);
>   		return 0;
>   	}

I have also suspected doing unlock without a corresponding lock. This 
patch looks good to me.

Acked-by: Waiman Long <longman@redhat.com>

Cheers,
Longman


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: corrupted pvqspinlock in htab_map_update_elem
  2021-02-01 17:53     ` Waiman Long
@ 2021-02-01 18:09       ` Dmitry Vyukov
  2021-02-01 18:14         ` Waiman Long
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2021-02-01 18:09 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann, andrii,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	kpsingh, netdev, bpf, LKML, Ingo Molnar

On Mon, Feb 1, 2021 at 6:54 PM Waiman Long <longman@redhat.com> wrote:
>
> On 2/1/21 6:23 AM, Peter Zijlstra wrote:
> > On Mon, Feb 01, 2021 at 10:50:58AM +0100, Peter Zijlstra wrote:
> >
> >>>   queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
> >>>   lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
> >>>   debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
> >>>   print_usage_bug kernel/locking/lockdep.c:3710 [inline]
> >> Ha, I think you hit a bug in lockdep.
> > Something like so I suppose.
> >
> > ---
> > Subject: locking/lockdep: Avoid unmatched unlock
> > From: Peter Zijlstra <peterz@infradead.org>
> > Date: Mon Feb 1 11:55:38 CET 2021
> >
> > Commit f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI"
> > inversions") overlooked that print_usage_bug() releases the graph_lock
> > and called it without the graph lock held.
> >
> > Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")
> > Reported-by: Dmitry Vyukov <dvyukov@google.com>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >   kernel/locking/lockdep.c |    3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -3773,7 +3773,7 @@ static void
> >   print_usage_bug(struct task_struct *curr, struct held_lock *this,
> >               enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit)
> >   {
> > -     if (!debug_locks_off_graph_unlock() || debug_locks_silent)
> > +     if (!debug_locks_off() || debug_locks_silent)
> >               return;
> >
> >       pr_warn("\n");
> > @@ -3814,6 +3814,7 @@ valid_state(struct task_struct *curr, st
> >           enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit)
> >   {
> >       if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit))) {
> > +             graph_unlock()
> >               print_usage_bug(curr, this, bad_bit, new_bit);
> >               return 0;
> >       }
>
> I have also suspected doing unlock without a corresponding lock. This
> patch looks good to me.
>
> Acked-by: Waiman Long <longman@redhat.com>

Just so that it's not lost: there is still a bug related to bpf map lock, right?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: corrupted pvqspinlock in htab_map_update_elem
  2021-02-01 18:09       ` Dmitry Vyukov
@ 2021-02-01 18:14         ` Waiman Long
  0 siblings, 0 replies; 7+ messages in thread
From: Waiman Long @ 2021-02-01 18:14 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann, andrii,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	kpsingh, netdev, bpf, LKML, Ingo Molnar

On 2/1/21 1:09 PM, Dmitry Vyukov wrote:
> On Mon, Feb 1, 2021 at 6:54 PM Waiman Long <longman@redhat.com> wrote:
>> On 2/1/21 6:23 AM, Peter Zijlstra wrote:
>>> On Mon, Feb 01, 2021 at 10:50:58AM +0100, Peter Zijlstra wrote:
>>>
>>>>>    queued_spin_unlock arch/x86/include/asm/qspinlock.h:56 [inline]
>>>>>    lockdep_unlock+0x10e/0x290 kernel/locking/lockdep.c:124
>>>>>    debug_locks_off_graph_unlock kernel/locking/lockdep.c:165 [inline]
>>>>>    print_usage_bug kernel/locking/lockdep.c:3710 [inline]
>>>> Ha, I think you hit a bug in lockdep.
>>> Something like so I suppose.
>>>
>>> ---
>>> Subject: locking/lockdep: Avoid unmatched unlock
>>> From: Peter Zijlstra <peterz@infradead.org>
>>> Date: Mon Feb 1 11:55:38 CET 2021
>>>
>>> Commit f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI"
>>> inversions") overlooked that print_usage_bug() releases the graph_lock
>>> and called it without the graph lock held.
>>>
>>> Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")
>>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>> ---
>>>    kernel/locking/lockdep.c |    3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> --- a/kernel/locking/lockdep.c
>>> +++ b/kernel/locking/lockdep.c
>>> @@ -3773,7 +3773,7 @@ static void
>>>    print_usage_bug(struct task_struct *curr, struct held_lock *this,
>>>                enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit)
>>>    {
>>> -     if (!debug_locks_off_graph_unlock() || debug_locks_silent)
>>> +     if (!debug_locks_off() || debug_locks_silent)
>>>                return;
>>>
>>>        pr_warn("\n");
>>> @@ -3814,6 +3814,7 @@ valid_state(struct task_struct *curr, st
>>>            enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit)
>>>    {
>>>        if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit))) {
>>> +             graph_unlock()
>>>                print_usage_bug(curr, this, bad_bit, new_bit);
>>>                return 0;
>>>        }
>> I have also suspected doing unlock without a corresponding lock. This
>> patch looks good to me.
>>
>> Acked-by: Waiman Long <longman@redhat.com>
> Just so that it's not lost: there is still a bug related to bpf map lock, right?
>
That is right. This patch just fixes the bug in lockdep.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-02-01 18:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-31  8:42 corrupted pvqspinlock in htab_map_update_elem Dmitry Vyukov
2021-02-01  9:50 ` Peter Zijlstra
2021-02-01 10:06   ` Dmitry Vyukov
2021-02-01 11:23   ` Peter Zijlstra
2021-02-01 17:53     ` Waiman Long
2021-02-01 18:09       ` Dmitry Vyukov
2021-02-01 18:14         ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).