bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [syzbot] possible deadlock in ktime_get_coarse_ts64
       [not found] ` <87lf224uki.ffs@tglx>
@ 2021-11-05 15:53   ` Alexei Starovoitov
  2021-11-05 17:03     ` Dmitrii Banshchikov
  0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2021-11-05 15:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, John Stultz, LKML, sboyd, syzkaller-bugs, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Alexei Starovoitov,
	Daniel Borkmann, bpf, Dmitrii Banshchikov

On Fri, Nov 5, 2021 at 6:10 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> >
> > -> #0 (tk_core.seq.seqcount){----}-{0:0}:
> >        check_prev_add kernel/locking/lockdep.c:3051 [inline]
> >        check_prevs_add kernel/locking/lockdep.c:3174 [inline]
> >        validate_chain+0x1dfb/0x8240 kernel/locking/lockdep.c:3789
> >        __lock_acquire+0x1382/0x2b00 kernel/locking/lockdep.c:5015
> >        lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5625
> >        seqcount_lockdep_reader_access+0xfe/0x230 include/linux/seqlock.h:103
> >        ktime_get_coarse_ts64+0x25/0x110 kernel/time/timekeeping.c:2255
> >        ktime_get_coarse include/linux/timekeeping.h:120 [inline]
> >        ktime_get_coarse_ns include/linux/timekeeping.h:126 [inline]
>
> --> this call is invalid
>
> >        ____bpf_ktime_get_coarse_ns kernel/bpf/helpers.c:173 [inline]
> >        bpf_ktime_get_coarse_ns+0x7e/0x130 kernel/bpf/helpers.c:171
> >        bpf_prog_a99735ebafdda2f1+0x10/0xb50
> >        bpf_dispatcher_nop_func include/linux/bpf.h:721 [inline]
> >        __bpf_prog_run include/linux/filter.h:626 [inline]
> >        bpf_prog_run include/linux/filter.h:633 [inline]
> >        BPF_PROG_RUN_ARRAY include/linux/bpf.h:1294 [inline]
> >        trace_call_bpf+0x2cf/0x5d0 kernel/trace/bpf_trace.c:127
> >        perf_trace_run_bpf_submit+0x7b/0x1d0 kernel/events/core.c:9708
> >        perf_trace_lock+0x37c/0x440 include/trace/events/lock.h:39
> >        trace_lock_release+0x128/0x150 include/trace/events/lock.h:58
>
> Timestamps from within a tracepoint can only be taken with:
>
>          1) jiffies
>          2) sched_clock()
>          3) ktime_get_*_fast_ns()
>
> Those are NMI safe and can be invoked from anywhere.
>
> All other time getters which have to use the timekeeping seqcount
> protection are prone to live locks and _cannot_ be used from
> tracepoints ever.

Obviously.
That helper was added for networking use cases and accidentally
enabled for tracing.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [syzbot] possible deadlock in ktime_get_coarse_ts64
  2021-11-05 15:53   ` [syzbot] possible deadlock in ktime_get_coarse_ts64 Alexei Starovoitov
@ 2021-11-05 17:03     ` Dmitrii Banshchikov
  2021-11-05 17:24       ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitrii Banshchikov @ 2021-11-05 17:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Thomas Gleixner, syzbot, John Stultz, LKML, sboyd,
	syzkaller-bugs, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Alexei Starovoitov, Daniel Borkmann, bpf

On Fri, Nov 05, 2021 at 08:53:06AM -0700, Alexei Starovoitov wrote:
> On Fri, Nov 5, 2021 at 6:10 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > >
> > > -> #0 (tk_core.seq.seqcount){----}-{0:0}:
> > >        check_prev_add kernel/locking/lockdep.c:3051 [inline]
> > >        check_prevs_add kernel/locking/lockdep.c:3174 [inline]
> > >        validate_chain+0x1dfb/0x8240 kernel/locking/lockdep.c:3789
> > >        __lock_acquire+0x1382/0x2b00 kernel/locking/lockdep.c:5015
> > >        lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5625
> > >        seqcount_lockdep_reader_access+0xfe/0x230 include/linux/seqlock.h:103
> > >        ktime_get_coarse_ts64+0x25/0x110 kernel/time/timekeeping.c:2255
> > >        ktime_get_coarse include/linux/timekeeping.h:120 [inline]
> > >        ktime_get_coarse_ns include/linux/timekeeping.h:126 [inline]
> >
> > --> this call is invalid
> >
> > >        ____bpf_ktime_get_coarse_ns kernel/bpf/helpers.c:173 [inline]
> > >        bpf_ktime_get_coarse_ns+0x7e/0x130 kernel/bpf/helpers.c:171
> > >        bpf_prog_a99735ebafdda2f1+0x10/0xb50
> > >        bpf_dispatcher_nop_func include/linux/bpf.h:721 [inline]
> > >        __bpf_prog_run include/linux/filter.h:626 [inline]
> > >        bpf_prog_run include/linux/filter.h:633 [inline]
> > >        BPF_PROG_RUN_ARRAY include/linux/bpf.h:1294 [inline]
> > >        trace_call_bpf+0x2cf/0x5d0 kernel/trace/bpf_trace.c:127
> > >        perf_trace_run_bpf_submit+0x7b/0x1d0 kernel/events/core.c:9708
> > >        perf_trace_lock+0x37c/0x440 include/trace/events/lock.h:39
> > >        trace_lock_release+0x128/0x150 include/trace/events/lock.h:58
> >
> > Timestamps from within a tracepoint can only be taken with:
> >
> >          1) jiffies
> >          2) sched_clock()
> >          3) ktime_get_*_fast_ns()
> >
> > Those are NMI safe and can be invoked from anywhere.
> >
> > All other time getters which have to use the timekeeping seqcount
> > protection are prone to live locks and _cannot_ be used from
> > tracepoints ever.
> 
> Obviously.
> That helper was added for networking use cases and accidentally
> enabled for tracing.

Sorry for that.
I'm preparing a patch that will forbid using bpf_ktime_get_coarse_ns()
helper in BPF_LINK_TYPE_RAW_TRACEPOINT.



-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [syzbot] possible deadlock in ktime_get_coarse_ts64
  2021-11-05 17:03     ` Dmitrii Banshchikov
@ 2021-11-05 17:24       ` Thomas Gleixner
  2021-11-06 20:07         ` Dmitrii Banshchikov
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2021-11-05 17:24 UTC (permalink / raw)
  To: Dmitrii Banshchikov, Alexei Starovoitov
  Cc: syzbot, John Stultz, LKML, sboyd, syzkaller-bugs, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Alexei Starovoitov,
	Daniel Borkmann, bpf

On Fri, Nov 05 2021 at 21:03, Dmitrii Banshchikov wrote:
> On Fri, Nov 05, 2021 at 08:53:06AM -0700, Alexei Starovoitov wrote:
>> > Timestamps from within a tracepoint can only be taken with:
>> >
>> >          1) jiffies
>> >          2) sched_clock()
>> >          3) ktime_get_*_fast_ns()
>> >
>> > Those are NMI safe and can be invoked from anywhere.
>> >
>> > All other time getters which have to use the timekeeping seqcount
>> > protection are prone to live locks and _cannot_ be used from
>> > tracepoints ever.
>> 
>> Obviously.
>> That helper was added for networking use cases and accidentally
>> enabled for tracing.
>
> Sorry for that.
> I'm preparing a patch that will forbid using bpf_ktime_get_coarse_ns()
> helper in BPF_LINK_TYPE_RAW_TRACEPOINT.

It cannot be used in TRACING and PERF_EVENT either. But those contexts
have to exclude other functions as well:

     bpf_ktime_get_ns
     bpf_ktime_get_boot_ns

along with

    bpf_spin_lock/unlock
    bpf_timer_*

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [syzbot] possible deadlock in ktime_get_coarse_ts64
  2021-11-05 17:24       ` Thomas Gleixner
@ 2021-11-06 20:07         ` Dmitrii Banshchikov
  2021-11-07 10:32           ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitrii Banshchikov @ 2021-11-06 20:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Alexei Starovoitov, syzbot, John Stultz, LKML, sboyd,
	syzkaller-bugs, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Alexei Starovoitov, Daniel Borkmann, bpf

On Fri, Nov 05, 2021 at 06:24:30PM +0100, Thomas Gleixner wrote:
> On Fri, Nov 05 2021 at 21:03, Dmitrii Banshchikov wrote:
> > On Fri, Nov 05, 2021 at 08:53:06AM -0700, Alexei Starovoitov wrote:
> >> > Timestamps from within a tracepoint can only be taken with:
> >> >
> >> >          1) jiffies
> >> >          2) sched_clock()
> >> >          3) ktime_get_*_fast_ns()
> >> >
> >> > Those are NMI safe and can be invoked from anywhere.
> >> >
> >> > All other time getters which have to use the timekeeping seqcount
> >> > protection are prone to live locks and _cannot_ be used from
> >> > tracepoints ever.
> >> 
> >> Obviously.
> >> That helper was added for networking use cases and accidentally
> >> enabled for tracing.
> >
> > Sorry for that.
> > I'm preparing a patch that will forbid using bpf_ktime_get_coarse_ns()
> > helper in BPF_LINK_TYPE_RAW_TRACEPOINT.
> 
> It cannot be used in TRACING and PERF_EVENT either. But those contexts
> have to exclude other functions as well:
> 
>      bpf_ktime_get_ns
>      bpf_ktime_get_boot_ns
> 
> along with
> 
>     bpf_spin_lock/unlock
>     bpf_timer_*

1) bpf_ktime_get_ns and bpf_ktime_get_boot_ns use
ktime_get_{mono,boot}_fast_ns.
2) bpf_spin_lock/unlock have notrace attribute set.
3) bpf_timer_* helpers fail early if they are in NMI.

Why they have to be excluded?



-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [syzbot] possible deadlock in ktime_get_coarse_ts64
  2021-11-06 20:07         ` Dmitrii Banshchikov
@ 2021-11-07 10:32           ` Thomas Gleixner
  2021-11-07 13:51             ` Dmitrii Banshchikov
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2021-11-07 10:32 UTC (permalink / raw)
  To: Dmitrii Banshchikov
  Cc: Alexei Starovoitov, syzbot, John Stultz, LKML, sboyd,
	syzkaller-bugs, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Alexei Starovoitov, Daniel Borkmann, bpf

Dmitrii,

On Sun, Nov 07 2021 at 00:07, Dmitrii Banshchikov wrote:
> On Fri, Nov 05, 2021 at 06:24:30PM +0100, Thomas Gleixner wrote:
>> It cannot be used in TRACING and PERF_EVENT either. But those contexts
>> have to exclude other functions as well:
>> 
>>      bpf_ktime_get_ns
>>      bpf_ktime_get_boot_ns
>> 
>> along with
>> 
>>     bpf_spin_lock/unlock
>>     bpf_timer_*
>
> 1) bpf_ktime_get_ns and bpf_ktime_get_boot_ns use
> ktime_get_{mono,boot}_fast_ns.

Ok. That's fine then. I was just going from the bpf function names and
missed the implementation detail.

> 2) bpf_spin_lock/unlock have notrace attribute set.

How is that supposed to help?

You cannot take a spinlock from NMI context if that same lock can be
taken by other contexts as well.

Also notrace on the public function is not guaranteeing that the inlines
(as defined) are not traceable and it does not exclude it from being
kprobed.

> 3) bpf_timer_* helpers fail early if they are in NMI.
>
> Why they have to be excluded?

Because timers take locks and you can just end up in the very same
situation that you create invers lock dependencies or deadlocks when you
use that from a tracepoint.

hrtimer_start()
  lock_base();
  trace_hrtimer...()
    perf_event()
      bpf_run()
        bpf_timer_start()
          hrtimer_start()
            lock_base()         <- DEADLOCK

Tracepoints and perf events are very limited in what they can actually
do. Just because it's BPF these rules are not magically going away.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [syzbot] possible deadlock in ktime_get_coarse_ts64
  2021-11-07 10:32           ` Thomas Gleixner
@ 2021-11-07 13:51             ` Dmitrii Banshchikov
  0 siblings, 0 replies; 6+ messages in thread
From: Dmitrii Banshchikov @ 2021-11-07 13:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Alexei Starovoitov, syzbot, John Stultz, LKML, sboyd,
	syzkaller-bugs, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Alexei Starovoitov, Daniel Borkmann, bpf

On Sun, Nov 07, 2021 at 11:32:07AM +0100, Thomas Gleixner wrote:
> > 2) bpf_spin_lock/unlock have notrace attribute set.
> 
> How is that supposed to help?
> 
> You cannot take a spinlock from NMI context if that same lock can be
> taken by other contexts as well.
> 
> Also notrace on the public function is not guaranteeing that the inlines
> (as defined) are not traceable and it does not exclude it from being
> kprobed.
> 
> > 3) bpf_timer_* helpers fail early if they are in NMI.
> >
> > Why they have to be excluded?
> 
> Because timers take locks and you can just end up in the very same
> situation that you create invers lock dependencies or deadlocks when you
> use that from a tracepoint.
> 
> hrtimer_start()
>   lock_base();
>   trace_hrtimer...()
>     perf_event()
>       bpf_run()
>         bpf_timer_start()
>           hrtimer_start()
>             lock_base()         <- DEADLOCK
> 
> Tracepoints and perf events are very limited in what they can actually
> do. Just because it's BPF these rules are not magically going away.
> 

Thanks for the clarification.


-- 

Dmitrii Banshchikov

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-11-07 13:51 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <00000000000013aebd05cff8e064@google.com>
     [not found] ` <87lf224uki.ffs@tglx>
2021-11-05 15:53   ` [syzbot] possible deadlock in ktime_get_coarse_ts64 Alexei Starovoitov
2021-11-05 17:03     ` Dmitrii Banshchikov
2021-11-05 17:24       ` Thomas Gleixner
2021-11-06 20:07         ` Dmitrii Banshchikov
2021-11-07 10:32           ` Thomas Gleixner
2021-11-07 13:51             ` Dmitrii Banshchikov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).