bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Alexei Starovoitov <ast@fb.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	<davem@davemloft.net>
Cc: <daniel@iogearbox.net>, <andrii@kernel.org>,
	<netdev@vger.kernel.org>, <bpf@vger.kernel.org>,
	<kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 1/8] bpf: Introduce bpf timers.
Date: Fri, 25 Jun 2021 08:54:55 -0700	[thread overview]
Message-ID: <cfa10fa1-9ee4-95de-109d-a24cd5d43a98@fb.com> (raw)
In-Reply-To: <de1204cc-8c20-0e09-8880-e39c9ee6d889@fb.com>



On 6/25/21 7:57 AM, Alexei Starovoitov wrote:
> On 6/24/21 11:25 PM, Yonghong Song wrote:
>>
>>> +
>>> +    ____bpf_spin_lock(&timer->lock);
>>
>> I think we may still have some issues.
>> Case 1:
>>    1. one bpf program is running in process context,
>>       bpf_timer_start() is called and timer->lock is taken
>>    2. timer softirq is triggered and this callback is called
> 
> ___bpf_spin_lock is actually irqsave version of spin_lock.
> So this race is not possible.

Sorry I missed that ____bpf_spin_lock() has local_irq_save(),
so yes. the above situation cannot happen.

> 
>> Case 2:
>>    1. this callback is called, timer->lock is taken
>>    2. a nmi happens and some bpf program is called (kprobe, tracepoint,
>>       fentry/fexit or perf_event, etc.) and that program calls
>>       bpf_timer_start()
>>
>> So we could have deadlock in both above cases?
> 
> Shouldn't be possible either because bpf timers are not allowed
> in nmi-bpf-progs. I'll double check that it's the case.
> Pretty much the same restrictions are with bpf_spin_lock.

The patch added bpf_base_func_proto() to bpf_tracing_func_proto:

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 7a52bc172841..80f6e6dafd5e 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1057,7 +1057,7 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, 
const struct bpf_prog *prog)
  	case BPF_FUNC_snprintf:
  		return &bpf_snprintf_proto;
  	default:
-		return NULL;
+		return bpf_base_func_proto(func_id);
  	}
  }

and timer helpers are added to bpf_base_func_proto:
@@ -1055,6 +1330,12 @@ bpf_base_func_proto(enum bpf_func_id func_id)
  		return &bpf_per_cpu_ptr_proto;
  	case BPF_FUNC_this_cpu_ptr:
  		return &bpf_this_cpu_ptr_proto;
+	case BPF_FUNC_timer_init:
+		return &bpf_timer_init_proto;
+	case BPF_FUNC_timer_start:
+		return &bpf_timer_start_proto;
+	case BPF_FUNC_timer_cancel:
+		return &bpf_timer_cancel_proto;
  	default:
  		break;
  	}

static const struct bpf_func_proto *
pe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
         switch (func_id) {
...
         default:
                 return bpf_tracing_func_proto(func_id, prog);
         }
}

static const struct bpf_func_proto *
kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog 
*prog)
{
...
         default:
                 return bpf_tracing_func_proto(func_id, prog);
         }
}

Also, we have some functions inside ____bpf_spin_lock() e.g., 
bpf_prog_inc(), hrtimer_start(), etc. If we want to be absolutely safe,
we need to mark them not tracable for kprobe/kretprobe/fentry/fexit/...
But I am not sure whether this is really needed or not.

> 
>>
>>> +    /* callback_fn and prog need to match. They're updated together
>>> +     * and have to be read under lock.
>>> +     */
>>> +    prog = t->prog;
>>> +    callback_fn = t->callback_fn;
>>> +
>>> +    /* wrap bpf subprog invocation with prog->refcnt++ and -- to make
>>> +     * sure that refcnt doesn't become zero when subprog is executing.
>>> +     * Do it under lock to make sure that bpf_timer_start doesn't drop
>>> +     * prev prog refcnt to zero before timer_cb has a chance to bump 
>>> it.
>>> +     */
>>> +    bpf_prog_inc(prog);
>>> +    ____bpf_spin_unlock(&timer->lock);
>>> +
>>> +    /* bpf_timer_cb() runs in hrtimer_run_softirq. It doesn't 
>>> migrate and
>>> +     * cannot be preempted by another bpf_timer_cb() on the same cpu.
>>> +     * Remember the timer this callback is servicing to prevent
>>> +     * deadlock if callback_fn() calls bpf_timer_cancel() on the 
>>> same timer.
>>> +     */
>>> +    this_cpu_write(hrtimer_running, t);
>>
>> This is not protected by spinlock, in bpf_timer_cancel() and
>> bpf_timer_cancel_and_free(), we have spinlock protected read, so
>> there is potential race conditions if callback function and 
>> helper/bpf_timer_cancel_and_free run in different context?
> 
> what kind of race do you see?
> This is per-cpu var and bpf_timer_cb is in softirq
> while timer_cancel/cancel_and_free are calling it under
> spin_lock_irqsave... so they cannot race because softirq
> and bpf_timer_cb will run after start/canel/cancel_free
> will do unlock_irqrestore.

Again, I missed local_irq_save(). With irqsave, this indeed
won't happen. The same for a few comments below.

> 
>>> +    prev = t->prog;
>>> +    if (prev != prog) {
>>> +        if (prev)
>>> +            /* Drop pref prog refcnt when swapping with new prog */
>>
>> pref -> prev
>>
>>> +            bpf_prog_put(prev);
>>
>> Maybe we want to put the above two lines with {}?
> 
> you mean add {} because there is a comment ?
> I don't think the kernel coding style considers comment as a statement.
> 
>>> +    if (this_cpu_read(hrtimer_running) != t)
>>> +        hrtimer_cancel(&t->timer);
>>
>> We could still have race conditions here when 
>> bpf_timer_cancel_and_free() runs in process context and callback in
>> softirq context. I guess we might be okay.
> 
> No, since this check is under spin_lock_irsave.
> 
>> But if bpf_timer_cancel_and_free() in nmi context, not 100% sure
>> whether we have issues or not.
> 
> timers shouldn't be available to nmi-bpf progs.
> There will be all sorts of issues.
> The underlying hrtimer implementation cannot deal with nmi either.

  reply	other threads:[~2021-06-25 15:55 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-24  2:25 [PATCH v3 bpf-next 0/8] bpf: Introduce BPF timers Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 1/8] bpf: Introduce bpf timers Alexei Starovoitov
2021-06-25  6:25   ` Yonghong Song
2021-06-25 14:57     ` Alexei Starovoitov
2021-06-25 15:54       ` Yonghong Song [this message]
2021-06-29  1:39         ` Alexei Starovoitov
2021-06-25 16:54   ` Yonghong Song
2021-06-29  1:46     ` Alexei Starovoitov
2021-06-29  2:24       ` Yonghong Song
2021-06-29  3:32         ` Alexei Starovoitov
2021-06-29  6:34       ` Andrii Nakryiko
2021-06-29 13:28         ` Alexei Starovoitov
2021-06-30 10:08           ` Andrii Nakryiko
2021-06-30 17:38             ` Alexei Starovoitov
2021-07-01  5:40   ` Alexei Starovoitov
2021-07-01 11:51     ` Toke Høiland-Jørgensen
2021-07-01 15:34       ` Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 2/8] bpf: Add map side support for " Alexei Starovoitov
2021-06-25 19:46   ` Yonghong Song
2021-06-29  1:49     ` Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 3/8] bpf: Remember BTF of inner maps Alexei Starovoitov
2021-06-29  1:45   ` Yonghong Song
2021-06-24  2:25 ` [PATCH v3 bpf-next 4/8] bpf: Relax verifier recursion check Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 5/8] bpf: Implement verifier support for validation of async callbacks Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 6/8] bpf: Teach stack depth check about " Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 7/8] selftests/bpf: Add bpf_timer test Alexei Starovoitov
2021-06-24  2:25 ` [PATCH v3 bpf-next 8/8] selftests/bpf: Add a test with bpf_timer in inner map Alexei Starovoitov
2021-06-24 11:27 ` [PATCH v3 bpf-next 0/8] bpf: Introduce BPF timers Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cfa10fa1-9ee4-95de-109d-a24cd5d43a98@fb.com \
    --to=yhs@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).