netdev.vger.kernel.org archive mirror
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 1/3] bpf: Introduce bpf_timer
Date: Mon, 14 Jun 2021 21:48:58 -0700
Message-ID: <CAEf4BzYo9HioS1BumMMTaKX7QpgC8D2SrQbrESfcbfqpvesXhw@mail.gmail.com>
In-Reply-To: <20210611042442.65444-2-alexei.starovoitov@gmail.com>

On Thu, Jun 10, 2021 at 9:24 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
> in hash/array/lru maps as regular field and helpers to operate on it:
>
> // Initialize the timer to call 'callback_fn' static function
> // First 4 bits of 'flags' specify clockid.
> // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
> long bpf_timer_init(struct bpf_timer *timer, void *callback_fn, int flags);
>
> // Start the timer and set its expiration 'nsec' nanoseconds from the current time.
> long bpf_timer_start(struct bpf_timer *timer, u64 nsec);
>
> // Cancel the timer and wait for callback_fn to finish if it was running.
> long bpf_timer_cancel(struct bpf_timer *timer);
>
> Here is how a BPF program might look:
> struct map_elem {
>     int counter;
>     struct bpf_timer timer;
> };
>
> struct {
>     __uint(type, BPF_MAP_TYPE_HASH);
>     __uint(max_entries, 1000);
>     __type(key, int);
>     __type(value, struct map_elem);
> } hmap SEC(".maps");
>
> static int timer_cb(void *map, int *key, struct map_elem *val);
> /* val points to particular map element that contains bpf_timer. */
>
> SEC("fentry/bpf_fentry_test1")
> int BPF_PROG(test1, int a)
> {
>     struct map_elem *val;
>     int key = 0;
>
>     val = bpf_map_lookup_elem(&hmap, &key);
>     if (val) {
>         bpf_timer_init(&val->timer, timer_cb, CLOCK_REALTIME);
>         bpf_timer_start(&val->timer, 1000 /* call timer_cb in 1 usec */);
>     }
>     return 0;
> }
>
> This patch adds helper implementations that rely on hrtimers
> to call bpf functions as timers expire.
> The following patch adds necessary safety checks.
>
> Only programs with CAP_BPF are allowed to use bpf_timer.
>
> The number of timers used by the program is constrained by
> the memcg recorded at map creation time.
>
> The bpf_timer_init() helper is receiving hidden 'map' and 'prog' arguments
> supplied by the verifier. The prog pointer is needed to do refcnting of bpf
> program to make sure that program doesn't get freed while timer is armed.
>
> The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
> and free the timer if given map element had it allocated.
> "bpftool map update" command can be used to cancel timers.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Looks great!

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  include/linux/bpf.h            |   2 +
>  include/uapi/linux/bpf.h       |  40 ++++++
>  kernel/bpf/helpers.c           | 227 +++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c          | 109 ++++++++++++++++
>  kernel/trace/bpf_trace.c       |   2 +-
>  scripts/bpf_doc.py             |   2 +
>  tools/include/uapi/linux/bpf.h |  40 ++++++
>  7 files changed, 421 insertions(+), 1 deletion(-)
>

[...]

> + *
> + * long bpf_timer_init(struct bpf_timer *timer, void *callback_fn, int flags)
> + *     Description
> + *             Initialize the timer to call *callback_fn* static function.
> + *             First 4 bits of *flags* specify clockid. Only CLOCK_MONOTONIC,
> + *             CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
> + *             All other bits of *flags* are reserved.
> + *     Return
> + *             0 on success.
> + *             **-EBUSY** if *timer* is already initialized.
> + *             **-EINVAL** if invalid *flags* are passed.
> + *
> + * long bpf_timer_start(struct bpf_timer *timer, u64 nsecs)
> + *     Description
> + *             Start the timer and set its expiration N nanoseconds from the
> + *             current time. The timer callback_fn will be invoked in soft irq
> + *             context on some cpu and will not repeat unless another
> + *             bpf_timer_start() is made. In such case the next invocation can
> + *             migrate to a different cpu.

This is a nice description, thanks.

> + *     Return
> + *             0 on success.
> + *             **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier.
> + *
> + * long bpf_timer_cancel(struct bpf_timer *timer)
> + *     Description
> + *             Cancel the timer and wait for callback_fn to finish if it was running.
> + *     Return
> + *             0 if the timer was not active.
> + *             1 if the timer was active.
> + *             **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier.
> + *             **-EDEADLK** if callback_fn tried to call bpf_timer_cancel() on its own timer
> + *             which would have led to a deadlock otherwise.
>   */

[...]

> +       ret = BPF_CAST_CALL(t->callback_fn)((u64)(long)map,
> +                                           (u64)(long)key,
> +                                           (u64)(long)t->value, 0, 0);
> +       WARN_ON(ret != 0); /* Next patch disallows 1 in the verifier */
> +
> +       /* The bpf function finished executed. Drop the prog refcnt.

typo: "executed" should be "execution"

> +        * It could reach zero here and trigger free of bpf_prog
> +        * and subsequent free of the maps that were holding timers.
> +        * If callback_fn called bpf_timer_start on this timer
> +        * the prog refcnt will be > 0.
> +        *
> +        * If callback_fn deleted map element the 't' could have been freed,
> +        * hence t->prog deref is done earlier.
> +        */
> +       bpf_prog_put(prog);
> +       this_cpu_write(hrtimer_running, NULL);
> +       return HRTIMER_NORESTART;
> +}
> +
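
To illustrate the -EDEADLK case from the helper description together with the
hrtimer_running bookkeeping quoted above, here is a toy userspace model. It is
a sketch under stated assumptions, not the patch's implementation: struct
sketch_timer, running_timer, sketch_timer_cancel(), sketch_timer_fire() and
cancel_self_cb() are all hypothetical, a single global stands in for the
per-CPU variable, and the order of the checks is my guess.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a timer; not the kernel's layout. */
struct sketch_timer {
	bool initialized;
	bool active;
};

/* Models the per-CPU 'hrtimer_running' pointer with one global ("one CPU"). */
static struct sketch_timer *running_timer;

/* Mirrors the documented return values of bpf_timer_cancel(). */
static int sketch_timer_cancel(struct sketch_timer *t)
{
	int was_active;

	if (!t->initialized)
		return -EINVAL;
	/* Waiting for our own callback to finish would never return. */
	if (running_timer == t)
		return -EDEADLK;

	was_active = t->active;
	t->active = false;
	return was_active; /* 1 if the timer was active, 0 otherwise */
}

/* Runs 'cb' as if the timer had just expired and returns cb's result. */
static int sketch_timer_fire(struct sketch_timer *t,
			     int (*cb)(struct sketch_timer *))
{
	int ret;

	t->active = false;
	running_timer = t;
	ret = cb(t);
	running_timer = NULL; /* like this_cpu_write(hrtimer_running, NULL) */
	return ret;
}

/* Example callback that tries to cancel the timer it is running from. */
static int cancel_self_cb(struct sketch_timer *t)
{
	return sketch_timer_cancel(t);
}
```

In this model, cancelling from outside the callback returns 1 or 0 depending
on whether the timer was armed, while cancel_self_cb() gets -EDEADLK because
the "running" pointer still refers to its own timer.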

[...]
