All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v6 bpf-next 03/11] bpf: Introduce bpf timers.
Date: Wed, 14 Jul 2021 16:59:53 -0700	[thread overview]
Message-ID: <CAEf4BzbJb3q0=LwHs_JXXB2a7wsY=rCF7E+nxsM2SgcC6KK8jA@mail.gmail.com> (raw)
In-Reply-To: <20210714010519.37922-4-alexei.starovoitov@gmail.com>

On Tue, Jul 13, 2021 at 6:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
> in hash/array/lru maps as a regular field and helpers to operate on it:
>
> // Initialize the timer.
> // First 4 bits of 'flags' specify clockid.
> // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
> long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags);
>
> // Configure the timer to call 'callback_fn' static function.
> long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);
>
> // Arm the timer to expire 'nsec' nanoseconds from the current time.
> long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags);
>
> // Cancel the timer and wait for callback_fn to finish if it was running.
> long bpf_timer_cancel(struct bpf_timer *timer);
>
> Here is how BPF program might look like:
> struct map_elem {
>     int counter;
>     struct bpf_timer timer;
> };
>
> struct {
>     __uint(type, BPF_MAP_TYPE_HASH);
>     __uint(max_entries, 1000);
>     __type(key, int);
>     __type(value, struct map_elem);
> } hmap SEC(".maps");
>
> static int timer_cb(void *map, int *key, struct map_elem *val);
> /* val points to particular map element that contains bpf_timer. */
>
> SEC("fentry/bpf_fentry_test1")
> int BPF_PROG(test1, int a)
> {
>     struct map_elem *val;
>     int key = 0;
>
>     val = bpf_map_lookup_elem(&hmap, &key);
>     if (val) {
>         bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME);
>         bpf_timer_set_callback(&val->timer, timer_cb);
>         bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 usec */, 0);
>     }
> }
>
> This patch adds helper implementations that rely on hrtimers
> to call bpf functions as timers expire.
> The following patches add necessary safety checks.
>
> Only programs with CAP_BPF are allowed to use bpf_timer.
>
> The amount of timers used by the program is constrained by
> the memcg recorded at map creation time.
>
> The bpf_timer_init() helper needs explicit 'map' argument because inner maps
> are dynamic and not known at load time. While the bpf_timer_set_callback() is
> receiving hidden 'aux->prog' argument supplied by the verifier.
>
> The prog pointer is needed to do refcnting of bpf program to make sure that
> program doesn't get freed while the timer is armed. This approach relies on
> "user refcnt" scheme used in prog_array that stores bpf programs for
> bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is
> paired with bpf_timer_cancel() that will drop the prog refcnt. The
> ops->map_release_uref is responsible for cancelling the timers and dropping
> prog refcnt when user space reference to a map reaches zero.
> This uref approach is done to make sure that Ctrl-C of user space process will
> not leave timers running forever unless the user space explicitly pinned a map
> that contained timers in bpffs.
>
> bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if map doesn't
> have user references (is not held by open file descriptor from user space and
> not pinned in bpffs).
>
> The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
> and free the timer if given map element had it allocated.
> "bpftool map update" command can be used to cancel timers.
>
> The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because
> '__u64 :64' has 1 byte alignment of 8 byte padding.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> ---
>  include/linux/bpf.h            |   3 +
>  include/uapi/linux/bpf.h       |  73 ++++++++
>  kernel/bpf/helpers.c           | 325 +++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c          | 109 +++++++++++
>  kernel/trace/bpf_trace.c       |   2 +-
>  scripts/bpf_doc.py             |   2 +
>  tools/include/uapi/linux/bpf.h |  73 ++++++++
>  7 files changed, 586 insertions(+), 1 deletion(-)
>

[...]

> +       if (!atomic64_read(&(map->usercnt))) {
> +               /* maps with timers must be either held by user space
> +                * or pinned in bpffs.
> +                */
> +               ret = -EPERM;
> +               goto out;
> +       }
> +       /* allocate hrtimer via map_kmalloc to use memcg accounting */
> +       t = bpf_map_kmalloc_node(map, sizeof(*t), GFP_ATOMIC, NUMA_NO_NODE);

I wonder if it would make sense to use map->numa_node here to keep map
value and timer data in the same NUMA node?

> +       if (!t) {
> +               ret = -ENOMEM;
> +               goto out;
> +       }

[...]

> +
> +/* This function is called by map_delete/update_elem for individual element.
> + * By ops->map_release_uref when the user space reference to a map reaches zero
> + * and by ops->map_free when the kernel reference reaches zero.

is ops->map_free part still valid?

> + */
> +void bpf_timer_cancel_and_free(void *val)
> +{
> +       struct bpf_timer_kern *timer = val;
> +       struct bpf_hrtimer *t;
> +

[...]

  reply	other threads:[~2021-07-15  0:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-14  1:05 [PATCH v6 bpf-next 00/11] bpf: Introduce BPF timers Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 01/11] bpf: Prepare bpf_prog_put() to be called from irq context Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 02/11] bpf: Factor out bpf_spin_lock into helpers Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 03/11] bpf: Introduce bpf timers Alexei Starovoitov
2021-07-14 23:59   ` Andrii Nakryiko [this message]
2021-07-15  0:43     ` Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 04/11] bpf: Add map side support for " Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 05/11] bpf: Prevent pointer mismatch in bpf_timer_init Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 06/11] bpf: Remember BTF of inner maps Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 07/11] bpf: Relax verifier recursion check Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 08/11] bpf: Implement verifier support for validation of async callbacks Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 09/11] bpf: Teach stack depth check about " Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 10/11] selftests/bpf: Add bpf_timer test Alexei Starovoitov
2021-07-14  1:05 ` [PATCH v6 bpf-next 11/11] selftests/bpf: Add a test with bpf_timer in inner map Alexei Starovoitov
2021-07-14 23:59 ` [PATCH v6 bpf-next 00/11] bpf: Introduce BPF timers Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzbJb3q0=LwHs_JXXB2a7wsY=rCF7E+nxsM2SgcC6KK8jA@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.