All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Martin KaFai Lau <kafai@fb.com>
Cc: davem@davemloft.net, daniel@iogearbox.net, andrii@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v5 bpf-next 04/11] bpf: Add map side support for bpf timers.
Date: Thu, 8 Jul 2021 20:52:23 -0700	[thread overview]
Message-ID: <20210709035223.s2ni6phkdajhdg2i@ast-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <20210709015119.l5kxp5kao24bjft7@kafai-mbp.dhcp.thefacebook.com>

On Thu, Jul 08, 2021 at 06:51:19PM -0700, Martin KaFai Lau wrote:
> > +
> >  /* Called when map->refcnt goes to zero, either from workqueue or from syscall */
> >  static void array_map_free(struct bpf_map *map)
> >  {
> > @@ -382,6 +402,7 @@ static void array_map_free(struct bpf_map *map)
> >  	if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
> >  		bpf_array_free_percpu(array);
> >  
> > +	array_map_free_timers(map);
> array_map_free() is called when map->refcnt reached 0.
> By then, map->usercnt should have reached 0 before
> and array_map_free_timers() should have already been called,
> so no need to call it here again?  The same goes for hashtab.

Not sure it's that simple.
Currently map->usercnt > 0 check is done for bpf_timer_set_callback only,
because prog refcnting is what matters to usercnt and map_release_uref scheme.
bpf_map_init doesn't have this check because there is no circular dependency
prog->map->timer->prog to worry about.
So after usercnt reached zero the prog can still do bpf_timer_init.
I guess we can add usercnt > 0 to bpf_timer_init as well.
Need to think whether it's enough and the race between atomic64_read(usercnt)
and atomic64_dec_and_test(usercnt) is addressed the same way as the race
in set_callback and cancel_and_free. So far looks like it. Hmm.

> > +static int btf_find_field(const struct btf *btf, const struct btf_type *t,
> > +			  const char *name, int sz, int align)
> > +{
> > +
> > +	if (__btf_type_is_struct(t))
> > +		return btf_find_struct_field(btf, t, name, sz, align);
> > +	else if (btf_type_is_datasec(t))
> > +		return btf_find_datasec_var(btf, t, name, sz, align);
> iiuc, a global struct bpf_timer is not supported.  I am not sure
> why it needs to find the timer in datasec here. or it meant to error out
> and potentially give a verifier log?  I don't see where is the verifier
> reporting error though.

yes. Initially I coded it up to support timers in the global data, but
later Andrii convinced me that single global timer could be surprising to
users unless I hack it up in libbpf to create a bunch of global data maps
one map for each timer and teach libbpf to avoid doing mmap on them.
That's too hacky and can be done later.
So the datasec parsing code stayed only to have a meaningful error
in the verifier if the user wrote a program with global timer.
Without this code the verifier error:
 24: (85) call bpf_timer_init#169
 map 'my_test.bss' is not a struct type or bpf_timer is mangled
With:
 24: (85) call bpf_timer_init#169
 timer pointer in R1 map_uid=0 doesn't match map pointer in R2 map_uid=0
Imo that's much clearer. Since to use global bpf_timer_init() the
user would have to pass some map value in R2
and it wouldn't match the timer in R1.
So I prefer to keep this btf_find_datasec_var() code.

> 
> > +static void htab_free_malloced_timers(struct bpf_htab *htab)
> > +{
> > +	int i;
> > +
> > +	rcu_read_lock();
> > +	for (i = 0; i < htab->n_buckets; i++) {
> > +		struct hlist_nulls_head *head = select_bucket(htab, i);
> > +		struct hlist_nulls_node *n;
> > +		struct htab_elem *l;
> > +
> > +		hlist_nulls_for_each_entry(l, n, head, hash_node)
> need the _rcu() variant here.

argh. right.

> May be put rcu_read_lock/unlock() in the loop and do a
> cond_resched() in case the hashtab is large.

Feels a bit like premature optimization. delete_all_elements()
loop right above is doing similar work without cond_resched.
I don't mind cond_resched. I just don't see how to cleanly add it
without breaking rcu_read_lock and overcomplicating the code.

  reply	other threads:[~2021-07-09  3:52 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-08  1:18 [PATCH v5 bpf-next 00/11] bpf: Introduce BPF timers Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 01/11] bpf: Prepare bpf_prog_put() to be called from irq context Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 02/11] bpf: Factor out bpf_spin_lock into helpers Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 03/11] bpf: Introduce bpf timers Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 04/11] bpf: Add map side support for " Alexei Starovoitov
2021-07-09  1:51   ` Martin KaFai Lau
2021-07-09  3:52     ` Alexei Starovoitov [this message]
2021-07-09  6:04       ` Martin KaFai Lau
2021-07-09  7:00         ` Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 05/11] bpf: Prevent pointer mismatch in bpf_timer_init Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 06/11] bpf: Remember BTF of inner maps Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 07/11] bpf: Relax verifier recursion check Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 08/11] bpf: Implement verifier support for validation of async callbacks Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 09/11] bpf: Teach stack depth check about " Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 10/11] selftests/bpf: Add bpf_timer test Alexei Starovoitov
2021-07-08  1:18 ` [PATCH v5 bpf-next 11/11] selftests/bpf: Add a test with bpf_timer in inner map Alexei Starovoitov
2021-07-09  1:59 ` [PATCH v5 bpf-next 00/11] bpf: Introduce BPF timers Martin KaFai Lau
2021-07-09  3:52   ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210709035223.s2ni6phkdajhdg2i@ast-mbp.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.