All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Joanne Koong <joannekoong@fb.com>,
	Martin KaFai Lau <kafai@fb.com>, bpf <bpf@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Mon, 27 Sep 2021 14:14:09 -0700	[thread overview]
Message-ID: <CAEf4Bzb7bASQ3jXJ3JiKBinnb4ct9Y5pAhn-eVsdCY7rRus8Fg@mail.gmail.com> (raw)
In-Reply-To: <20210927164110.gg33uypguty45huk@ast-mbp.dhcp.thefacebook.com>

On Mon, Sep 27, 2021 at 9:41 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Sep 24, 2021 at 04:12:11PM -0700, Andrii Nakryiko wrote:
> >
> > That's not what I proposed. So let's say somewhere in the kernel we
> > have this variable:
> >
> > static int bpf_bloom_exists = 1;
> >
> > Now, for bpf_map_lookup_elem() helper we pass data as key pointer. If
> > all its hashed bits are set in Bloom filter (it "exists"), we return
> > &bpf_bloom_exists. So it's not a NULL pointer.
>
> imo that's too much of a hack.

too bad, because this feels pretty natural in BPF code:

int my_key = 1234;

if (bpf_map_lookup_elem(&my_bloom_filter, &my_key)) {
    /* handle "maybe exists" */
} else {
    /* handle "definitely doesn't exist" */
}

With map->value_size == 0 verifier should be able to ensure that
whatever non-NULL pointer is returned isn't readable/writable. We
might need to fix some BPF verifier internals if that's not supported
yet, I haven't checked thoroughly.

>
> > Now, despite returning a valid pointer, it would be good to prevent
> > reading/writing from/to it in a valid BPF program. I'm hoping it is as
> > easy as just seetting map->value_size = 0 during map creation. But
> > worst case, we can let BPF program just overwrite that 1 with whatever
> > they want. It doesn't matter because the contract is that
> > bpf_map_lookup_elem() returns non-NULL for "exists" and NULL
> > otherwise.
> >
> > Now, for MAP_LOOKUP_ELEM command on user-space side. I'd say the
> > contract should be 0 return code on "exists" (and nothing is written
> > to value pointer, perhaps even disallow to specify the non-NULL value
> > pointer altogether); -ENOENT, otherwise. Again, worst-case we can
> > specify that "value_size" is 1 or 4 bytes and write 1 to it.
> >
> > Does it make sense?
> >
> >
> > BTW, for STACK/QUEUE maps, I'm not clear why we had to add
> > map_peek_elem/map_pop_elem/map_push_elem operations, to be honest.
> > They could have been pretty reasonably mapped to map_lookup_elem (peek
> > is non-destructive lookup), map_lookup_elem + some flag (pop is
> > lookup, but destructive, so flag allows "destruction"), and
> > map_update_elem (for push). map_delete_elem would be pop for when
> > value can be ignored.
> >
> > That's to say that I don't consider STACK/QUEUE maps as good examples,
> > it's rather a counter-example of maps that barely anyone is using, yet
> > it just adds clutter to BPF internals.
>
> Repurposing lookup/update ops for stack/queue and forcing bpf progs
> to pass dummy key would have looked quite ugly.

The idea was to pass non-NULL *key* pointer. For bpf_map_update_elem()
you'd pass non-NULL key and NULL value. But it's fine, we have
peek/pop/push now, might as well use them. From user-space side it's
still mapped to LOOKUP/UPDATE/DELETE commands, right? BTW, seems like
we have LOOKUP_AND_DELETE command as well which is implemented for
STACK/QUEUE maps, but not BLOOM_FILTER. I guess for consistency we'd
have to implement that one for user-space as well?

> peek/pop/push have one pointer. That pointer points to value.
> For bitset we have single pointer as well.
> So it makes perfect sense to reuse push/pop/peek and keep bitset
> as a value from the verifier side.

Ok.

> bpf_helper_defs.h could have static inline functions:
> bool bpf_bitset_clear(map, key);
> bool bpf_bitset_set(map, key);
> bool bpf_bitset_test(map, key);
>
> But they will map to FUNC_map_pop_elem/push/peek as func ids
> and will be seen as values from the verifier pov.
>
> The bpf progs might think of them as keys, but they will be value_size.
> The bitset creation could use key_size as an input, but internally
> set it as value_size.
> Not sure whether such internal vs uapi remaping will not lead
> to confusion and bugs though.

I think it will and I wouldn't do it.

> I agree that key as an input to bitset_clear/set/test make more sense,
> but the verifier needs value_size to plug into peek/pop infra.

Which is why I initially proposed to not use peek/pop infra and
instead do lookup/update/delete, but if a NULL value pointer passed
into bpf_map_update_elem() disqualifies this, then it's fine. But I
didn't ask to repurpose peek/pop for working with key pointers.

>
> I don't think it's worth to extend ops with yet another 3 callbacks
> and have clean ARG_PTR_TO_UNINIT_MAP_KEY there.
> That is probably too much clutter.
>
> I think
> bool bpf_bitset_clear(map, value);
> bool bpf_bitset_set(map, value);
> bool bpf_bitset_test(map, value);
> doesn't look too bad either.
> At least this way the bit set map_create op will use value_size
> and keep it as value_size. The bpf prog won't be confusing.
> That's my preference, so far.

  reply	other threads:[~2021-09-27 21:14 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21 21:02 [PATCH v3 bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-21 23:44   ` Andrii Nakryiko
2021-09-22 19:06     ` Joanne Koong
2021-09-22 19:38       ` Martin KaFai Lau
2021-09-22 20:52         ` Andrii Nakryiko
2021-09-22 22:08           ` Martin KaFai Lau
2021-09-22 23:07             ` Andrii Nakryiko
2021-09-23  1:28               ` Martin KaFai Lau
2021-09-23 18:42                 ` Andrii Nakryiko
2021-09-23 19:42                   ` Martin KaFai Lau
2021-09-23 20:30                     ` Alexei Starovoitov
2021-09-23 21:12                       ` Andrii Nakryiko
2021-09-23 22:28                         ` Joanne Koong
2021-09-23 23:46                           ` Martin KaFai Lau
2021-09-24  2:23                           ` Andrii Nakryiko
2021-09-24 16:32                             ` Joanne Koong
2021-09-24 23:12                               ` Andrii Nakryiko
2021-09-27 16:41                                 ` Alexei Starovoitov
2021-09-27 21:14                                   ` Andrii Nakryiko [this message]
2021-09-27 23:51                                     ` Alexei Starovoitov
2021-09-28  0:36                                       ` Andrii Nakryiko
2021-09-28 16:21                                         ` Alexei Starovoitov
     [not found]                                           ` <aa967ed2-a958-f995-3a09-bbd6b6e775d4@fb.com>
2021-09-28 23:54                                             ` Andrii Nakryiko
2021-09-29  1:54                                               ` Joanne Koong
2021-09-29  0:14                                           ` Andrii Nakryiko
2021-09-29  3:17                                             ` Alexei Starovoitov
2021-09-29  3:38                                               ` Joanne Koong
2021-09-28  1:09                                 ` Martin KaFai Lau
2021-09-22 20:44       ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-21 22:24   ` Joanne Koong
2021-09-22 23:14     ` Andrii Nakryiko
2021-09-21 23:57   ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4Bzb7bASQ3jXJ3JiKBinnb4ct9Y5pAhn-eVsdCY7rRus8Fg@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannekoong@fb.com \
    --cc=kafai@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.