bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Joanne Koong <joannekoong@fb.com>,
	Martin KaFai Lau <kafai@fb.com>, bpf <bpf@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Mon, 27 Sep 2021 14:14:09 -0700	[thread overview]
Message-ID: <CAEf4Bzb7bASQ3jXJ3JiKBinnb4ct9Y5pAhn-eVsdCY7rRus8Fg@mail.gmail.com> (raw)
In-Reply-To: <20210927164110.gg33uypguty45huk@ast-mbp.dhcp.thefacebook.com>

On Mon, Sep 27, 2021 at 9:41 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Sep 24, 2021 at 04:12:11PM -0700, Andrii Nakryiko wrote:
> >
> > That's not what I proposed. So let's say somewhere in the kernel we
> > have this variable:
> >
> > static int bpf_bloom_exists = 1;
> >
> > Now, for bpf_map_lookup_elem() helper we pass data as key pointer. If
> > all its hashed bits are set in Bloom filter (it "exists"), we return
> > &bpf_bloom_exists. So it's not a NULL pointer.
>
> imo that's too much of a hack.

too bad, because this feels pretty natural in BPF code:

int my_key = 1234;

if (bpf_map_lookup_elem(&my_bloom_filter, &my_key)) {
    /* handle "maybe exists" */
} else {
    /* handle "definitely doesn't exist" */
}

With map->value_size == 0 verifier should be able to ensure that
whatever non-NULL pointer is returned isn't readable/writable. We
might need to fix some BPF verifier internals if that's not supported
yet, I haven't checked thoroughly.

>
> > Now, despite returning a valid pointer, it would be good to prevent
> > reading/writing from/to it in a valid BPF program. I'm hoping it is as
> > easy as just seetting map->value_size = 0 during map creation. But
> > worst case, we can let BPF program just overwrite that 1 with whatever
> > they want. It doesn't matter because the contract is that
> > bpf_map_lookup_elem() returns non-NULL for "exists" and NULL
> > otherwise.
> >
> > Now, for MAP_LOOKUP_ELEM command on user-space side. I'd say the
> > contract should be 0 return code on "exists" (and nothing is written
> > to value pointer, perhaps even disallow to specify the non-NULL value
> > pointer altogether); -ENOENT, otherwise. Again, worst-case we can
> > specify that "value_size" is 1 or 4 bytes and write 1 to it.
> >
> > Does it make sense?
> >
> >
> > BTW, for STACK/QUEUE maps, I'm not clear why we had to add
> > map_peek_elem/map_pop_elem/map_push_elem operations, to be honest.
> > They could have been pretty reasonably mapped to map_lookup_elem (peek
> > is non-destructive lookup), map_lookup_elem + some flag (pop is
> > lookup, but destructive, so flag allows "destruction"), and
> > map_update_elem (for push). map_delete_elem would be pop for when
> > value can be ignored.
> >
> > That's to say that I don't consider STACK/QUEUE maps as good examples,
> > it's rather a counter-example of maps that barely anyone is using, yet
> > it just adds clutter to BPF internals.
>
> Repurposing lookup/update ops for stack/queue and forcing bpf progs
> to pass dummy key would have looked quite ugly.

The idea was to pass non-NULL *key* pointer. For bpf_map_update_elem()
you'd pass non-NULL key and NULL value. But it's fine, we have
peek/pop/push now, might as well use them. From user-space side it's
still mapped to LOOKUP/UPDATE/DELETE commands, right? BTW, seems like
we have LOOKUP_AND_DELETE command as well which is implemented for
STACK/QUEUE maps, but not BLOOM_FILTER. I guess for consistency we'd
have to implement that one for user-space as well?

> peek/pop/push have one pointer. That pointer points to value.
> For bitset we have single pointer as well.
> So it makes perfect sense to reuse push/pop/peek and keep bitset
> as a value from the verifier side.

Ok.

> bpf_helper_defs.h could have static inline functions:
> bool bpf_bitset_clear(map, key);
> bool bpf_bitset_set(map, key);
> bool bpf_bitset_test(map, key);
>
> But they will map to FUNC_map_pop_elem/push/peek as func ids
> and will be seen as values from the verifier pov.
>
> The bpf progs might think of them as keys, but they will be value_size.
> The bitset creation could use key_size as an input, but internally
> set it as value_size.
> Not sure whether such internal vs uapi remaping will not lead
> to confusion and bugs though.

I think it will and I wouldn't do it.

> I agree that key as an input to bitset_clear/set/test make more sense,
> but the verifier needs value_size to plug into peek/pop infra.

Which is why I initially proposed to not use peek/pop infra and
instead do lookup/update/delete, but if a NULL value pointer passed
into bpf_map_update_elem() disqualifies this, then it's fine. But I
didn't ask to repurpose peek/pop for working with key pointers.

>
> I don't think it's worth to extend ops with yet another 3 callbacks
> and have clean ARG_PTR_TO_UNINIT_MAP_KEY there.
> That is probably too much clutter.
>
> I think
> bool bpf_bitset_clear(map, value);
> bool bpf_bitset_set(map, value);
> bool bpf_bitset_test(map, value);
> doesn't look too bad either.
> At least this way the bit set map_create op will use value_size
> and keep it as value_size. The bpf prog won't be confusing.
> That's my preference, so far.

  reply	other threads:[~2021-09-27 21:14 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21 21:02 [PATCH v3 bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-21 23:44   ` Andrii Nakryiko
2021-09-22 19:06     ` Joanne Koong
2021-09-22 19:38       ` Martin KaFai Lau
2021-09-22 20:52         ` Andrii Nakryiko
2021-09-22 22:08           ` Martin KaFai Lau
2021-09-22 23:07             ` Andrii Nakryiko
2021-09-23  1:28               ` Martin KaFai Lau
2021-09-23 18:42                 ` Andrii Nakryiko
2021-09-23 19:42                   ` Martin KaFai Lau
2021-09-23 20:30                     ` Alexei Starovoitov
2021-09-23 21:12                       ` Andrii Nakryiko
2021-09-23 22:28                         ` Joanne Koong
2021-09-23 23:46                           ` Martin KaFai Lau
2021-09-24  2:23                           ` Andrii Nakryiko
2021-09-24 16:32                             ` Joanne Koong
2021-09-24 23:12                               ` Andrii Nakryiko
2021-09-27 16:41                                 ` Alexei Starovoitov
2021-09-27 21:14                                   ` Andrii Nakryiko [this message]
2021-09-27 23:51                                     ` Alexei Starovoitov
2021-09-28  0:36                                       ` Andrii Nakryiko
2021-09-28 16:21                                         ` Alexei Starovoitov
     [not found]                                           ` <aa967ed2-a958-f995-3a09-bbd6b6e775d4@fb.com>
2021-09-28 23:54                                             ` Andrii Nakryiko
2021-09-29  1:54                                               ` Joanne Koong
2021-09-29  0:14                                           ` Andrii Nakryiko
2021-09-29  3:17                                             ` Alexei Starovoitov
2021-09-29  3:38                                               ` Joanne Koong
2021-09-28  1:09                                 ` Martin KaFai Lau
2021-09-22 20:44       ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-21 22:24   ` Joanne Koong
2021-09-22 23:14     ` Andrii Nakryiko
2021-09-21 23:57   ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4Bzb7bASQ3jXJ3JiKBinnb4ct9Y5pAhn-eVsdCY7rRus8Fg@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannekoong@fb.com \
    --cc=kafai@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).