bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin KaFai Lau <kafai@fb.com>
To: Joanne Koong <joannekoong@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	bpf <bpf@vger.kernel.org>, Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Thu, 23 Sep 2021 16:46:02 -0700	[thread overview]
Message-ID: <20210923234602.3pxuw3kzdayc675j@kafai-mbp> (raw)
In-Reply-To: <7957a053-8b98-1e09-26c8-882df6920e6e@fb.com>

On Thu, Sep 23, 2021 at 03:28:01PM -0700, Joanne Koong wrote:
> > > As far as map based bloom filter I think it can combine bitset
> > > and bloomfilter features into one. delete_elem from user space
> > > can be mapped into pop() to clear bits.
> > > Some special value of nr_hashes could mean that no hashing
> > > is applied and 4 or 8 byte key gets modulo of max_entries
> > > and treated as a bit index. Both bpf prog and user space will
> > > have uniform access into such bitset. With nr_hashes >= 1
> > > it will become a bloom filter.
> > > In that sense may be max_entries should be specified in bits
> > > and the map is called bitset. With nr_hashes >= 1 the kernel
> > > would accept key_size > 8 and convert it to bloom filter
> > > peek/pop/push. In other words
> > > nr_hash == 0 bit_idx == key for set/read/clear
> > > nr_hashes >= 1 bit_idx[1..N] = hash(key, N) for set/read/clear.
I like this bitset+nr_hash semantic, then max_entries logcially follows
the number of bits.

> > If we do the map, though, regardless if it's bitset or bloom
> > specifically. Maybe we should consider modeling as actual
> > bpf_map_lookup_elem(), where the key is a pointer to whatever we are
> > hashing and looking up? It makes much more sense, that's how people
> > model sets based on maps: key is the element you are looking up, value
> > is either true/false or meaningless (at least for me it felt much more
> > natural that you are looking up by key, not by value). In this case,
> > what if on successful lookup we return a pointer to some fixed
> > u8/u32/u64 location in the kernel, some dedicated static variable
> > shared between all maps. So NULL means "element is not in a set",
> > non-NULL means it is in the set.
> I think this would then also require that the bpf_map_update_elem() API from
> the userspace side would have to pass in a valid memory address for the
> "value".  I understand what you're saying though about it feeling more natural
> that the "key" is the element here; I agree but there doesn't seem to be a
> clean way of doing this - I think maybe one viable approach would be allowing
> map_update_elem to pass in a NULL value in the kernel if the map is a non-associative map,
> and refactoring the push_elem/peek_elem API so that the element can represent either the key or
> the value.
> > Ideally we'd prevent such element to
> > be written to, but it might be too hard to do that as just one
> > exception here, don't know.
I don't mind key or value also.  With nr_hash == 0 and key is
the bit_idx, it may be more correct to say that bit is
indeed 0/1 instead of returning the bit_idx back.

  reply	other threads:[~2021-09-23 23:46 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21 21:02 [PATCH v3 bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-21 23:44   ` Andrii Nakryiko
2021-09-22 19:06     ` Joanne Koong
2021-09-22 19:38       ` Martin KaFai Lau
2021-09-22 20:52         ` Andrii Nakryiko
2021-09-22 22:08           ` Martin KaFai Lau
2021-09-22 23:07             ` Andrii Nakryiko
2021-09-23  1:28               ` Martin KaFai Lau
2021-09-23 18:42                 ` Andrii Nakryiko
2021-09-23 19:42                   ` Martin KaFai Lau
2021-09-23 20:30                     ` Alexei Starovoitov
2021-09-23 21:12                       ` Andrii Nakryiko
2021-09-23 22:28                         ` Joanne Koong
2021-09-23 23:46                           ` Martin KaFai Lau [this message]
2021-09-24  2:23                           ` Andrii Nakryiko
2021-09-24 16:32                             ` Joanne Koong
2021-09-24 23:12                               ` Andrii Nakryiko
2021-09-27 16:41                                 ` Alexei Starovoitov
2021-09-27 21:14                                   ` Andrii Nakryiko
2021-09-27 23:51                                     ` Alexei Starovoitov
2021-09-28  0:36                                       ` Andrii Nakryiko
2021-09-28 16:21                                         ` Alexei Starovoitov
     [not found]                                           ` <aa967ed2-a958-f995-3a09-bbd6b6e775d4@fb.com>
2021-09-28 23:54                                             ` Andrii Nakryiko
2021-09-29  1:54                                               ` Joanne Koong
2021-09-29  0:14                                           ` Andrii Nakryiko
2021-09-29  3:17                                             ` Alexei Starovoitov
2021-09-29  3:38                                               ` Joanne Koong
2021-09-28  1:09                                 ` Martin KaFai Lau
2021-09-22 20:44       ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-21 22:24   ` Joanne Koong
2021-09-22 23:14     ` Andrii Nakryiko
2021-09-21 23:57   ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210923234602.3pxuw3kzdayc675j@kafai-mbp \
    --to=kafai@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannekoong@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).