From: John Fastabend <john.fastabend@gmail.com>
To: Joanne Koong <joannekoong@fb.com>, bpf@vger.kernel.org
Cc: Joanne Koong <joannekoong@fb.com>
Subject: RE: [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Wed, 01 Sep 2021 20:16:40 -0700 [thread overview]
Message-ID: <61304218227e8_1aed208dd@john-XPS-13-9370.notmuch> (raw)
In-Reply-To: <20210831225005.2762202-2-joannekoong@fb.com>
Joanne Koong wrote:
> Bloom filters are a space-efficient probabilistic data structure
> used to quickly test whether an element exists in a set.
> In a bloom filter, false positives are possible whereas false
> negatives are not.
>
> This patch adds a bloom filter map for bpf programs.
> The bloom filter map supports peek (determining whether an element
> is present in the map) and push (adding an element to the map)
> operations.These operations are exposed to userspace applications
> through the already existing syscalls in the following way:
>
> BPF_MAP_LOOKUP_ELEM -> peek
> BPF_MAP_UPDATE_ELEM -> push
>
> The bloom filter map does not have keys, only values. In light of
> this, the bloom filter map's API matches that of queue stack maps:
> user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
> which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
> and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
> APIs to query or add an element to the bloom filter map. When the
> bloom filter map is created, it must be created with a key_size of 0.
>
> For updates, the user will pass in the element to add to the map
> as the value, wih a NULL key. For lookups, the user will pass in the
> element to query in the map as the value. In the verifier layer, this
> requires us to modify the argument type of a bloom filter's
> BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE; as well, in
> the syscall layer, we need to copy over the user value so that in
> bpf_map_peek_elem, we know which specific value to query.
>
> The maximum number of entries in the bloom filter is not enforced; if
> the user wishes to insert more entries into the bloom filter than they
> specified as the max entries size of the bloom filter, that is permitted
> but the performance of their bloom filter will have a higher false
> positive rate.
hmm I'm wondering if this means the memory footprint can grow without
bounds? Typically maps have an upper bound on memory established at
alloc time.
In queue_stack_map_alloc() we have,
queue_size = sizeof(*qs) + size * attr->value_size);
bpf_map_area_alloc(queue_size, numa_node)
In hashmap (not preallocated) we have, alloc_htab_elem() that will
give us an upper bound.
Is there a practical value in allowing these to grow endlessly? And
should we be charging the value memory against something? In
bpf_map_kmalloc_node we set_active_memcg() for example.
I'll review code as well, but think above is worth some thought.
>
> The number of hashes to use for the bloom filter is configurable from
> userspace. The benchmarks later in this patchset can help compare the
> performances of different number of hashes on different entry
> sizes. In general, using more hashes decreases the speed of a lookup,
> but increases the false positive rate of an element being detected in the
> bloom filter.
>
> Signed-off-by: Joanne Koong <joannekoong@fb.com>
next prev parent reply other threads:[~2021-09-02 3:16 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-31 22:50 [PATCH bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-08-31 22:50 ` [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-01 2:55 ` Alexei Starovoitov
2021-09-02 19:11 ` Joanne Koong
2021-09-02 1:44 ` Andrii Nakryiko
2021-09-02 5:11 ` John Fastabend
2021-09-02 6:16 ` Martin KaFai Lau
2021-09-02 22:07 ` Joanne Koong
2021-09-03 0:56 ` Martin KaFai Lau
2021-09-03 7:13 ` Joanne Koong
2021-09-03 17:19 ` Andrii Nakryiko
2021-09-03 17:22 ` John Fastabend
2021-09-08 19:10 ` Joanne Koong
2021-09-02 3:16 ` John Fastabend [this message]
2021-09-02 3:28 ` Andrii Nakryiko
2021-09-02 4:40 ` John Fastabend
2021-08-31 22:50 ` [PATCH bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-02 3:30 ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-01 2:55 ` Alexei Starovoitov
2021-08-31 22:50 ` [PATCH bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-02 3:35 ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=61304218227e8_1aed208dd@john-XPS-13-9370.notmuch \
--to=john.fastabend@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=joannekoong@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).