bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joanne Koong <joannekoong@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Wed, 22 Sep 2021 12:06:02 -0700	[thread overview]
Message-ID: <517a137d-66aa-8aa8-a064-fad8ae0c7fa8@fb.com> (raw)
In-Reply-To: <CAEf4BzZfeGGv+gBbfBJq5W8eQESgdqeNaByk-agOgMaB8BjQhA@mail.gmail.com>


On 9/21/21 4:44 PM, Andrii Nakryiko wrote:
> On Tue, Sep 21, 2021 at 2:30 PM Joanne Koong <joannekoong@fb.com> wrote:
>> Bloom filters are a space-efficient probabilistic data structure
>> used to quickly test whether an element exists in a set.
>> In a bloom filter, false positives are possible whereas false
>> negatives should never be.
>>
>> This patch adds a bloom filter map for bpf programs.
>> The bloom filter map supports peek (determining whether an element
>> is present in the map) and push (adding an element to the map)
>> operations.These operations are exposed to userspace applications
>> through the already existing syscalls in the following way:
>>
>> BPF_MAP_LOOKUP_ELEM -> peek
>> BPF_MAP_UPDATE_ELEM -> push
>>
>> The bloom filter map does not have keys, only values. In light of
>> this, the bloom filter map's API matches that of queue stack maps:
>> user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
>> which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
>> and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
>> APIs to query or add an element to the bloom filter map. When the
>> bloom filter map is created, it must be created with a key_size of 0.
>>
>> For updates, the user will pass in the element to add to the map
>> as the value, with a NULL key. For lookups, the user will pass in the
>> element to query in the map as the value. In the verifier layer, this
>> requires us to modify the argument type of a bloom filter's
>> BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE; as well, in
>> the syscall layer, we need to copy over the user value so that in
>> bpf_map_peek_elem, we know which specific value to query.
>>
>> A few things to please take note of:
>>   * If there are any concurrent lookups + updates, the user is
>> responsible for synchronizing this to ensure no false negative lookups
>> occur.
>>   * The number of hashes to use for the bloom filter is configurable from
>> userspace. If no number is specified, the default used will be 5 hash
>> functions. The benchmarks later in this patchset can help compare the
>> performance of using different number of hashes on different entry
>> sizes. In general, using more hashes decreases the speed of a lookup,
>> but increases the false positive rate of an element being detected in the
>> bloom filter.
>>   * Deleting an element in the bloom filter map is not supported.
>>   * The bloom filter map may be used as an inner map.
>>   * The "max_entries" size that is specified at map creation time is used to
>> approximate a reasonable bitmap size for the bloom filter, and is not
>> otherwise strictly enforced. If the user wishes to insert more entries into
>> the bloom filter than "max_entries", they may do so but they should be
>> aware that this may lead to a higher false positive rate.
>>
>> Signed-off-by: Joanne Koong <joannekoong@fb.com>
>> ---
>>   include/linux/bpf_types.h      |   1 +
>>   include/uapi/linux/bpf.h       |   1 +
>>   kernel/bpf/Makefile            |   2 +-
>>   kernel/bpf/bloom_filter.c      | 185 +++++++++++++++++++++++++++++++++
>>   kernel/bpf/syscall.c           |  14 ++-
>>   kernel/bpf/verifier.c          |  19 +++-
>>   tools/include/uapi/linux/bpf.h |   1 +
>>   7 files changed, 217 insertions(+), 6 deletions(-)
>>   create mode 100644 kernel/bpf/bloom_filter.c
>>
> See some stylistic nitpicking below (and not a nitpicking about BTF).
>
> But I just wanted to say that I'm a bit amazed by how much special
> casing this BLOOM_FILTER map requires in syscall.c and verifier.c. I
> still believe that starting with a BPF helper for hashing would be a
> better approach, but oh well.
>
> [...]
I liked your comment on v1 regarding using a BPF helper and I agree with 
the benefits
you outlined. I'm curious to see what the performance differences 
between that approach
and this one end up being, if any. I plan to test out the BPF helper 
approach in a few weeks,
and if the performance is comparable or better, I am definitely open to 
reverting this code
and just going with the BPF helper approach :)
>> +
>> +static inline u32 hash(struct bpf_bloom_filter *bloom_filter, void *value,
>> +                      u64 value_size, u32 index)
>> +{
>> +       if (bloom_filter->aligned_u32_count)
>> +               return jhash2(value, bloom_filter->aligned_u32_count,
>> +                             bloom_filter->hash_seed + index) &
>> +                       bloom_filter->bit_array_mask;
>> +
>> +       return jhash(value, value_size, bloom_filter->hash_seed + index) &
>> +               bloom_filter->bit_array_mask;
> stylistic nit, but this feels way to dense text-wise, this seems
> easier to follow
>
> u32 h;
>
> if (bloom_filter->aligned_u32_count)
>      h = jhash2(...);
> else
>      h = jhash(...);
> return h & bloom_filter->bit_array_mask;
>
> WDYT?
I think this sounds great! I will make these changes for v4.
>> +}
>> +
>> +static int bloom_filter_map_peek_elem(struct bpf_map *map, void *value)
>> +{
>> +       struct bpf_bloom_filter *bloom_filter =
>> +               container_of(map, struct bpf_bloom_filter, map);
>> +       u32 i;
>> +
>> +       for (i = 0; i < bloom_filter->nr_hash_funcs; i++) {
>> +               if (!test_bit(hash(bloom_filter, value, map->value_size, i),
>> +                             bloom_filter->bit_array))
>> +                       return -ENOENT;
> same here, I think the hash calculation deserves a separate statement
> and a local variable
>
>> +       }
>> +
>> +       return 0;
>> +}
>> +
> [...]
>
>> +static void bloom_filter_map_free(struct bpf_map *map)
>> +{
>> +       struct bpf_bloom_filter *bloom_filter =
>> +               container_of(map, struct bpf_bloom_filter, map);
>> +
>> +       bpf_map_area_free(bloom_filter);
>> +}
>> +
>> +static int bloom_filter_map_push_elem(struct bpf_map *map, void *value,
>> +                                     u64 flags)
>> +{
>> +       struct bpf_bloom_filter *bloom_filter =
>> +               container_of(map, struct bpf_bloom_filter, map);
>> +       u32 i;
>> +
>> +       if (flags != BPF_ANY)
>> +               return -EINVAL;
>> +
>> +       for (i = 0; i < bloom_filter->nr_hash_funcs; i++)
>> +               set_bit(hash(bloom_filter, value, map->value_size, i),
>> +                       bloom_filter->bit_array);
> same as above about hash() call on separate line
>
>> +
>> +       return 0;
>> +}
>> +
>> +static void *bloom_filter_map_lookup_elem(struct bpf_map *map, void *key)
>> +{
>> +       /* The eBPF program should use map_peek_elem instead */
>> +       return ERR_PTR(-EINVAL);
>> +}
>> +
>> +static int bloom_filter_map_update_elem(struct bpf_map *map, void *key,
>> +                                       void *value, u64 flags)
>> +{
>> +       /* The eBPF program should use map_push_elem instead */
>> +       return -EINVAL;
>> +}
>> +
>> +static int bloom_filter_map_delete_elem(struct bpf_map *map, void *key)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>> +static int bloom_filter_map_get_next_key(struct bpf_map *map, void *key,
>> +                                        void *next_key)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>> +static int bloom_filter_map_btf_id;
>> +const struct bpf_map_ops bloom_filter_map_ops = {
>> +       .map_meta_equal = bpf_map_meta_equal,
>> +       .map_alloc = bloom_filter_map_alloc,
>> +       .map_free = bloom_filter_map_free,
>> +       .map_push_elem = bloom_filter_map_push_elem,
>> +       .map_peek_elem = bloom_filter_map_peek_elem,
>> +       .map_lookup_elem = bloom_filter_map_lookup_elem,
>> +       .map_update_elem = bloom_filter_map_update_elem,
>> +       .map_delete_elem = bloom_filter_map_delete_elem,
>> +       .map_get_next_key = bloom_filter_map_get_next_key,
>> +       .map_check_btf = map_check_no_btf,
> can you please implement basically a no-op callback here to allow
> specifying btf_value_id, there is no good reason to restrict this new
> map to not allow BTF type being specified for its value
Sounds great, will add this in v4.
>> +       .map_btf_name = "bpf_bloom_filter",
>> +       .map_btf_id = &bloom_filter_map_btf_id,
>> +};
> [...]

  reply	other threads:[~2021-09-22 19:06 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21 21:02 [PATCH v3 bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-21 23:44   ` Andrii Nakryiko
2021-09-22 19:06     ` Joanne Koong [this message]
2021-09-22 19:38       ` Martin KaFai Lau
2021-09-22 20:52         ` Andrii Nakryiko
2021-09-22 22:08           ` Martin KaFai Lau
2021-09-22 23:07             ` Andrii Nakryiko
2021-09-23  1:28               ` Martin KaFai Lau
2021-09-23 18:42                 ` Andrii Nakryiko
2021-09-23 19:42                   ` Martin KaFai Lau
2021-09-23 20:30                     ` Alexei Starovoitov
2021-09-23 21:12                       ` Andrii Nakryiko
2021-09-23 22:28                         ` Joanne Koong
2021-09-23 23:46                           ` Martin KaFai Lau
2021-09-24  2:23                           ` Andrii Nakryiko
2021-09-24 16:32                             ` Joanne Koong
2021-09-24 23:12                               ` Andrii Nakryiko
2021-09-27 16:41                                 ` Alexei Starovoitov
2021-09-27 21:14                                   ` Andrii Nakryiko
2021-09-27 23:51                                     ` Alexei Starovoitov
2021-09-28  0:36                                       ` Andrii Nakryiko
2021-09-28 16:21                                         ` Alexei Starovoitov
     [not found]                                           ` <aa967ed2-a958-f995-3a09-bbd6b6e775d4@fb.com>
2021-09-28 23:54                                             ` Andrii Nakryiko
2021-09-29  1:54                                               ` Joanne Koong
2021-09-29  0:14                                           ` Andrii Nakryiko
2021-09-29  3:17                                             ` Alexei Starovoitov
2021-09-29  3:38                                               ` Joanne Koong
2021-09-28  1:09                                 ` Martin KaFai Lau
2021-09-22 20:44       ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-21 22:24   ` Joanne Koong
2021-09-22 23:14     ` Andrii Nakryiko
2021-09-21 23:57   ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=517a137d-66aa-8aa8-a064-fad8ae0c7fa8@fb.com \
    --to=joannekoong@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).