All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joanne Koong <joannekoong@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Wed, 22 Sep 2021 12:06:02 -0700	[thread overview]
Message-ID: <517a137d-66aa-8aa8-a064-fad8ae0c7fa8@fb.com> (raw)
In-Reply-To: <CAEf4BzZfeGGv+gBbfBJq5W8eQESgdqeNaByk-agOgMaB8BjQhA@mail.gmail.com>


On 9/21/21 4:44 PM, Andrii Nakryiko wrote:
> On Tue, Sep 21, 2021 at 2:30 PM Joanne Koong <joannekoong@fb.com> wrote:
>> Bloom filters are a space-efficient probabilistic data structure
>> used to quickly test whether an element exists in a set.
>> In a bloom filter, false positives are possible whereas false
>> negatives should never be.
>>
>> This patch adds a bloom filter map for bpf programs.
>> The bloom filter map supports peek (determining whether an element
>> is present in the map) and push (adding an element to the map)
>> operations.These operations are exposed to userspace applications
>> through the already existing syscalls in the following way:
>>
>> BPF_MAP_LOOKUP_ELEM -> peek
>> BPF_MAP_UPDATE_ELEM -> push
>>
>> The bloom filter map does not have keys, only values. In light of
>> this, the bloom filter map's API matches that of queue stack maps:
>> user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
>> which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
>> and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
>> APIs to query or add an element to the bloom filter map. When the
>> bloom filter map is created, it must be created with a key_size of 0.
>>
>> For updates, the user will pass in the element to add to the map
>> as the value, with a NULL key. For lookups, the user will pass in the
>> element to query in the map as the value. In the verifier layer, this
>> requires us to modify the argument type of a bloom filter's
>> BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE; as well, in
>> the syscall layer, we need to copy over the user value so that in
>> bpf_map_peek_elem, we know which specific value to query.
>>
>> A few things to please take note of:
>>   * If there are any concurrent lookups + updates, the user is
>> responsible for synchronizing this to ensure no false negative lookups
>> occur.
>>   * The number of hashes to use for the bloom filter is configurable from
>> userspace. If no number is specified, the default used will be 5 hash
>> functions. The benchmarks later in this patchset can help compare the
>> performance of using different number of hashes on different entry
>> sizes. In general, using more hashes decreases the speed of a lookup,
>> but increases the false positive rate of an element being detected in the
>> bloom filter.
>>   * Deleting an element in the bloom filter map is not supported.
>>   * The bloom filter map may be used as an inner map.
>>   * The "max_entries" size that is specified at map creation time is used to
>> approximate a reasonable bitmap size for the bloom filter, and is not
>> otherwise strictly enforced. If the user wishes to insert more entries into
>> the bloom filter than "max_entries", they may do so but they should be
>> aware that this may lead to a higher false positive rate.
>>
>> Signed-off-by: Joanne Koong <joannekoong@fb.com>
>> ---
>>   include/linux/bpf_types.h      |   1 +
>>   include/uapi/linux/bpf.h       |   1 +
>>   kernel/bpf/Makefile            |   2 +-
>>   kernel/bpf/bloom_filter.c      | 185 +++++++++++++++++++++++++++++++++
>>   kernel/bpf/syscall.c           |  14 ++-
>>   kernel/bpf/verifier.c          |  19 +++-
>>   tools/include/uapi/linux/bpf.h |   1 +
>>   7 files changed, 217 insertions(+), 6 deletions(-)
>>   create mode 100644 kernel/bpf/bloom_filter.c
>>
> See some stylistic nitpicking below (and not a nitpicking about BTF).
>
> But I just wanted to say that I'm a bit amazed by how much special
> casing this BLOOM_FILTER map requires in syscall.c and verifier.c. I
> still believe that starting with a BPF helper for hashing would be a
> better approach, but oh well.
>
> [...]
I liked your comment on v1 regarding using a BPF helper and I agree with 
the benefits
you outlined. I'm curious to see what the performance differences 
between that approach
and this one end up being, if any. I plan to test out the BPF helper 
approach in a few weeks,
and if the performance is comparable or better, I am definitely open to 
reverting this code
and just going with the BPF helper approach :)
>> +
>> +static inline u32 hash(struct bpf_bloom_filter *bloom_filter, void *value,
>> +                      u64 value_size, u32 index)
>> +{
>> +       if (bloom_filter->aligned_u32_count)
>> +               return jhash2(value, bloom_filter->aligned_u32_count,
>> +                             bloom_filter->hash_seed + index) &
>> +                       bloom_filter->bit_array_mask;
>> +
>> +       return jhash(value, value_size, bloom_filter->hash_seed + index) &
>> +               bloom_filter->bit_array_mask;
> stylistic nit, but this feels way to dense text-wise, this seems
> easier to follow
>
> u32 h;
>
> if (bloom_filter->aligned_u32_count)
>      h = jhash2(...);
> else
>      h = jhash(...);
> return h & bloom_filter->bit_array_mask;
>
> WDYT?
I think this sounds great! I will make these changes for v4.
>> +}
>> +
>> +static int bloom_filter_map_peek_elem(struct bpf_map *map, void *value)
>> +{
>> +       struct bpf_bloom_filter *bloom_filter =
>> +               container_of(map, struct bpf_bloom_filter, map);
>> +       u32 i;
>> +
>> +       for (i = 0; i < bloom_filter->nr_hash_funcs; i++) {
>> +               if (!test_bit(hash(bloom_filter, value, map->value_size, i),
>> +                             bloom_filter->bit_array))
>> +                       return -ENOENT;
> same here, I think the hash calculation deserves a separate statement
> and a local variable
>
>> +       }
>> +
>> +       return 0;
>> +}
>> +
> [...]
>
>> +static void bloom_filter_map_free(struct bpf_map *map)
>> +{
>> +       struct bpf_bloom_filter *bloom_filter =
>> +               container_of(map, struct bpf_bloom_filter, map);
>> +
>> +       bpf_map_area_free(bloom_filter);
>> +}
>> +
>> +static int bloom_filter_map_push_elem(struct bpf_map *map, void *value,
>> +                                     u64 flags)
>> +{
>> +       struct bpf_bloom_filter *bloom_filter =
>> +               container_of(map, struct bpf_bloom_filter, map);
>> +       u32 i;
>> +
>> +       if (flags != BPF_ANY)
>> +               return -EINVAL;
>> +
>> +       for (i = 0; i < bloom_filter->nr_hash_funcs; i++)
>> +               set_bit(hash(bloom_filter, value, map->value_size, i),
>> +                       bloom_filter->bit_array);
> same as above about hash() call on separate line
>
>> +
>> +       return 0;
>> +}
>> +
>> +static void *bloom_filter_map_lookup_elem(struct bpf_map *map, void *key)
>> +{
>> +       /* The eBPF program should use map_peek_elem instead */
>> +       return ERR_PTR(-EINVAL);
>> +}
>> +
>> +static int bloom_filter_map_update_elem(struct bpf_map *map, void *key,
>> +                                       void *value, u64 flags)
>> +{
>> +       /* The eBPF program should use map_push_elem instead */
>> +       return -EINVAL;
>> +}
>> +
>> +static int bloom_filter_map_delete_elem(struct bpf_map *map, void *key)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>> +static int bloom_filter_map_get_next_key(struct bpf_map *map, void *key,
>> +                                        void *next_key)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>> +static int bloom_filter_map_btf_id;
>> +const struct bpf_map_ops bloom_filter_map_ops = {
>> +       .map_meta_equal = bpf_map_meta_equal,
>> +       .map_alloc = bloom_filter_map_alloc,
>> +       .map_free = bloom_filter_map_free,
>> +       .map_push_elem = bloom_filter_map_push_elem,
>> +       .map_peek_elem = bloom_filter_map_peek_elem,
>> +       .map_lookup_elem = bloom_filter_map_lookup_elem,
>> +       .map_update_elem = bloom_filter_map_update_elem,
>> +       .map_delete_elem = bloom_filter_map_delete_elem,
>> +       .map_get_next_key = bloom_filter_map_get_next_key,
>> +       .map_check_btf = map_check_no_btf,
> can you please implement basically a no-op callback here to allow
> specifying btf_value_id, there is no good reason to restrict this new
> map to not allow BTF type being specified for its value
Sounds great, will add this in v4.
>> +       .map_btf_name = "bpf_bloom_filter",
>> +       .map_btf_id = &bloom_filter_map_btf_id,
>> +};
> [...]

  reply	other threads:[~2021-09-22 19:06 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21 21:02 [PATCH v3 bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-21 23:44   ` Andrii Nakryiko
2021-09-22 19:06     ` Joanne Koong [this message]
2021-09-22 19:38       ` Martin KaFai Lau
2021-09-22 20:52         ` Andrii Nakryiko
2021-09-22 22:08           ` Martin KaFai Lau
2021-09-22 23:07             ` Andrii Nakryiko
2021-09-23  1:28               ` Martin KaFai Lau
2021-09-23 18:42                 ` Andrii Nakryiko
2021-09-23 19:42                   ` Martin KaFai Lau
2021-09-23 20:30                     ` Alexei Starovoitov
2021-09-23 21:12                       ` Andrii Nakryiko
2021-09-23 22:28                         ` Joanne Koong
2021-09-23 23:46                           ` Martin KaFai Lau
2021-09-24  2:23                           ` Andrii Nakryiko
2021-09-24 16:32                             ` Joanne Koong
2021-09-24 23:12                               ` Andrii Nakryiko
2021-09-27 16:41                                 ` Alexei Starovoitov
2021-09-27 21:14                                   ` Andrii Nakryiko
2021-09-27 23:51                                     ` Alexei Starovoitov
2021-09-28  0:36                                       ` Andrii Nakryiko
2021-09-28 16:21                                         ` Alexei Starovoitov
     [not found]                                           ` <aa967ed2-a958-f995-3a09-bbd6b6e775d4@fb.com>
2021-09-28 23:54                                             ` Andrii Nakryiko
2021-09-29  1:54                                               ` Joanne Koong
2021-09-29  0:14                                           ` Andrii Nakryiko
2021-09-29  3:17                                             ` Alexei Starovoitov
2021-09-29  3:38                                               ` Joanne Koong
2021-09-28  1:09                                 ` Martin KaFai Lau
2021-09-22 20:44       ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-21 22:24   ` Joanne Koong
2021-09-22 23:14     ` Andrii Nakryiko
2021-09-21 23:57   ` Andrii Nakryiko
2021-09-21 21:02 ` [PATCH v3 bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-21 21:02 ` [PATCH v3 bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=517a137d-66aa-8aa8-a064-fad8ae0c7fa8@fb.com \
    --to=joannekoong@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.